Data Scientist Resume
Bellevue, WA
SUMMARY
- Over 5 years of IT experience in databases, data science, machine learning algorithms, and visualization.
- Extensive experience working in various domains like Telecom, Banking, and Automobiles.
- Experience in exploratory data analysis (EDA) using R.
- Experience in writing code in R and Python to manipulate data for data loads, extracts, statistical analysis, and modeling.
- Experience in SAS.
- Hands-on experience working with Amazon Redshift.
- Identified areas of improvement in existing business by analyzing vast amounts of data with machine learning techniques to unearth insights.
- Experience in using Decision Trees, k-Nearest Neighbors, Clustering (hierarchical and k-means), genetic algorithms, and Dijkstra's algorithm.
- Designed and implemented statistical and predictive models utilizing diverse sources of data to predict demand, risk, and price elasticity.
- Conducted in-depth analysis and predictive modeling to uncover hidden opportunities; communicated insights to the product, sales, and marketing teams.
- Experience in Hadoop, MapReduce, MongoDB, and HDFS.
- Experience in creating different types of data visualization using R, Power BI, and Tableau.
- Experience in installing, upgrading, and configuring Microsoft SQL Server.
- Participated in all stages of Agile Scrum project management.
- Skilled at assessing client needs, working in groups, suggesting ideas that enhance efficiency and performance, implementing technology solutions, and training end users.
TECHNICAL SKILLS
Languages: R (4yrs), Python (4yrs), SAS (4yrs), T-SQL (6yrs), SQL (6yrs), HTML (2yrs), C (6yrs), C++ (6yrs)
Tools: R Studio, Microsoft Azure, Enterprise Manager, MS SQL, SSRS, SSIS, SSAS, Business Intelligence Development Studio (BI), Visual Studio 2013, Oracle, MongoDB, Amazon Redshift
Big Data Ecosystems: Hadoop (4yrs), HDFS, MapReduce
Reporting Tools: MS Office 2003/2007, SQL Server Reporting Services, Crystal Reports, Power BI, Tableau.
Statistical Techniques: Machine learning (3yrs), Decision Trees, k-Nearest Neighbors, Clustering (hierarchical and k-means), Genetic algorithm, Dijkstra algorithm.
PROFESSIONAL EXPERIENCE
Confidential, Bellevue, WA
Data Scientist
Responsibilities:
- This project focused on customer segmentation through machine learning and statistical modeling, including building predictive models and generating data products to support the segmentation.
- Developed a pricing model for various bundled product and service offerings to optimize and predict the gross margin.
- Built a price elasticity model for various bundled product and service offerings.
- Developed a predictive causal model using annual failure rate and standard cost basis for the new bundled service offering.
- Designed and developed analytics, machine learning models, and visualizations that drive performance and provide insights, from prototyping to production deployment, including product recommendation and allocation planning.
- Partnered with the sales and marketing teams and collaborated with a cross-functional team to frame and answer important data questions; prototyped and experimented with ML/DL algorithms and integrated them into production systems for different business needs.
- Worked on multiple datasets containing 2 billion values of structured and unstructured data about web application usage and online customer surveys.
- Gained hands-on experience with the Amazon Redshift platform.
- Designed, built, and deployed a set of Python modeling APIs for customer analytics that integrate multiple machine learning techniques for user behavior prediction and support multiple marketing segmentation programs.
- Segmented the customers based on demographics using K-means clustering.
- Explored different regression and ensemble models in machine learning to perform forecasting.
- Used classification techniques, including Random Forest and Logistic Regression, to quantify the likelihood of each user making a referral.
- Designed and implemented end-to-end systems for data analytics and automation, integrating custom visualization tools using R, Tableau, and Power BI.
Environment: MS SQL Server, R/R Studio, SAS, Python, Redshift, MS Excel, Power BI, Tableau, T-SQL, ETL, MS Access, XML, MS Office 2007, Outlook.
Confidential, Dallas, TX
Data Scientist
Responsibilities:
- Developed an algorithm to analyze sentiment data and detect trends in customer usage and other services.
- Worked on multiple datasets containing 1 billion values of structured and unstructured data.
- Used predictive analysis to create models of customer behavior that correlate positively with historical data, and used these models to forecast future results.
- Explored the user APIs, documented the APIs relevant to the project, and created a workflow plan.
- Evaluated feature importance related to customer segmentation, preference, and prediction via Random Forest and Logistic Regression.
- Analyzed historical data, supporting documentation, screen prints, and email conversations.
- Implemented K-means and other clustering algorithms to group users by area according to their usage and service requirements.
- Built dynamic visualizations of user coverage and network usage clusters by area using R Studio.
- Predicted user preference based on segmentation using Generalized Additive Models, combined with feature clustering, to understand non-linear patterns between user segmentation and related monthly platform usage features (time series data).
- Conducted linear regression to predict the transaction volume, and distinguished frequent claimers based on MapReduce using Hadoop and R Studio.
Environment: R/R Studio, Python, Tableau, Hadoop, MS SQL Server 2005/2008, MS Access, MS Excel, Outlook, Power BI.
Confidential, Oklahoma
Data Analyst
Responsibilities:
- Supported the Debt Collection Application, specifically application-related payment imports and skip tracing of consumer address information.
- Designed and implemented end-to-end systems for data analytics and automation, integrating custom visualization tools using R, Hadoop, MongoDB, Tableau, and Power BI.
- Created summary reports, sub-reports, drill-down reports, and matrix reports.
- Developed Stored Procedures for parameterized, drill-down, and drill-through reports in SSRS.
- Formatted reports using global variables, expressions, and functions.
- Created different graphical reports using DAX queries for better visualization of data in Power BI.
- Created the DTS package for the vendor ETL process, in which records were extracted from flat-file and Excel sources and loaded daily onto the server.
- Assessed detailed specifications against design requirements.
- Served as a production support developer for applications on an as-needed basis.
- Gained knowledge of financial processes, such as posting payments to accounts after splitting them according to commission rates and bucket levels.
Environment: SQL Server 2000/2005 Enterprise Edition, SQL Enterprise Manager, R/R Studio, MS PowerPoint, MS Access 2000 & Windows 2003/2000 platform, DTS, SSIS, SSRS, Power BI.
Confidential
Data Analyst
Responsibilities:
- Analyzed and prepared data; identified patterns in the dataset by applying historical models.
- Collaborated with senior data scientists to understand the data.
- Performed data manipulation, data preparation, normalization, and predictive modeling.
- Improved efficiency and accuracy by evaluating models in R.
- Presented the existing model to stakeholders and provided insights into the model using different visualization methods in Power BI.
- Used R and Python to program model improvements.
- Upgraded the full set of models to improve the product.
- Performed data cleaning, applying backward and forward filling methods to handle missing values in the dataset.
- Under the supervision of a Sr. Data Scientist, performed data transformation to rescale and normalize variables.
- Developed a predictive model and validated a neural network classification model to predict the feature label.
- Applied a boosting method to the predictive model to improve its efficiency.
- Presented dashboards to senior management for further insights using Power BI.
Environment: R/R Studio, Python, SQL Enterprise Manager, GitHub, Microsoft Power BI, Outlook.