Data Scientist/Data Analyst Resume
SUMMARY
- Data Scientist/Data Analyst with over 8 years of experience in Data Science and Analytics, including Artificial Intelligence, Deep Learning, Machine Learning (NLP), Data Mining, and Statistical Analysis
- Involved in the entire data science project life cycle, actively participating in all phases, including data extraction, data cleaning, statistical modeling, and data visualization with large structured and unstructured data sets; created ER diagrams and schemas
- Experienced with machine learning algorithms such as logistic regression, random forest, XGBoost, KNN, SVM, linear regression, lasso regression, and k-means, as well as NLP and neural networks
- Implemented bagging and boosting to enhance model performance
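The bagging and boosting techniques named above can be sketched with scikit-learn; this is an illustrative example on synthetic data, not code from any of the projects described here:

```python
# Illustrative sketch: bagging vs. boosting on a synthetic
# classification task (scikit-learn; all data is generated).
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Bagging: fit many models on bootstrap samples and average their votes.
bag = BaggingClassifier(n_estimators=50, random_state=0).fit(X_tr, y_tr)
# Boosting: fit models sequentially, each correcting its predecessors.
boost = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

bag_acc = accuracy_score(y_te, bag.predict(X_te))
boost_acc = accuracy_score(y_te, boost.predict(X_te))
```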
- Strong skills in statistical methodologies such as A/B testing, experiment design, hypothesis testing, and ANOVA
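A typical A/B hypothesis test of the kind mentioned above can be sketched with SciPy; the data below is simulated and all numbers are hypothetical:

```python
# Illustrative sketch: Welch's two-sample t-test for an A/B experiment
# on simulated metric data (all values are made up for demonstration).
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
control = rng.normal(loc=10.0, scale=2.0, size=500)  # baseline group metric
variant = rng.normal(loc=10.4, scale=2.0, size=500)  # treated group metric

# Welch's t-test (equal_var=False) does not assume equal variances.
t_stat, p_value = stats.ttest_ind(variant, control, equal_var=False)
significant = p_value < 0.05  # reject the null at the 5% level
```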
- Extensively worked with Python 3.5/2.7 (NumPy, Pandas, Matplotlib, NLTK, and scikit-learn)
- Experience implementing data analysis with various analytic tools, such as Anaconda 4.0, Jupyter Notebook 4.x, R 3.0 (ggplot2, caret, dplyr), and Excel
- Experience designing star and snowflake schemas for data warehouse and ODS architectures
- Solid ability to write and optimize diverse SQL queries; working knowledge of RDBMSs such as SQL Server 2008 and NoSQL databases such as MongoDB 3.2
- Developed API libraries and coded business logic using C# and XML; designed web pages using the .NET framework, C#, Python, Django, HTML, and AJAX
- Hands-on experience with data analytics services such as Athena, Glue Data Catalog, and QuickSight
- Strong experience and knowledge provisioning virtual clusters on AWS, including services such as EC2, S3, and EMR
- Experience coding SQL/PL/SQL using procedures, triggers, and packages
- Experience with visualization tools such as Tableau 9.x/10.x for creating dashboards
- Excellent understanding of Agile and Scrum development methodologies
- Used version control tools such as Git 2.x and build tools such as Apache Maven
PROFESSIONAL EXPERIENCE
Confidential
DATA SCIENTIST/DATA ANALYST
Responsibilities:
- Analyzed and prepared data, identifying patterns in datasets by applying historical models; collaborated with senior data scientists to understand the data
- Performed data manipulation, data preparation, normalization, and predictive modeling; improved efficiency and accuracy by evaluating models in Python and R
- Focused on customer segmentation through machine learning and statistical modeling, building predictive models and generating data products to support segmentation; used Python and R to refine and upgrade the models
- Responsible for creating on-demand tables on S3 files with Lambda functions and AWS Glue, using Python and PySpark
- Designed and implemented recommender systems that used collaborative filtering techniques to recommend courses to different customers, and deployed them to an AWS EMR cluster
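The collaborative filtering idea behind such a recommender can be sketched in a few lines of NumPy; the ratings matrix below is invented purely for illustration:

```python
# Illustrative sketch of item-based collaborative filtering: score a
# user's unrated items by cosine similarity between item columns.
# The ratings matrix is made up for demonstration.
import numpy as np

# rows = users, columns = courses; 0 means "not rated"
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

norms = np.linalg.norm(R, axis=0)
sim = (R.T @ R) / np.outer(norms, norms)  # item-item cosine similarity

user = 0
scores = R[user] @ sim            # similarity-weighted preference scores
scores[R[user] > 0] = -np.inf     # mask courses the user already rated
recommended = int(np.argmax(scores))  # index of the best unrated course
```

In a production setting this matrix factorization would typically run distributed (e.g. Spark ALS on EMR); this sketch only shows the core similarity logic.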
- Automated solutions to manual processes with big data tools (Spark, Python, AWS).
- Involved in migrating objects from Teradata to Snowflake
- Developed and deployed multiple projects to production in the CI/CD pipeline for real-time data distribution, storage, and analytics
- Built price elasticity models for various bundled product and service offerings
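A common way to estimate price elasticity is a log-log regression, where the slope is the elasticity; the sketch below uses synthetic price/demand numbers chosen only for illustration:

```python
# Illustrative sketch: price elasticity via log-log regression.
# log(Q) = a + b*log(P), so the fitted slope b is the elasticity.
# Prices and unit counts below are hypothetical.
import numpy as np

prices = np.array([10.0, 12.0, 15.0, 18.0, 20.0, 25.0])
units = np.array([1000, 860, 700, 590, 540, 430])  # demand at each price

slope, intercept = np.polyfit(np.log(prices), np.log(units), 1)
elasticity = slope  # negative: demand falls as price rises
```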
- Under the supervision of a senior data scientist, performed data transformations for rescaling and normalizing variables
- Developed a predictive causal model using the annual failure rate and standard cost basis for the new bundled service offering
- Automated and orchestrated CI/CD deployments to various environments using GitLab CI/CD pipelines
- Used AWS Glue for data transformation, validation, and cleansing
Confidential
DATA SCIENTIST/DATA ANALYST
Responsibilities:
- Collected business requirements using various approaches and worked with business users on ETL application enhancements, conducting Joint Requirements Development (JRD) sessions to meet the job requirements
- Performed exploratory data analysis, including calculation of descriptive statistics, outlier detection, assumption testing, and factor analysis, in Python and R
- Maintained build profiles in Team Foundation Server and Jenkins for the CI/CD pipeline
- Built models based on domain knowledge and customer business objectives
- Used the AWS Glue catalog with crawlers to access data in S3 and perform SQL query operations
- Extracted data from the database using Excel/Access, SQL procedures and created Python and R datasets for statistical analysis, validation and documentation
- Extensive understanding of BI and analytics, focusing on the consumer and customer space
- Innovated and leveraged machine learning, data mining, and statistical techniques to create new, scalable solutions for business problems
- Worked on ETL Migration services by developing and deploying AWS Lambda functions for generating a serverless data pipeline which can be written to Glue Catalog and can be queried from Athena.
- Performed data profiling to assess data quality using SQL across complex internal databases
- Improved sales and logistics data quality by cleaning data with NumPy, SciPy, and Pandas in Python
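The kind of cleaning described above (deduplication, imputation, filtering invalid values) can be sketched with Pandas; the frame and column names below are hypothetical:

```python
# Illustrative sketch of sales-data cleaning with Pandas/NumPy.
# The DataFrame and its columns are invented for demonstration.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "order_id": [1, 2, 2, 3, 4],
    "units":    [10, np.nan, np.nan, -5, 8],
    "region":   ["east", "west", "west", None, "east"],
})

df = df.drop_duplicates(subset="order_id")              # drop duplicate orders
df["units"] = df["units"].fillna(df["units"].median())  # impute missing counts
df = df[df["units"] >= 0]                               # remove impossible values
df["region"] = df["region"].fillna("unknown")           # label missing regions
```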
- Designed data profiles for processing, including running SQL and PL/SQL queries; used Python and R for data acquisition and data integrity checks, including dataset comparisons and dataset schema checks
Confidential
Data Analyst
Responsibilities:
- Involved in the complete Software Development Life Cycle (SDLC) by analyzing business requirements and understanding the functional workflow of information from source systems to destination systems
- Completed a highly immersive data science program involving data manipulation and visualization, web scraping, machine learning, Python programming, SQL, Unix commands, NoSQL, and Hadoop
- Used Pandas, NumPy, Seaborn, SciPy, Matplotlib, scikit-learn, and NLTK in Python to develop various machine learning algorithms
- Worked on different data formats such as JSON, XML and performed machine learning algorithms in Python
- Analyzed sentiment data and detected trends in customer usage and other services
- Analyzed and prepared data, identifying patterns in datasets by applying historical models
- Collaborated with Senior Data Scientists for understanding of data
- Used Python and R scripting to implement NLP and machine learning algorithms for prediction and forecasting with better results
- Used Python and R scripting to visualize data and implement machine learning algorithms
- Experienced in developing R packages with a Shiny interface
- Used predictive analysis to create models of customer behavior that correlate positively with historical data, and used these models to forecast future results
- Resolved a Tableau Athena/Composite DB data extract refresh problem: updated all data sources within a day, re-published the workbook on the server, and created dashboards on the new data
- Performed data manipulation, data preparation, normalization, and predictive modeling
- Improved efficiency and accuracy by evaluating models in Python and R
- Used Python and R scripts to improve models; applied various machine learning algorithms and statistical models, such as decision trees, random forests, regression models, neural networks, SVM, and clustering, to identify volume using the scikit-learn package
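Comparing several of the model families listed above with cross-validation is a standard scikit-learn workflow; this is a generic sketch on synthetic data, not the project's actual pipeline:

```python
# Illustrative sketch: comparing model families with 5-fold
# cross-validation on a synthetic dataset (scikit-learn).
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, n_features=15, random_state=1)

models = {
    "decision_tree": DecisionTreeClassifier(random_state=1),
    "random_forest": RandomForestClassifier(random_state=1),
    "svm": SVC(),
}
# Mean cross-validated accuracy per model family.
scores = {name: cross_val_score(m, X, y, cv=5).mean()
          for name, m in models.items()}
```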
Environment: R/R studio, Python, Tableau, Hadoop, Hive, MS SQL Server, MS Access, MS Excel, Outlook, Power BI.
Confidential
Data Analyst
Responsibilities:
- Collaborated with Product Management and business analysts to collect detailed functional specifications for health systems
- Understood the requirements for the Pharmacy, Claims, and Patient History modules; documented requirements and worked with the product manager to sign off on them
- Performed data analysis of complex business data received from the business
- Worked with database leads to design database objects and finalize referential integrity, primary keys, unique constraints, and column defaults; also worked with them to finalize the table structures
- Involved in logical modeling and physical database design with Data modelers
- Responsible for creating database objects such as tables, stored procedures, triggers, functions, views, and materialized views using T-SQL to structure stored data and maintain the database efficiently
- Developed unit test cases, worked on data checking and testing activities.
- Responsible for loading and maintaining tables used by different development teams in their day-to-day jobs, updating them daily according to changes on the business end
- Involved in performance tuning of sources and targets, and in understanding locking and deadlocks in transactions, for better SQL Server performance
- Worked with testers to solve the defects in different modules
- Deployed code to the production server after user acceptance testing
- Analyzed functional data elements for data profiling and mapping from the source to the target data environment
- Worked on creation of source to target (S2T) mapping documents
- Performed assigned data analysis and data validations for the general ledger module; effectively utilized SSMS to run SQL/T-SQL statements on the database
- Worked with users to define business requirements and analytical needs; identified and recommended potential data sources and compiled/mined data from a variety of sources