Data Scientist/Data Analyst Resume
SUMMARY
- Data Scientist/Data Analyst with over 8 years of experience in Data Science and Analytics, including Artificial Intelligence, Deep Learning, Machine Learning (NLP), Data Mining, and Statistical Analysis
- Involved in the entire data science project life cycle, actively participating in all phases, including data extraction, data cleaning, statistical modeling, and data visualization with large structured and unstructured data sets; created ER diagrams and schemas
- Experienced with machine learning algorithms such as logistic regression, random forest, XGBoost, KNN, SVM, linear regression, lasso regression, and k-means, as well as NLP and neural networks
- Implemented bagging and boosting to enhance model performance
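The bagging and boosting techniques named above can be sketched with scikit-learn; this is an illustrative example on synthetic data, not code from any of the projects described here:

```python
# Illustrative sketch: bagging vs. boosting on a synthetic
# classification task (scikit-learn; all data is generated).
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Bagging: fit many models on bootstrap samples and average their votes.
bag = BaggingClassifier(n_estimators=50, random_state=0).fit(X_tr, y_tr)
# Boosting: fit models sequentially, each correcting its predecessors.
boost = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

bag_acc = accuracy_score(y_te, bag.predict(X_te))
boost_acc = accuracy_score(y_te, boost.predict(X_te))
```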
- Strong skills in statistical methodologies such as A/B testing, experiment design, hypothesis testing, and ANOVA
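A typical A/B hypothesis test of the kind mentioned above can be sketched with SciPy; the data below is simulated and all numbers are hypothetical:

```python
# Illustrative sketch: Welch's two-sample t-test for an A/B experiment
# on simulated metric data (all values are made up for demonstration).
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
control = rng.normal(loc=10.0, scale=2.0, size=500)  # baseline group metric
variant = rng.normal(loc=10.4, scale=2.0, size=500)  # treated group metric

# Welch's t-test (equal_var=False) does not assume equal variances.
t_stat, p_value = stats.ttest_ind(variant, control, equal_var=False)
significant = p_value < 0.05  # reject the null at the 5% level
```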
- Extensively worked with Python 3.5/2.7 (NumPy, Pandas, Matplotlib, NLTK, and scikit-learn)
- Experience implementing data analysis with various analytic tools, such as Anaconda 4.0, Jupyter Notebook 4.x, R 3.0 (ggplot2, caret, dplyr), and Excel
- Experience designing star and snowflake schemas for data warehouse and ODS architectures
- Solid ability to write and optimize diverse SQL queries; working knowledge of RDBMSs such as SQL Server 2008 and NoSQL databases such as MongoDB 3.2
- Developed API libraries and coded business logic using C# and XML; designed web pages using the .NET framework, C#, Python, Django, HTML, and AJAX
- Hands-on experience with data analytics services such as Athena, Glue Data Catalog, and QuickSight
- Strong experience and knowledge provisioning virtual clusters on AWS, including services such as EC2, S3, and EMR
- Experience coding SQL/PL/SQL using procedures, triggers, and packages
- Experience with visualization tools such as Tableau 9.x/10.x for creating dashboards
- Excellent understanding of Agile and Scrum development methodologies
- Used version control tools such as Git 2.x and build tools such as Apache Maven
PROFESSIONAL EXPERIENCE
Confidential
DATA SCIENTIST/DATA ANALYST
Responsibilities:
- Analyzed and prepared data, identifying patterns in datasets by applying historical models; collaborated with senior data scientists to understand the data
- Performed data manipulation, data preparation, normalization, and predictive modeling; improved efficiency and accuracy by evaluating models in Python and R
- Focused on customer segmentation through machine learning and statistical modeling, building predictive models and generating data products to support segmentation; used Python and R to refine and upgrade the models
- Responsible for creating on-demand tables on S3 files with Lambda functions and AWS Glue, using Python and PySpark
- Designed and implemented recommender systems that used collaborative filtering techniques to recommend courses to different customers, and deployed them to an AWS EMR cluster
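The collaborative filtering idea behind such a recommender can be sketched in a few lines of NumPy; the ratings matrix below is invented purely for illustration:

```python
# Illustrative sketch of item-based collaborative filtering: score a
# user's unrated items by cosine similarity between item columns.
# The ratings matrix is made up for demonstration.
import numpy as np

# rows = users, columns = courses; 0 means "not rated"
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

norms = np.linalg.norm(R, axis=0)
sim = (R.T @ R) / np.outer(norms, norms)  # item-item cosine similarity

user = 0
scores = R[user] @ sim            # similarity-weighted preference scores
scores[R[user] > 0] = -np.inf     # mask courses the user already rated
recommended = int(np.argmax(scores))  # index of the best unrated course
```

In a production setting this matrix factorization would typically run distributed (e.g. Spark ALS on EMR); this sketch only shows the core similarity logic.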
- Automated solutions to manual processes with big data tools (Spark, Python, AWS).
- Involved in migrating objects from Teradata to Snowflake
- Developed and deployed multiple projects to production in the CI/CD pipeline for real-time data distribution, storage, and analytics
- Built price elasticity models for various bundled product and service offerings
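A common way to estimate price elasticity is a log-log regression, where the slope is the elasticity; the sketch below uses synthetic price/demand numbers chosen only for illustration:

```python
# Illustrative sketch: price elasticity via log-log regression.
# log(Q) = a + b*log(P), so the fitted slope b is the elasticity.
# Prices and unit counts below are hypothetical.
import numpy as np

prices = np.array([10.0, 12.0, 15.0, 18.0, 20.0, 25.0])
units = np.array([1000, 860, 700, 590, 540, 430])  # demand at each price

slope, intercept = np.polyfit(np.log(prices), np.log(units), 1)
elasticity = slope  # negative: demand falls as price rises
```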
- Under the supervision of a senior data scientist, performed data transformations for rescaling and normalizing variables
- Developed a predictive causal model using the annual failure rate and standard cost basis for the new bundled service offering
- Automated and orchestrated CI/CD deployments to various environments using GitLab CI/CD pipelines
- Used AWS Glue for data transformation, validation, and cleansing
Confidential
DATA SCIENTIST/DATA ANALYST
Responsibilities:
- Collected business requirements using various approaches and worked with business users on ETL application enhancements, conducting Joint Requirements Development (JRD) sessions to meet the job requirements
- Performed exploratory data analysis, including calculation of descriptive statistics, outlier detection, assumption testing, and factor analysis, in Python and R
- Maintained build profiles in Team Foundation Server and Jenkins for the CI/CD pipeline
- Built models based on domain knowledge and customer business objectives
- Used the AWS Glue catalog with crawlers to access data in S3 and perform SQL query operations
- Extracted data from the database using Excel/Access, SQL procedures and created Python and R datasets for statistical analysis, validation and documentation
- Extensive understanding of BI and analytics, focusing on the consumer and customer space
- Innovated and leveraged machine learning, data mining, and statistical techniques to create new, scalable solutions for business problems
- Worked on ETL Migration services by developing and deploying AWS Lambda functions for generating a serverless data pipeline which can be written to Glue Catalog and can be queried from Athena.
- Performed data profiling to assess data quality using SQL across complex internal databases
- Improved sales and logistics data quality by cleaning data with NumPy, SciPy, and Pandas in Python
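The kind of cleaning described above (deduplication, imputation, filtering invalid values) can be sketched with Pandas; the frame and column names below are hypothetical:

```python
# Illustrative sketch of sales-data cleaning with Pandas/NumPy.
# The DataFrame and its columns are invented for demonstration.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "order_id": [1, 2, 2, 3, 4],
    "units":    [10, np.nan, np.nan, -5, 8],
    "region":   ["east", "west", "west", None, "east"],
})

df = df.drop_duplicates(subset="order_id")              # drop duplicate orders
df["units"] = df["units"].fillna(df["units"].median())  # impute missing counts
df = df[df["units"] >= 0]                               # remove impossible values
df["region"] = df["region"].fillna("unknown")           # label missing regions
```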
- Designed data profiles for processing, including running SQL and PL/SQL queries; used Python and R for data acquisition and data integrity checks, including dataset comparisons and dataset schema checks
Confidential
Data Analyst
Responsibilities:
- Involved in the complete Software Development Life Cycle (SDLC) by analyzing business requirements and understanding the functional workflow of information from source systems to destination systems
- Completed a highly immersive data science program involving data manipulation and visualization, web scraping, machine learning, Python programming, SQL, Unix commands, NoSQL, and Hadoop
- Used Pandas, NumPy, Seaborn, SciPy, Matplotlib, scikit-learn, and NLTK in Python to develop various machine learning algorithms
- Worked on different data formats such as JSON, XML and performed machine learning algorithms in Python
- Analyzed sentiment data and detected trends in customer usage and other services
- Analyzed and prepared data, identifying patterns in datasets by applying historical models
- Collaborated with Senior Data Scientists for understanding of data
- Used Python and R scripting to implement NLP and machine learning algorithms for prediction and forecasting with better results
- Used Python and R scripting to visualize data and implement machine learning algorithms
- Experienced in developing R packages with a Shiny interface
- Used predictive analysis to create models of customer behavior that correlate positively with historical data, and used these models to forecast future results
- Resolved a Tableau Athena/Composite DB data extract refresh problem: updated all data sources within a day, re-published the workbook on the server, and created dashboards on the new data
- Performed data manipulation, data preparation, normalization, and predictive modeling
- Improved efficiency and accuracy by evaluating models in Python and R
- Used Python and R scripts to improve models; applied various machine learning algorithms and statistical models, such as decision trees, random forests, regression models, neural networks, SVM, and clustering, to identify volume using the scikit-learn package
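Comparing several of the model families listed above with cross-validation is a standard scikit-learn workflow; this is a generic sketch on synthetic data, not the project's actual pipeline:

```python
# Illustrative sketch: comparing model families with 5-fold
# cross-validation on a synthetic dataset (scikit-learn).
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, n_features=15, random_state=1)

models = {
    "decision_tree": DecisionTreeClassifier(random_state=1),
    "random_forest": RandomForestClassifier(random_state=1),
    "svm": SVC(),
}
# Mean cross-validated accuracy per model family.
scores = {name: cross_val_score(m, X, y, cv=5).mean()
          for name, m in models.items()}
```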
Environment: R/R studio, Python, Tableau, Hadoop, Hive, MS SQL Server, MS Access, MS Excel, Outlook, Power BI.
Confidential
Data Analyst
Responsibilities:
- Collaborated with Product Management and business analysts to collect detailed functional specifications for health systems
- Understood the requirements for the Pharmacy, Claims, and Patient History modules; documented requirements and worked with the product manager to sign off on them
- Performed data analysis of complex business data received from the business
- Worked with database leads to design database objects and finalize referential integrity, primary keys, unique constraints, and column defaults; also worked with them to finalize the table structures
- Involved in logical modeling and physical database design with Data modelers
- Responsible for creating database objects such as tables, stored procedures, triggers, functions, views, and materialized views using T-SQL to structure stored data and maintain the database efficiently
- Developed unit test cases, worked on data checking and testing activities.
- Responsible for loading and maintaining tables used by different development teams in their day-to-day jobs, updating them daily according to changes on the business end
- Involved in performance tuning of sources and targets, and in understanding locking and deadlocks in transactions, for better SQL Server performance
- Worked with testers to solve the defects in different modules
- Deployed code to the production server after user acceptance testing
- Analyzed functional data elements for data profiling and mapping from the source to the target data environment
- Worked on creation of source to target (S2T) mapping documents
- Performed assigned data analysis and data validations for the general ledger module; effectively utilized SSMS to run SQL/T-SQL statements on the database
- Worked with users to define business requirements and analytical needs; identified and recommended potential data sources and compiled/mined data from a variety of sources