
Sr. Data Scientist / Machine Learning Resume


SUMMARY

  • 6+ years of experience delivering automated data science solutions, along with machine learning, data analysis, data modeling, and data warehouse management.
  • Good experience developing machine learning algorithms, both supervised and unsupervised, such as Random Forests, Support Vector Machines, CHAID, C5.0, Linear and Logistic Regression, K-Means Clustering, Anomaly Detection, Principal Component Analysis, ensemble algorithms, and Artificial Neural Networks, using canvas-based applications such as SPSS Modeler and Orange.
  • Good experience developing machine learning models in Python and R using libraries such as PyCaret, scikit-learn, AutoTS, and AutoML frameworks.
  • Focused on data science and machine learning, with experience building AI models to address real-world problems, web-based solutions, and computational journalism research.
  • Expertise in TensorFlow, a machine learning/deep learning package for Python.
  • Good knowledge of Microsoft Azure SQL, Azure Machine Learning, and HDInsight.
  • Good exposure to SAS analytics.
  • Good exposure to creating pivot tables and charts in Excel.
  • Experience in developing custom reports and different types of tabular, matrix, ad hoc, and distributed reports in multiple formats using SQL Server Reporting Services (SSRS).
  • Excellent database administration (DBA) skills, including user authorization, database creation, tables, indexes, and backup creation.
  • Experience in data modeling techniques such as star and snowflake schemas using the Erwin tool.
  • Experience working on SSIS, SSAS, and SSRS project packages.
  • Worked on logical and physical modeling, along with design and implementation.
  • Involved in job scheduling, monitoring, and production support in a 24/7 environment.
  • Knowledgeable in data warehouse concepts.
  • In-depth awareness of unique business unit needs and business environments, and of how to deploy digital technology to create meaningful improvements.
  • Built a machine learning algorithm to detect breast cancer from unstructured image data. Pre-processed the data and trained the model on 25k images, achieving 87% accuracy with a recall of 88%. Tools: Python, TensorFlow, Computer Vision, OpenCV, Keras.
  • Expertise in transforming business requirements into analytical models, designing algorithms, building models, and developing data mining and reporting solutions that scale across massive volumes of structured and unstructured data.
  • Experienced in designing customized interactive dashboards in Tableau using marks, actions, filters, parameters, and calculations.
  • Collaborative style with ability to develop effective relationships with stakeholders to produce business value.
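The breast-cancer bullet above reports accuracy and recall for an image model; as a hedged, toy-scale sketch of the same train/evaluate/recall workflow, the snippet below uses scikit-learn's tabular breast-cancer dataset and a logistic regression as stand-ins (the dataset and model here are illustrative, not the one described in the resume).

```python
# Toy stand-in for the train/evaluate workflow: the resume's model used
# 25k images with TensorFlow/Keras; this sketch uses scikit-learn's
# tabular breast-cancer dataset purely to show how the reported
# accuracy/recall metrics are computed on a held-out split.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

clf = LogisticRegression(max_iter=5000).fit(X_tr, y_tr)
pred = clf.predict(X_te)
print(f"accuracy={accuracy_score(y_te, pred):.2f} "
      f"recall={recall_score(y_te, pred):.2f}")
```

Recall is reported alongside accuracy because, for cancer screening, missed positives (false negatives) are the costlier error.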

TECHNICAL SKILLS

Programming Languages: Python, R, SQL

Machine Learning: Scikit-Learn, NumPy, Pandas, Vaex, Matplotlib, Seaborn; Supervised Learning: Linear Regression, Logistic Regression, K-Nearest Neighbour, Naïve Bayes, Decision Trees, Random Forest, Support Vector Machine (SVM); Unsupervised Learning: K-means Clustering, Principal Component Analysis (PCA), Feature Engineering and Feature Selection

Natural Language Processing: Spacy, NLTK, Data Mining & Data Pre-processing, Vectorization, Sentiment & Semantics Analysis, Text Classification, Topic Modeling, Word2Vec

Computer Vision: OpenCV, Image Preprocessing, Object Detection

Deep Learning: TensorFlow, Keras, CNN, RNN, GAN, AutoEncoder

Time Series Forecasting & Analysis: Statsmodels, ARIMA, SARIMA

Web Development: HTML/CSS, JS, Flask, REST API

Cloud Computing: AWS, Google Cloud Platform, Azure

Data Analysis/Visualization: Tableau, PowerBI

Tools: MS Office, LaTeX, Adobe Photoshop

PROFESSIONAL EXPERIENCE

Confidential

Sr. Data Scientist / Machine Learning

Responsibilities:

  • Built efficient, low-effort scripts for data cleaning and resampling, which improved model performance.
  • Generated trade ideas for a convertible bond portfolio based on quantitative analysis and fundamental research.
  • Strong background in statistics, mathematical optimization, forecasting, and simulation.
  • At the Statistics Consulting Center, provided statistical advice and analysis for industry clients in California as well as members of the campus community.
  • Assisted in statistical analysis of business protocols and reports.
  • Helped machine learning engineers get started with Docker containers for deploying machine learning models.
  • Implemented a cost-effective, cutting-edge Data Science/Big Data lab for easy prototyping with the most advanced tools, such as Jupyter Notebook, R, NoSQL, RStudio, and GPUs/CPUs.
  • Technical lead for an initiative to develop a machine learning image classification application to detect early skin cancer and other common skin diseases with 96% accuracy. Started with 20 common U.S. skin diseases, expanding to 50. Collected, labeled, and pre-processed data; built and deployed the image classification model; improved performance and updated the skin disease dataset. Led two developers building out the application. Tools: Python, Computer Vision, Machine Learning, Deep Learning, OpenCV, TensorFlow.
  • Created an application to detect eye diseases using computer vision, achieving an accuracy of 94%. Collected and pre-processed the data and built the image classification model. Tools: Python, Computer Vision, Machine Learning, Deep Learning, OpenCV, TensorFlow.
  • Developed and implemented an automated help desk using text classification to group customer requests with natural language processing (NLP). Defined client requirements; collected, labeled, and pre-processed data; and developed, tested, and deployed the solution. Tools: Python, Natural Language Processing, Machine Learning, Deep Learning, NLTK.
  • Developed an AI-powered chatbot for 24/7 access to HR benefits, such as W-2s, pay stubs, and PTO requests. Tools: Python, Deep Learning, Google BERT, Dialogflow, TensorFlow, Natural Language Processing.
  • Created analytical and forecasting reports to understand the company's financial trends and gain valuable financial insights. Tools: Python, Statsmodels, NumPy, Pandas, Matplotlib, Seaborn, Plotly, ARIMA, SARIMA.
  • Managed the team and led concept/design development for a smart HR assistant chatbot to streamline HR communications using Google Dialogflow.
  • Generated dashboards with multiple data sources (Excel, Oracle, SQL Server, Azure SQL Data warehouse, etc.) using tools like Tableau, Power BI to visualize the data analysis.
  • Experience in defining project scope across data science and data analytics projects in collaboration with management and clients.
  • Established an outsourced development team in India and eliminated unnecessary technology to reduce costs by 75%.
  • Packaged code and required resources into a single deployable package using shell scripts, Docker containers, CloudFormation, and Python scripts.
  • Participated in a multi-team effort to build an energy-specific language model, similar to Google's BERT, using attention models.

Environment: Windows, Python 3, Azure Data Factory (ADF), AWS SageMaker Studio, Snowflake, Tableau, R Shiny, Keras, OpenCV, PyTorch, IoT, Azure DevOps, Pandas, Computer Vision, Selenium, machine learning algorithms.
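The automated help desk described above groups customer requests by text classification; a minimal scikit-learn sketch of that idea might look like the following (the ticket texts, category names, and model choice are invented for illustration; the resume cites NLTK and deep learning rather than this exact pipeline).

```python
# Hypothetical help-desk ticket classifier: TF-IDF features plus a
# linear model on a handful of made-up tickets. Real deployments would
# train on thousands of labeled requests.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

tickets = [
    "cannot log in to my account", "password reset not working",
    "invoice shows the wrong amount", "billed twice this month",
    "laptop screen is flickering", "keyboard keys not responding",
]
labels = ["access", "access", "billing", "billing", "hardware", "hardware"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(tickets, labels)
pred = clf.predict(["charged twice on the invoice"])
print(pred)
```

A new request is routed to the queue whose vocabulary it most resembles; here "twice" and "invoice" pull the example toward the billing category.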

Confidential

Data Scientist / ML Engineer

Responsibilities:

  • Developed a machine learning customer churn prediction application for internal use by the company's business and marketing teams. Tools: Python, Machine Learning, NumPy, Pandas.
  • Developed a customer fraud detection application using machine learning, trained on over 5k claims records. Tools: Python, Machine Learning.
  • Participated in the complete development of a cloud-based data science application, from exploratory data gathering and processing to producing and visualizing results to maximize the customer experience.
  • Created analytical reports to understand customer insurance claims. Tools: Python, NumPy, Pandas, Matplotlib, Seaborn.
  • Developed a data dashboard for clients to view and analyze their claims data. Tools: Python, NumPy, Pandas, Matplotlib, Ggplot2.
  • Built ML models centered on backorder prediction: one to predict backorders at the country SKU level, using AutoML to select an XGBoost algorithm, and one to predict warehouse stock availability, built with PyCaret.
  • Developed the end-to-end solution for the aforementioned ML models, starting with data file extractions from SQL and SAP BW, then migrating them to the AWS Foresight connector to streamline the process, running the models inside AWS, and warehousing the results in AWS.
  • Worked on predicting country-level demand for contact lenses by lens power, since contact lenses are sold in packages and certain lens powers are used more than others.
  • Created BI dashboard to detail the bell curve for Contact Lens power, undertaking the entire end-to-end solution. (POC, planning, development, testing and validation, and deployment)
  • Developed Qlik Sense dashboards for monitoring demand year over year with different KPIs.
  • Experienced in writing complex SQL queries with joins, as well as user-defined functions and stored procedures, to maintain different aspects of the application.
  • Designed, coded, and modified company website.
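The backorder bullets above cite AutoML, XGBoost, and PyCaret; as a hedged stand-in, the sketch below uses scikit-learn's GradientBoostingClassifier on synthetic SKU-level features (the column names, labeling rule, and figures are made up for illustration and do not reflect the actual feature set).

```python
# Hypothetical backorder classifier: gradient boosting on invented
# SKU features. The label here is a simple deterministic rule so the
# example is self-contained; real training data would come from ERP
# extracts (e.g. SAP BW, as the resume describes).
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 1000
df = pd.DataFrame({
    "on_hand_qty": rng.integers(0, 500, n),
    "lead_time_days": rng.integers(1, 60, n),
    "forecast_demand": rng.integers(0, 600, n),
})
# Toy label: backorder when forecast demand clearly outstrips stock.
df["backorder"] = (df["forecast_demand"] > df["on_hand_qty"] * 1.2).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(
    df.drop(columns="backorder"), df["backorder"], random_state=0)
clf = GradientBoostingClassifier().fit(X_tr, y_tr)
print(f"holdout accuracy: {clf.score(X_te, y_te):.2f}")
```

The same fit/score pattern applies whether the estimator is this scikit-learn model, XGBoost, or one picked by an AutoML search.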

Environment: Python 2.x, CDH5, HDFS, Hadoop 2.3, Hive, Impala, AWS, Linux, Spark, Tableau Desktop, SQL Server 2014, Microsoft Excel, Apache Airflow, SSIS, Matlab, Spark SQL, Azure, Pyspark, Jupyter Notebook.

Confidential

Data Analyst

Responsibilities:

  • Developed dashboards and reports in the insurance domain using BI tools such as QlikView and Tableau; determined company performance on claims and premiums.
  • Team lead on dashboard development using technologies such as D3.js, jQuery, and JavaScript to identify the data flow models of AEL Solvency II and improve the strategic models.
  • Performed deep-dive data analysis on datasets across platforms such as Oracle, MySQL, and Excel, writing complex SQL queries and lookups to identify data points, data features, and KPIs that support the delivery of quality information through dashboards.
  • Integrated database tools, including Oracle and MySQL, with BI tools such as QlikView and Tableau to perform data extraction and manipulation using VB scripts and SQL.
  • Developed a VBA macro to improve and automate the data entry process, increasing the efficiency of data updates.
  • Worked on creating a new relational database centered on Customer Insights by extracting data, working with the end user and helping develop the ETL workflows.
  • Migrated operational financial dashboards from Tableau to MicroStrategy, handling the design, translation, testing, and validation.
  • At the start of the pandemic, we had to restructure our entire database with new rules to help our customers weather the storm. This meant an IT-wide push to change ETL logic, update database information, and test and validate the results. I learned many data validation and testing techniques that I have applied to all my subsequent projects.
  • Engaged with stakeholders in project progress and status reports.
  • Engaged in requirements gathering, project planning, data modeling, development, unit testing, deployment and delivery of the project.
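The complex SQL queries mentioned above center on joining and aggregating claims data; the in-memory sqlite3 example below sketches that style of query on a made-up policies/claims schema (table names, columns, and amounts are hypothetical). The claim amounts are pre-aggregated per policy in a subquery so the join does not double-count premiums for policies with multiple claims.

```python
# Illustrative join-plus-aggregate query over invented insurance
# tables: premiums collected vs. claims paid, grouped by region.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE policies (policy_id INTEGER PRIMARY KEY, region TEXT, premium REAL);
CREATE TABLE claims   (claim_id INTEGER PRIMARY KEY, policy_id INTEGER, amount REAL);
INSERT INTO policies VALUES (1,'West',1200.0),(2,'West',900.0),(3,'East',1500.0);
INSERT INTO claims   VALUES (10,1,300.0),(11,1,450.0),(12,3,2000.0);
""")
rows = con.execute("""
    SELECT p.region,
           SUM(p.premium)                 AS premium_collected,
           COALESCE(SUM(c.claims_paid), 0) AS claims_paid
    FROM policies p
    LEFT JOIN (SELECT policy_id, SUM(amount) AS claims_paid
               FROM claims GROUP BY policy_id) c
           ON c.policy_id = p.policy_id
    GROUP BY p.region
    ORDER BY p.region
""").fetchall()
print(rows)
```

Joining the raw claims table directly would repeat each policy's premium once per claim; aggregating first keeps the premium sum correct.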

Environment: Python, SPSS Modeler and Statistics, Jupyter, IBM Deployment Manager, CIB, Qlik Sense, Tableau, MicroStrategy, Red Hat Linux, Microsoft Server, Social Bakers, Google Analytics, IBM DataStage
