Lead Data Scientist Resume

New York, NY

SUMMARY

  • Senior Data Scientist with over 25 years of IT experience, including over 12 years in Machine Learning and Artificial Intelligence. Ten years of experience in the Financial/Banking domain and eight years in the Insurance domain.

TECHNICAL SKILLS

AI: Machine Learning (supervised and unsupervised), Natural Language Processing.

Tools: Dataiku, Azure Machine Learning Studio, PowerDesigner, Enterprise Architect

Methodologies: Scrum, Agile Development, Rational Unified Process (RUP), SDLC.

Finance: Equities, Derivatives, Portfolio Management, Asset Allocation, Risk Management, Fixed Income Analytics, Trading Systems, MBO, CMO, Market Risk, Credit Risk, Liquidity Risk, Stocks, Option Strategies, Financial Regulatory Compliance.

Compliance: SEC, FINRA, AML, KYC, Dodd-Frank, CCAR, Volcker, Fed, Basel III, Reg YY.

Languages: Python, C++, R, UML, Java, SQL, XML, HTML, Smalltalk, VB, Lisp, Prolog, IDL.

Other: Windows, Unix, J2EE, Oracle, SQL Server, JIRA, HP ALM, BitBucket, Confluence.

PROFESSIONAL EXPERIENCE

Confidential, New York, NY

Lead Data Scientist

Responsibilities:

  • Created Dataiku Machine Learning models that predicted quarterly revenue of oil drilling companies to within 3% error (beating the 5% error of Wall Street analysts).
  • Predicted major economic indicators using ML, based on detailed credit data from Equifax.
  • Loaded, cleaned, and validated datasets (crude oil production, prices of products distributed by the largest US retailers, credit report data) that were sold to hedge funds for predicting equity prices.
  • Analyzed sensitive data using Python and LeapYear Differential Privacy software.
  • Predicted pharma trends based on prescription drugs and medical claims data.
  • Made predictions based on credit and delinquency data.
  • Predicted seasonality and production volume of natural gas using Time Series Analysis.
  • Evaluated various AI/ML tools (MS Azure Machine Learning Studio, Dataiku).
  • Designed an ML/NLP approach for extracting an item's price and manufacturer from web pages.
  • Created Neural Networks using TensorFlow and Keras.
  • Designed and created tables in a SQL Server database.

Technology: Data Science, Machine Learning, Artificial Intelligence, Python, SQL, Dataiku, Azure Machine Learning Studio, Natural Language Processing (NLP), Neural Networks, Jupyter, Differential Privacy, Keras

Confidential, New York, NY

Lead Data Scientist

Responsibilities:

  • Gathered requirements and designed a Cognitive Automation system that included:
      • Machine learning wrapper on top of Supervised and Unsupervised models used for classification. The wrapper implemented auto-tuning of hyper-parameters (using Grid Search and Randomized Search) and sampling (k-fold cross-validation).
      • Implementation of OLS, Ridge, LASSO, and Elastic Net Regression algorithms.
      • Recommendation engine using Apriori (based on Association Rules Learning) and Collaborative Filtering ALS (Alternating Least Squares) approaches.
      • Text Analytics Engine for Chatbots, Virtual Assistants, and Sentiment Analysis.
      • Time Series Analysis using ARIMA, Auto ARIMA, and LSTM models.
      • Image recognition using a convolutional neural network (CNN) implemented in Keras and TensorFlow.
  • As Team Lead, coordinated the work of 25 team members across 4 locations.
  • Implemented a Word Feature Engine in Python: a Natural Language Processing engine for processing, analyzing, and manipulating text data. Applied Topic Modelling with LDA (Latent Dirichlet Allocation), LSA (Latent Semantic Analysis), and SVD. Implemented word embedding models (Word2Vec and GloVe) using the gensim library.
  • Analyzed the results of a document classification system that used TF-IDF, N-gram modelling, stemming, and lemmatization. Created a confusion matrix and a statistical analysis of the results. Developed an enhanced machine learning algorithm and applied feature engineering that improved the accuracy of the classification algorithms.
  • Documented requirements for all systems, created UAT test cases, and created a traceability matrix.

Technology: Machine Learning, Artificial Intelligence, Python, Hadoop, Big Data, MapReduce, Natural Language Processing (NLP), Natural Language Generation (NLG), Image Processing, Pattern Recognition, Neural Networks, TensorFlow, Keras, scikit-learn, TF-IDF, Jupyter.
