ML/Research Engineer Resume
Franklin Lakes, NJ
SUMMARY
- High-performing practitioner with over 7 years of hands-on experience in data munging, machine learning, artificial intelligence, and operations research, offering solid skills in data science and big data analytics
- Passionate about new challenges and continuously pushing the limits of expertise
- Experienced in working with business users, product owners, and stakeholders to deliver BI solutions, self-service reports, and statistical and predictive analyses
- Managed multiple complex projects spanning big data, machine learning, text mining, ETL, BI, data governance, data quality, security and compliance, and industry-standard best practices
- Extracted data from various database sources such as Oracle and SQL Server
- Experienced in common data scrubbing and preparation techniques: feature selection, one-hot encoding, binning, normalization, standardization, and handling missing data
- Deep understanding of statistical concepts and techniques such as probability, likelihood, hypothesis testing, A/B testing, interpreting p-values, t-tests, ANOVA, and ARIMA
- Explored raw data through exploratory data analysis (classification, train/test splitting, cross-validation) using Python packages such as pandas and NumPy
- Assessed the condition of the data with histograms, bar plots, pie charts, scatter plots, and box plots using the Matplotlib, Seaborn, and Plotly packages (a brief sketch of this EDA workflow follows the summary)
- Good knowledge of big data techniques such as MapReduce, Hadoop, and Hive
- Skilled in data ingestion and in automating the secure movement of data between disparate sources and systems, including real-time dataflow management, streaming analytics, data lake integration, and data demand planning
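A minimal sketch of the EDA workflow referenced above; the `transactions.csv` file and its column names are hypothetical stand-ins, not an actual project dataset:

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical dataset; the file and column names are illustrative assumptions.
df = pd.read_csv("transactions.csv")

# Assess the condition of the data: shape, dtypes, and missing values.
print(df.shape)
print(df.dtypes)
print(df.isna().sum())

# Distribution checks on a numeric feature: histogram plus box plot.
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
sns.histplot(df["amount"], bins=30, ax=axes[0])
sns.boxplot(x=df["amount"], ax=axes[1])
plt.tight_layout()
plt.show()
```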
TECHNICAL SKILLS
Data Science Toolbox: R, Python 2.7/3.x, SAS, MATLAB, Jupyter Notebook, Spyder, Visual Studio Code, PyCharm, TensorFlow, Keras
Databases: Oracle, MS Access, Microsoft SQL Server 2012/2014, Hive, MongoDB
Machine Learning: classification, clustering, and regression techniques; supervised and unsupervised learning; time series and forecasting; deep learning and neural networks (ANN, CNN, RNN, LSTM); NLP (NLTK); image processing; computer vision
Data Modeling Tools: Erwin, ER Studio, snowflake-schema modeling, fact and dimension tables, pivot tables
BI Tools: Tableau 10.5, Power BI, Kibana, Crystal Reports, plotly.io, seaborn, Matplotlib
Languages: SQL, PL/SQL, XML, R, Python, C++, Java, HTML, UNIX shell scripting
Applications: Toad for Oracle, Oracle SQL Developer, MS Word, MS Excel, MS Project, MS PowerPoint, Teradata
Big Data: Hadoop, Scala, PySpark, Hive, MapReduce, Sqoop, Oozie
Operating Systems: Microsoft Windows 9x/NT/2000/XP/Vista/7/10, UNIX, macOS
Methodologies: Agile, System Development Life Cycle (SDLC), Waterfall Model, CRISP-DM, Data science lifecycle
Cloud Services: AWS (SageMaker, Lambda, EC2, RDS, ALB/NLB, S3, Glue, Kinesis Firehose); GCP (BigQuery, AutoML, Dataproc, Dataflow, Cloud Build, GKE)
Integration Tools: DevOps, Docker, Kubernetes, Jenkins, Ansible, PyMongo, API gateways, RESTful APIs
PROFESSIONAL EXPERIENCE
ML/Research Engineer
Confidential, Franklin Lakes, NJ
Responsibilities:
- Gathered and scoped business requirements and project goals with stakeholders and managers, uncovering and defining multiple dimensions of the problem
- Interacted continuously with the data engineering team on data acquisition, data sources, data demand planning, and data quality
- Delivered executive-level dynamic dashboards and KPI metrics covering user engagement and fraud analytics using Kibana
- Utilized GCP big data tools for MLOps, such as BigQuery and Dataproc, to streamline data lakes, and AutoML to automate the model-building process
- Implemented a clustering algorithm on the customer database to feed a classification algorithm that determines whether a transaction is fraudulent
- Built a basic k-means clustering model to investigate the underlying factors that define customer behavior, purpose of sale, and transaction life cycle (a minimal sketch follows this section)
- Used ensemble methods, k-fold cross-validation, parameter tuning, and retraining, and pipelined the models for customer behavior analysis, data credibility assessment, and unusual-transaction detection (see the cross-validation sketch after this section)
- Leveraged A/B testing and cross-validation techniques to detect false positives and score model performance
- Exposed the models' prediction functionality to the web development team using joblib, a REST API, API gateways, and Flask (see the serving sketch after this section)
- Performed continuous integration/continuous delivery (CI/CD) of scaled models to production using Jenkins and Ansible
- Containerized and orchestrated CI/CD workloads using DevOps tools such as Docker (images and containers) and Kubernetes with Cloud Build and GKE
- Followed the Cross-Industry Standard Process for Data Mining (CRISP-DM) lifecycle for data collection and processing under a Scrum methodology
- Communicated results and findings to stakeholders and business users
Tools/Environments: MLOps, A/B testing, GCP, AutoML, BigQuery, Scala, Dataproc, Cloud Build, GKE (Google Kubernetes Engine), TensorFlow, REST API, Flask, joblib, CI/CD (DevOps): Jenkins, Docker, Kubernetes, Ansible, Elastic Beanstalk, CRISP-DM
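A minimal sketch of the k-means segmentation referenced above; the synthetic features, their names, and the choice of k are illustrative assumptions, not the production model:

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for customer transaction features; column names are assumptions.
rng = np.random.default_rng(42)
customers = pd.DataFrame({
    "avg_amount": rng.gamma(2.0, 50.0, size=500),
    "txn_count": rng.poisson(20, size=500),
    "days_since_last": rng.integers(0, 90, size=500),
})

# Standardize so no single feature scale dominates the Euclidean distances.
scaled = StandardScaler().fit_transform(customers)

# k=4 is illustrative; in practice k would be chosen via the elbow method or silhouette score.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=42)
customers["segment"] = kmeans.fit_predict(scaled)

# Segment labels can then feed the downstream fraud classifier as a feature.
print(customers.groupby("segment").mean())
```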
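Likewise, a hedged sketch of the k-fold cross-validation and parameter-tuning workflow; the synthetic data, parameter grid, and scoring metric are assumptions for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Synthetic stand-in for the transaction feature matrix, imbalanced like fraud labels.
X, y = make_classification(n_samples=2000, n_features=10, weights=[0.95], random_state=42)

# 5-fold stratified CV with a small illustrative parameter grid.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
grid = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid={"n_estimators": [100, 300], "max_depth": [5, 10, None]},
    scoring="f1",  # balances precision and recall on the rare fraud class
    cv=cv,
)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```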
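And a minimal sketch of serving a trained model behind a Flask REST endpoint with joblib; the artifact name, route, and payload shape are hypothetical:

```python
import joblib
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical artifact produced earlier with joblib.dump(model, "fraud_model.joblib").
model = joblib.load("fraud_model.joblib")

@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON body like {"features": [[0.3, 12.5, 1.0]]}.
    payload = request.get_json(force=True)
    prediction = model.predict(payload["features"])
    return jsonify({"prediction": prediction.tolist()})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```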
Data Scientist
Confidential, Silver Spring, MD
Responsibilities:
- Part of a data science/ML team that drives efforts around risk management, regulatory compliance, and operational efficiency
- Applied best practices and machine learning techniques to identify optimal modeling approaches, and designed, built, and partnered to implement models that address business problems
- Implemented predictive models for current business needs and performed what-if analyses on model structure
- Manipulated, processed, and analyzed large volumes of data using Python (pandas) and SQL
- Utilized Apache Spark through Scala, the PySpark API, and the Py4J library to analyze large chunks of data (a PySpark sketch follows this section)
- Leveraged Athena, Glue, Kinesis, and EC2 to streamline a data lake stored in Amazon S3
- Implemented algorithms for regression analysis, trend finding, user engagement, forecasting, and statistical analysis of high-dimensional data
- Designed easy-to-follow visualizations in Tableau Desktop and published dashboards to Tableau Online
- Resolved problems through rigorous analytic techniques from machine learning and text mining
- Developed a predictive customer churn model to identify customers with a high likelihood of disconnecting from the network, using neural network and logistic regression models (a baseline sketch follows this section)
- Took part in a proof-of-concept project to understand vendor contracts using computer vision (CV) and optical character recognition (OCR)
- Accessed MongoDB using the PyMongo driver library; created and maintained aggregation pipelines with stages such as $match, $group, $bucket, and $facet (see the aggregation sketch after this section)
- Continuously load-balanced, scaled, monitored, and updated the model service/microservice infrastructure to keep the prediction API in production
- In-depth expertise in statistical procedures such as parametric and non-parametric tests, hypothesis testing, ANOVA, and interpreting p-values
Tools/Environments: MongoDB, PyMongo, JSON, Scala, PySpark, Amazon Web Services (AWS): SageMaker, EC2, S3, RDS, ALB/NLB, Lambda, Athena, Glue, Kinesis Firehose; APIs, Tableau, Python (scikit-learn); statistical procedures: hypothesis testing, ANOVA, p-values
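A minimal PySpark sketch of the kind of large-scale aggregation referenced above; the S3 path and column names are hypothetical:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("transaction-analysis").getOrCreate()

# Hypothetical Parquet location and schema; names are illustrative assumptions.
df = spark.read.parquet("s3://example-bucket/transactions/")

# Aggregate spend per customer and flag high-volume accounts.
summary = (
    df.groupBy("customer_id")
      .agg(F.sum("amount").alias("total_spend"),
           F.count("*").alias("txn_count"))
      .withColumn("high_volume", F.col("txn_count") > 100)
)
summary.show(10)
```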
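A hedged baseline sketch of the churn model described above, using logistic regression on synthetic stand-in data; in practice a neural network would be benchmarked against the same split:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for subscriber features; churners are the rare positive class.
X, y = make_classification(n_samples=5000, n_features=12, weights=[0.85], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# Scaled logistic regression baseline for churn probability.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```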
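Finally, a minimal sketch of a PyMongo aggregation pipeline like the ones referenced above; the connection string, database, collection, and field names are hypothetical:

```python
from pymongo import MongoClient

# Hypothetical connection string, database, and collection names.
client = MongoClient("mongodb://localhost:27017")
orders = client["sales"]["orders"]

# Pipeline: filter completed orders, total revenue per customer, top 10.
pipeline = [
    {"$match": {"status": "completed"}},
    {"$group": {"_id": "$customer_id", "revenue": {"$sum": "$amount"}}},
    {"$sort": {"revenue": -1}},
    {"$limit": 10},
]
for doc in orders.aggregate(pipeline):
    print(doc)
```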