Machine Learning / Artificial Intelligence Engineer Resume
SUMMARY:
- A Data Scientist and Machine Learning Engineer with a solid background in Computer Science and the Software Development Life Cycle.
- Experience working in a fast-paced, data-driven setting, fostering collaboration to maintain the edge necessary for delivering products and services built on innovative technologies such as Artificial Intelligence and Machine Learning.
- Natural knack for diving deep into a problem to find a creative solution, with a roll-up-your-sleeves attitude.
TECHNICAL SKILLS:
Machine Learning Algorithms: Linear regression, SVM, KNN, Naive Bayes, Logistic Regression, Random Forest, Boosting, K-means clustering, Hierarchical clustering, Collaborative Filtering, Neural Networks, NLP
Analytic Tools: R 2.15 / 3.0 (reshape, ggplot2, dplyr, car, MASS, and lme4), Excel, Data Studio
Programming Languages: R 3.X, Python 2.X & 3.X (NumPy, SciPy, pandas, seaborn, Beautiful Soup, scikit-learn, NLTK), SQL, C
Databases: PostgreSQL, Oracle 11g, MySQL, SQL Server, MongoDB, Neo4j
Big-Data Frameworks: Hadoop Ecosystem 2.X (HDFS, MapReduce, Hive 0.11, HBase 0.9), Spark Framework 2.X (Scala 2.X, Spark SQL, PySpark, SparkR, MLlib)
Data Visualization: Tableau 8.0 / 9.2 / 10.0, Plotly, R-ggplot2, Python-Matplotlib, Logi Analytics
Version Control: Git 2.X
Operating Systems: UNIX, macOS, Windows
PROFESSIONAL EXPERIENCE:
Confidential
Machine Learning / Artificial Intelligence Engineer
Responsibilities:
- Led the effort to create an Enterprise Data Lake on Azure Cloud, consolidating various data sources from GCP and AWS to establish a ‘single version of truth’ and enable efficient analytics capabilities.
- Led a team of Data Scientists and Engineers to develop and productionize multiple versions of Machine Learning and Natural Language Processing models in novel microservices architectures for batch scoring and real-time API serving of predictions.
- Implemented novel iterative development procedures on a JupyterLab-based IDE (AI Notebooks).
- Used Data Flow and Python to build dynamic data workflow pipelines serving various experiments.
- Led a team of data scientists and engineers to deploy ML pipelines in production using Docker and Kubeflow, and wrote automated test cases to maintain and monitor them.
- Managed resource roles and utilization through IAM and Terraform across development, test, and production environments.
- Acted as Scrum Master, running and documenting collaboration mechanisms such as stand-ups and sprints according to Agile development principles.
- Developed a new data schema for the data consumption store used by the Machine Learning and AI models, reducing processing time, using SQL, Hadoop, and Cloud services.
- Developed novel and efficient reporting architectures to report KPIs to relevant internal and external stakeholders using Google Data Studio and Tableau Server.
Tool Stack: Agile, Python 2.7, 3.1, Airflow, AWS, Azure Cloud, Databricks, Spark, PySpark, PyTorch, SQL, Zeppelin, Docker, Kubernetes, Tableau
Confidential
Senior Data Scientist / Sr. Solutions Architect
Responsibilities:
- Part of an innovative and diverse team of data scientists, engineers, developers, and product managers that strategically capitalizes on the rich data ecosystem and leverages ML and AI technologies to better serve the customer.
- Lead multiple AI and Machine Learning programs within our product suites. Programs include a Predictive Analytics service (baselining and forecasting of performance and security KPIs) and Security Analytics (anomaly detection service, clustering of devices based on behavior over time), using NLP, LSTM, Kubeflow, Docker, AWS SageMaker, and AWS Greengrass.
- Manage full range of project management activities - approve project plans, prioritize activities, manage engagements, assign resources, plan budget, manage expectations and align stakeholders.
- Manage data storage and processing pipelines in GCP for serving AI and ML services in production, development, and testing using SQL, Spark, Python, and AI VM.
- Research and document advanced solutions architecture for other enterprise customers including building CI/CD pipelines and a robust monitoring system for model performance.
- Develop and create extensive BI capabilities to leverage our data and add value to customers' IT.
- Project Athena: leveraging GCP Machine Learning and Artificial Intelligence to develop a predictive analytics engine served through an API as a microservices-based, containerized, fully scalable service.
- Predictive maintenance and behavior modeling for manufacturing and health care IoT data using LSTMs.
- Microservices-based architecture for Machine Learning based applications, including an XGBoost implementation.
Tool Stack: Git, Jira, Slack, Airflow, Python 2.7, 3, TensorFlow, Keras, FB Prophet, SQL, PySpark, Spark SQL, GCP, AWS, Lambda, SageMaker, Kubernetes, Docker, Tableau, Logi Analytics, REST API
Confidential
Data Scientist / Product Owner
Responsibilities:
- Worked on price optimization, offer optimization for marketing, and order sorting use cases, and assisted with IoT use cases (beacons).
- Led and owned all steps of the Machine Learning service development lifecycle, from exploratory analysis to deployment of the model in production.
- Conducted data preprocessing, cleaning, and filtering with pandas, along with exploratory analysis and data integrity analysis.
- Analyzed customer behavior and value using uplift models; implemented K-means and uplift random forest with Hadoop MapReduce.
- Performed time series analysis of data using Spark, ensuring scale and speed, to detect events, establish thresholds for modeling behavior, and predict anomalous events.
- Designed and implemented new data collection strategies for newly added data sources to be integrated into the analytics data warehouse in the cloud, using both Flume and Kafka to ensure ease of use on both ends.
- Developed predictive data models for forecasting, measured their effectiveness with A/B tests, and gathered feedback to improve the models.
- Developed and presented the results and workings of our model to executives as detailed reports and dashboards before production.
- Evaluation of marketing strategies using quality analytics.
- Development and maintenance of customer satisfaction index.
Tool Stack: R 3.X, Python 3.X, AWS, GCP, MS SQL Server 2008, Oracle 12c, HDFS, MongoDB 3.3, Spark 1.6 / 2.0 (SparkR, MLlib, Spark SQL), Tableau 10.0, Git 2.X
Confidential
Vice President - Data Analytics
Responsibilities:
- Recruited, built, and managed a data team of 20+ data analysts, engineers and developers.
- Led 13 agile product deliveries with ~50 resources to develop, integrate, and deliver creative data solutions meeting business requirements, strategic use cases, and products, e.g. optimized portfolio modeling, targeted marketing, and KPI forecasting.
- Defined and implemented a digital strategy resulting in a 15% increase in digital transactions and a 10% increase in conversions.
- Re-engineered the reporting architecture of the product with data visualization; drove digitization and self-service to enable actionable intelligence and improve decision-making TAT by 70%.
- Engineered a Data Warehouse from various data sources in a SQL-based relational database, with further data management to assist reporting.
- Implemented data governance policies, controls, standards/procedures to meet regulatory standards.
- Acted as a Data Evangelist inside the organization to improve the data culture and establish effective partnerships with IT and operations.
Environment: MS Excel, PowerPoint, Access, Outlook, SPSS, MS Visio, R, Tableau, SQL Server MSDB 2, SSAS; these and other data sources were integrated for modeling and analytics.