Data Scientist Resume
Seattle, WA
SUMMARY
- 4 years of experience in data analysis, machine learning, and data mining with large datasets of structured and unstructured data.
- Strong skills in machine learning algorithms such as Linear Regression, Logistic Regression, Naive Bayes, Decision Trees, Random Forests, Support Vector Machines, K-Nearest Neighbors, K-means Clustering, Neural Networks, and Ensemble Methods using Python with Pandas, NumPy, and scikit-learn.
- Proficient in Predictive Modeling, ANOVA, Hypothesis testing, A/B testing, and advanced statistical techniques.
- Proficient at building robust Machine Learning and Deep Learning models, including Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), using TensorFlow, Keras, and PyTorch in Python.
- Worked on Natural Language Processing (NLP) tasks (topic modeling, sentiment analysis, text classification) using bag-of-words, n-grams, TF-IDF, Word2Vec, Doc2Vec, NLTK, spaCy, and Gensim.
- Knowledge of cloud services such as Microsoft Azure and Amazon Web Services (AWS).
- Adept at analyzing large datasets using Apache Spark, PySpark, Spark ML, and AWS; knowledge of Big Data ecosystem components such as Hadoop, MapReduce, Pig, and Hive.
- Knowledge of SQL database servers and NoSQL databases such as HBase, Cassandra, and MongoDB.
- Worked with web scraping tools such as Scrapy and Beautiful Soup in Python to extract data from websites.
- Hands-on experience with Python web frameworks such as Django and Flask, and the Bootstrap front-end framework.
- Experience with visualization tools such as Tableau and Plotly for web apps, and Matplotlib and Seaborn in Python.
- Knowledge of and experience with Git/GitHub version control.
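As an illustration of the NLP skills above, TF-IDF weighting over a bag-of-words representation can be sketched in plain Python (the toy documents are invented for the example; in practice NLTK or scikit-learn's TfidfVectorizer would be used):

```python
import math
from collections import Counter

def tf_idf(docs):
    """Compute TF-IDF weights for a list of tokenized documents.

    TF is raw term frequency normalized by document length; IDF is
    log(N / df), where df is the number of documents containing the term.
    """
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each term once per document
    weights = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc)
        weights.append({t: (c / total) * math.log(n / df[t])
                        for t, c in tf.items()})
    return weights

# Toy corpus: a term common across documents ("offer") gets a lower
# IDF than a term unique to one document ("spam").
docs = [["spam", "offer", "offer"], ["meeting", "notes"], ["offer", "meeting"]]
w = tf_idf(docs)
```

The resulting per-document dictionaries feed directly into a classifier such as Logistic Regression after vectorization.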
PROFESSIONAL EXPERIENCE
Confidential, Seattle, WA
Data Scientist
Responsibilities:
- Performed data manipulation, data preparation, normalization, and predictive modeling; improved model efficiency and accuracy through evaluation in Python.
- Implemented, tuned, and tested models on Amazon SageMaker, Jupyter Notebooks on EC2, and Microsoft Azure, selecting the best-performing algorithm and parameters.
- Used Python and R to refine models; upgraded the models to improve the product.
- Worked with data formats such as JSON and XML and applied machine learning algorithms in Python.
- Designed, built, and deployed a set of Python modeling APIs for customer analytics that integrate multiple machine learning techniques for user behavior prediction and support multiple marketing segmentation programs.
- Validated the machine learning classifiers using ROC Curves and Lift Charts.
- Presented dashboards built in Power BI to senior management for deeper insights.
- Used classification techniques including Random Forest and Logistic Regression to quantify the likelihood of each user making a referral.
- Applied boosting methods to the predictive model to improve its efficiency.
- Designed and implemented end-to-end systems for data analytics and automation, integrating custom visualization tools using R, Tableau, and Power BI.
- Collaborated with project managers and business owners to understand their organizational processes and help design the necessary reports.
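The ROC-curve validation mentioned above can be sketched with a minimal AUC computation (a hand-rolled illustration, not the project's actual code; scikit-learn's `roc_auc_score` computes the same quantity):

```python
def roc_auc(y_true, scores):
    """ROC AUC via the Mann-Whitney U statistic: the probability that a
    randomly chosen positive example is scored above a randomly chosen
    negative one (ties count as half a win)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# A ranking with one negative scored above one positive: AUC = 0.75.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation, which is why it is a convenient single-number summary alongside the full ROC curve and lift chart.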
Confidential, Seattle, WA
Data Scientist
Responsibilities:
- Participated in all phases of data acquisition, data cleaning, model development, validation, and visualization using Python (Pandas, NumPy, scikit-learn) and R to deliver data science solutions.
- Developed models to accurately detect instances of fraud for further action using Logistic Regression, Decision Trees, Random Forests, and Neural Networks provided by scikit-learn in Python.
- Designed and implemented cross-validation and statistical tests, including k-fold, stratified k-fold, and hold-out schemes, to test and verify the models' significance.
- Designed, built, and deployed a set of Python modeling APIs for customer analytics that integrate multiple machine learning techniques for user behavior prediction and support multiple marketing segmentation programs.
- Validated the machine learning classifiers using ROC Curves and Lift Charts.
- Segmented customers based on demographics using K-means Clustering.
- Explored different regression and ensemble models in machine learning to perform forecasting.
- Presented dashboards built in Power BI to senior management for deeper insights.
- Used classification techniques including Random Forest and Logistic Regression to quantify the likelihood of each user making a referral.
- Applied boosting methods to the predictive model to improve its efficiency.
- Designed and implemented end-to-end systems for data analytics and automation, integrating custom visualization tools using R, Tableau, and Power BI.
- Collaborated with project managers and business owners to understand their organizational processes and help design the necessary reports.
- Articulated business questions and used mathematical techniques to arrive at answers from the available data.
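The K-means segmentation above can be illustrated with a minimal Lloyd's-algorithm sketch in plain Python (toy 2-D points stand in for demographic features; scikit-learn's `KMeans` is the usual tool in practice):

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Minimal Lloyd's algorithm: assign each point to its nearest
    centroid, then move each centroid to the mean of its cluster."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda i: sum((a - b) ** 2
                                            for a, b in zip(p, centroids[i])))
            clusters[nearest].append(p)
        # Empty clusters keep their previous centroid.
        centroids = [tuple(sum(xs) / len(xs) for xs in zip(*cl)) if cl
                     else centroids[i] for i, cl in enumerate(clusters)]
    return centroids

# Two well-separated groups of "customers" in 2-D feature space.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers = sorted(kmeans(pts, 2))
```

With well-separated groups the centroids converge to the two cluster means, and each customer's segment is simply the index of the nearest centroid.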
Confidential
Jr Data Analyst
Responsibilities:
- Extensively involved in all phases of data acquisition, data collection, data cleaning, model development, model validation and visualization to deliver data science solutions.
- Built machine learning models to identify whether a user is legitimate using real-time data analysis, and prevented fraudulent transactions by applying supervised learning to the history of customer transactions.
- Extracted data from a SQL Server database, copied it into HDFS, and used Hadoop tools such as Hive and Pig Latin to retrieve the data required for building models.
- Performed data cleaning, including transforming variables and handling missing values, and ensured data quality, consistency, and integrity using Pandas and NumPy.
- Tackled a highly imbalanced fraud dataset using sampling techniques such as undersampling and oversampling with SMOTE (Synthetic Minority Over-sampling Technique) using Python's scikit-learn.
- Utilized PCA, t-SNE, and other feature engineering techniques to reduce high-dimensional data; applied feature scaling and handled categorical attributes using the one-hot encoder from the scikit-learn library.
- Developed various machine learning models such as Logistic regression, KNN, and Gradient Boosting with Pandas, NumPy, Seaborn, Matplotlib, Scikit-learn in Python.
- Identified and evaluated various distributed machine learning libraries such as Mahout, MLlib (Apache Spark), and R.
- Worked on Amazon Web Services (AWS) cloud services to do machine learning on big data.
- Developed Spark Python modules for machine learning & predictive analytics in Hadoop.
- Implemented a Python-based distributed random forest via PySpark and MLlib.
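The SMOTE oversampling used for the imbalanced fraud data can be sketched minimally in plain Python (toy 2-D minority points; the imbalanced-learn library's `SMOTE` is the usual production implementation):

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Generate synthetic minority samples: pick a random minority point,
    pick one of its k nearest minority neighbors, and interpolate at a
    random fraction along the line segment between them."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        p = rng.choice(minority)
        neighbors = sorted(
            (q for q in minority if q is not p),
            key=lambda q: sum((a - b) ** 2 for a, b in zip(p, q)),
        )[:k]
        q = rng.choice(neighbors)
        lam = rng.random()  # interpolation fraction in [0, 1)
        synthetic.append(tuple(a + lam * (b - a) for a, b in zip(p, q)))
    return synthetic

# Oversample a small minority class before training the fraud classifier.
minority = [(0, 0), (1, 0), (0, 1), (1, 1)]
extra = smote(minority, 5)
```

Because each synthetic point lies between two real minority points, SMOTE densifies the minority region instead of merely duplicating rows, which is what distinguishes it from naive random oversampling.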