
Data Scientist Resume


CA

SUMMARY

  • 8 years of experience in Data Science and Analytics including Machine Learning, Data Mining and Statistical Analysis.
  • Experience with Python libraries including NumPy, Pandas, SciPy, Matplotlib, Seaborn, TensorFlow, and Keras, as well as the R library ggplot2.
  • Experience in performing data analysis in IDEs such as Jupyter Notebook and Spyder.
  • Good experience in Text Analytics, generating data visualizations using R and Python, and creating dashboards using tools such as Tableau.
  • Good working knowledge of the SAS language for validating data and generating reports.
  • Good knowledge of the entire Data Science life cycle, including Data Acquisition, Data Preparation, Data Manipulation, Validation, and Visualization.
  • Knowledge and experience in extracting information from text data using Natural Language Processing (NLP) methods.
  • Proficient in Machine Learning, Data/Text Mining, Statistical Analysis & Predictive Modeling.
  • Understanding of Data Warehousing principles such as Fact Tables, Dimension Tables, and Dimensional Data Modelling - Star Schema and Snowflake Schema.
  • Participated in feature engineering such as feature interaction generation, feature normalization, and label encoding with Scikit-learn preprocessing (see the preprocessing sketch after this list).
  • Proficient in Statistical Methodologies including Hypothesis Testing, ANOVA, Time Series, and Cluster Analysis.
  • Worked with databases including MongoDB (NoSQL) and PostgreSQL (relational).
  • Experienced in query optimization, execution-plan analysis, and performance tuning of SQL queries.
  • Experience in creating and implementing interactive charts, graphs, and other user interface elements; developed analytical reports for monthly risk-management presentations to senior management.
  • Good working knowledge of, and thorough exposure to, compatibility issues across different versions of browsers such as Internet Explorer, Mozilla Firefox, and Google Chrome.
  • Hands-on experience with the Git version control system and GitHub.
  • Hands-on experience with various operating systems, including Windows and Linux.
  • Good analytical, problem-solving, communication, and interpersonal skills, with the ability to interact with individuals at all levels.
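
A minimal sketch of the Scikit-learn preprocessing referenced above; the DataFrame and column names are hypothetical examples, not taken from any actual project:

    # Feature normalization, label encoding, and a simple interaction feature.
    import pandas as pd
    from sklearn.preprocessing import StandardScaler, LabelEncoder

    df = pd.DataFrame({
        "income": [42000.0, 58000.0, 31000.0, 77000.0],    # numeric feature
        "segment": ["retail", "premium", "retail", "web"],  # categorical feature
    })

    # Scale the numeric feature to zero mean and unit variance.
    df["income_scaled"] = StandardScaler().fit_transform(df[["income"]]).ravel()

    # Encode the categorical feature as integer labels.
    df["segment_encoded"] = LabelEncoder().fit_transform(df["segment"])

    # Generate a simple interaction (cross) feature from the two above.
    df["income_x_segment"] = df["income_scaled"] * df["segment_encoded"]

    print(df)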

TECHNICAL SKILLS

  • Methodologies: SDLC, Agile, Waterfall
  • Languages: Python, R, SAS, PL/SQL
  • Libraries: TensorFlow, Keras, Matplotlib, NumPy, Seaborn, Scikit-learn, ggplot2
  • IDEs: Jupyter Notebook, Spyder
  • Analytics: NLP, Text Mining, Regression, Clustering, Time Series, Hypothesis Testing, ANOVA
  • Data Warehousing: OLAP, Data Warehouse
  • Databases: PostgreSQL, MongoDB, Oracle
  • Visualization & Modeling Tools: Tableau, MS Visio
  • Version Control: Git, GitHub
  • Operating Systems: Windows, Linux

PROFESSIONAL EXPERIENCE

Confidential, CA

Data Scientist

Responsibilities:

  • Worked in an Agile environment, with the ability to accommodate and test newly proposed changes at any point during a release.
  • Coded, tested, debugged, implemented, and documented data-processing routines in Python.
  • Worked with development IDEs such as Jupyter Notebook and Spyder.
  • Worked on data cleaning and ensured data quality, consistency, and integrity using Pandas and NumPy (a cleaning sketch follows this list).
  • Performed mathematical and numerical modeling, time series analysis, data mining, and machine learning, with applications in engineering and oceanography.
  • Applied various data mining techniques: linear regression, logistic regression, classification, and clustering (see the logistic regression sketch after this list).
  • Actively contributed to various phases of the project, from data collection across a variety of sources through cleaning, aggregation, dimensionality reduction, and statistical modelling.
  • Visualized and presented dashboards to stakeholders using Tableau and ggplot2, utilizing various plotting techniques.
  • Performed data imputation using the Scikit-learn package in Python (see the imputation sketch after this list).
  • Performed network utilization analytics using clustering, regression, and ANOVA models.
  • Imported customer data into Python using the Pandas library and performed data analysis, finding patterns in the data that helped drive key decisions for the company.
  • Visualized data using the Matplotlib and Seaborn libraries in Python to detect outliers, find missing values, and interpret variables; stored the cleaned customer and transaction data in MongoDB.
  • Wrote complex SQL queries in Oracle and PostgreSQL and developed data models for data analysis and extraction.
  • Built database mapping classes for NoSQL databases such as MongoDB.
  • Moved the data science ecosystem into Git version control to track changes across teams.
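
A minimal sketch of the Pandas/NumPy cleaning step referenced above; the file and column names are hypothetical:

    # Basic data cleaning: duplicates, bad numerics, and sentinel values.
    import numpy as np
    import pandas as pd

    df = pd.read_csv("customers.csv")  # hypothetical input file

    # Drop exact duplicates and rows missing the key identifier.
    df = df.drop_duplicates().dropna(subset=["customer_id"])

    # Coerce a numeric column; unparseable values become NaN.
    df["balance"] = pd.to_numeric(df["balance"], errors="coerce")

    # Treat sentinel codes as missing values.
    df["age"] = df["age"].replace({-1: np.nan, 999: np.nan})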
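
A minimal sketch of one of the listed techniques, logistic regression with Scikit-learn; the data here is synthetic:

    # Logistic regression on a synthetic binary classification problem.
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=200, n_features=4, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    model = LogisticRegression().fit(X_train, y_train)
    print("test accuracy:", model.score(X_test, y_test))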
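
And a minimal sketch of the Scikit-learn imputation step; the feature matrix is a toy example:

    # Missing-value imputation with SimpleImputer (column-mean strategy).
    import numpy as np
    from sklearn.impute import SimpleImputer

    X = np.array([[1.0, 2.0],
                  [np.nan, 3.0],
                  [7.0, np.nan]])

    imputer = SimpleImputer(strategy="mean")
    print(imputer.fit_transform(X))  # NaNs replaced by column means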

Confidential

Jr. Data Scientist

Responsibilities:

  • Worked in an Agile environment, with the ability to accommodate and test newly proposed changes at any point during a release.
  • Worked on customer segmentation using an unsupervised learning technique, clustering (see the k-means sketch after this list).
  • Utilized a variety of supervised and unsupervised machine learning algorithms to perform NLP tasks and compared their performance.
  • Involved in the entire data science project life cycle, including data extraction, data cleaning, statistical modeling, and data visualization, with large sets of structured and unstructured data.
  • Used SAS for pre-processing data, SQL queries, data analysis, report generation, graphics, and statistical analyses.
  • Involved in designing data models such as the star schema to incorporate the fact and dimension tables.
  • Created deep learning models using TensorFlow and Keras, combining all tests into a single normalized score to predict residency attainment of students (see the Keras sketch after this list).
  • Performed data wrangling to clean, transform, and reshape data utilizing the NumPy and Pandas libraries.
  • Performed data validation and cleansing of staged input records before loading them into the Data Warehouse.
  • Identified and executed process improvements; hands-on with various technologies such as Oracle.
  • Working knowledge of SourceTree with GitHub.
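
A minimal sketch of clustering-based customer segmentation; the features and the choice of k are hypothetical:

    # Customer segmentation with k-means clustering.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import StandardScaler

    # Hypothetical per-customer features: [annual_spend, visits_per_month]
    X = np.array([[500, 2], [5200, 8], [480, 3], [6100, 9], [2500, 5]])

    # Scale features so neither dominates the distance metric, then cluster.
    X_scaled = StandardScaler().fit_transform(X.astype(float))
    labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_scaled)
    print(labels)  # cluster id (segment) per customer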
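
A minimal sketch of a TensorFlow/Keras model of the kind described above; the architecture, the number of test scores, and the training data are hypothetical stand-ins:

    # Small Keras network predicting a binary outcome from normalized test scores.
    import numpy as np
    from tensorflow import keras

    n_tests = 5  # hypothetical number of component test scores

    model = keras.Sequential([
        keras.Input(shape=(n_tests,)),
        keras.layers.Dense(16, activation="relu"),
        keras.layers.Dense(1, activation="sigmoid"),  # probability of attainment
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])

    # Synthetic stand-in data: normalized scores and binary outcomes.
    X = np.random.rand(200, n_tests).astype("float32")
    y = (X.mean(axis=1) > 0.5).astype("float32")
    model.fit(X, y, epochs=5, batch_size=16, verbose=0)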

Confidential

Data Analyst

Responsibilities:

  • Developed business process models in Waterfall to document existing and future business processes.
  • Performed statistical data analysis and data visualization using Python and R.
  • Experience in data mining with data sets; knowledge of the MySQL and SQL Server relational database systems.
  • Generated various graphical capacity planning reports using Python packages such as NumPy and Matplotlib (a plotting sketch follows this list).
  • Applied Data Warehousing principles such as Fact Tables, Dimension Tables, and Dimensional Data Modelling - Star Schema and Snowflake Schema.
  • Created action filters, parameters and calculated sets for preparing dashboards and worksheets in Tableau.
  • Performed performance tuning and SQL query tuning; wrote and tuned PL/SQL code to maximize performance.
  • Worked with various Transformations like Joiner, Expression, Lookup, Aggregate, Filter, Update Strategy, Stored procedure and Normalizer.
  • Worked extensively on Data Profiling, Data cleansing, Data Mapping and Data Quality.
  • Delivered data solutions in report/presentation format according to customer specifications and timelines.
  • Involved in developing UML use case diagrams and class diagrams using MS Visio.
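
A minimal sketch of a graphical capacity planning report with NumPy and Matplotlib; the utilization series is synthetic:

    # Capacity planning report: utilization trend against a threshold.
    import numpy as np
    import matplotlib.pyplot as plt

    days = np.arange(1, 31)
    utilization = 60 + 0.8 * days + np.random.normal(0, 2, days.size)  # percent

    plt.plot(days, utilization, marker="o", label="observed utilization")
    plt.axhline(90, color="red", linestyle="--", label="capacity threshold")
    plt.xlabel("Day of month")
    plt.ylabel("Utilization (%)")
    plt.title("Monthly capacity utilization")
    plt.legend()
    plt.savefig("capacity_report.png")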
