Data Scientist Resume

Baltimore, MD

SUMMARY

  • 6+ years of relevant work experience as a Data Scientist, including deep expertise in statistical analysis, data mining and machine learning using R, Python and SAS.
  • Data-driven and highly analytical, with working knowledge of statistical modeling approaches and methodologies (clustering, regression analysis, hypothesis testing, decision trees, machine learning), business rules and the ever-evolving regulatory environment.
  • Professional working experience with machine learning algorithms such as Linear Regression, Logistic Regression, Naive Bayes, Decision Trees, K-Means Clustering and Association Rules.
  • Working experience in statistical analysis using R, SPSS, Matlab and Excel.
  • Experience with traditional analytics tools (Excel and Tableau).
  • Hands on experience in writing queries in SQL and R to Extract, Transform and Load (ETL) data from large datasets.
  • Experience analyzing online user behavior, conversion data (A/B testing), customer journeys and funnels.
  • Strong data analysis skills using business intelligence, SQL and/or MS Office tools.
  • Experience working in Agile/Scrum Methodologies to accelerate Software Development iteration.
  • Experience in applying predictive modeling and machine learning algorithms for analytical reports.
  • Strong analytical and problem-solving skills, with the ability to understand current business processes and implement efficient solutions to issues.
  • Experience using technology such as scripting, data cleansing tools and statistical software packages to work efficiently with datasets.
  • Strong understanding of how analytics supports a large organization, including the ability to articulate the linkage between business objectives, analytical approaches and findings, and business decisions.
  • Excellent analytical skills with demonstrated ability to solve problems.
  • Mastery of R programming and data processing; working knowledge of SPSS.
  • Ability to work with large transactional databases across multiple platforms (Teradata, Oracle, HDFS, SAS).
  • High Proficiency in Excel including complex data analysis and manipulation.
  • Good oral and written communication skills.
  • Strong interpersonal skills to successfully build long-term relationships with colleagues and business partners.
  • A results-driven individual with a passion for data/analytics who can work collaboratively with others to solve business problems that drive business growth.
  • Demonstrated leadership and self-direction. Demonstrated willingness to both teach others and learn new techniques.
  • Ability to work with managers and executives to understand business objectives and deliver according to business needs; a firm believer in teamwork.

TECHNICAL SKILLS

Programming & Scripting Languages: R, C, C++, Java, JCL, COBOL, HTML, CSS, JSP, JavaScript

Database: SQL, MySQL, MS Access, Oracle

Statistical Software: SPSS, R, SAS

Web Packages: Google Analytics, Adobe Test & Target, Web Trends

Development Tools: RStudio, Notepad++, PyCharm IDE, Jupyter, Spyder IDE

Writing Tools: LaTeX

Packages: dplyr, rjson, ggplot2, NumPy, SciPy, Pandas, matplotlib

Techniques: Machine learning, Regression, Clustering, Data mining.

Machine Learning: Naive Bayes, Decision trees, Regression models, Random Forests, Time-series, K-means

Business Analysis: Requirements Engineering, Business Process Modeling & Improvement, Financial Modeling

Operating Systems: Microsoft Windows 7/8/8.1/10/Vista/XP, Linux (Ubuntu)

PROFESSIONAL EXPERIENCE

Confidential, Baltimore, MD

Data Scientist

Responsibilities:

  • Work independently and collaboratively throughout the complete analytics project lifecycle including data extraction/preparation, design and implementation of scalable machine learning analysis and solutions, and documentation of results.
  • Performed statistical analysis to determine peak and off-peak time periods for ratemaking purposes.
  • Conducted analysis of customer data for the purposes of designing rates.
  • Identified root causes of problems and facilitated the implementation of cost-effective solutions with all levels of management.
  • Applied machine learning algorithms and statistical models such as decision trees, regression models, clustering and SVM to identify volume using R.
  • Worked on different data formats such as JSON, XML and performed machine learning algorithms in Python.
  • Hands-on experience implementing Naive Bayes; skilled in Random Forests, Decision Trees, Linear and Logistic Regression, SVM, Clustering and Principal Component Analysis.
  • Performed K-means clustering, Regression and Decision Trees in R.
  • Worked on Naïve Bayesian algorithms for Agent Fraud Detection using R.
  • Knowledge of A/B testing, ANOVA, multivariate analysis, association rules and text analysis using R.
  • Developed Regression Models based on data provided by the client.
  • Partner with technical and non-technical resources across the business to leverage their support and integrate our efforts.
  • Partner with infrastructure and platform teams to configure, tune tools, automate tasks and guide the evolution of internal big data ecosystem; serve as a bridge between data scientists and infrastructure/platform teams.
  • Worked on text analytics and Naive Bayes, creating word clouds and retrieving data from social networking platforms.
  • Pro-actively analyze data to uncover insights that increase business value and impact.
  • Support various business partners on a wide range of analytics projects, from ad-hoc requests to large-scale cross-functional engagements.
  • Prepared data visualization reports for management using R.
  • Approach analytical problems with an appropriate blend of statistical/mathematical rigor with practical business intuition.
  • Hold a point of view on the strengths and limitations of statistical models and analyses in various business contexts, and evaluate and effectively communicate the uncertainty in the results.
  • Applied machine learning algorithms and statistical models such as decision trees, regression models, SVM and clustering to identify volume, using the scikit-learn package in Python and Matlab.
  • Approach analysis in multiple ways in order to evaluate approaches and compare results.

Confidential

Data Scientist

Responsibilities:

  • Retrieved data using SQL/Hive queries and performed analysis, enhancements and documentation of the system.
  • Used R, SAS and SQL to manipulate data, and develop and validate quantitative models.
  • Provided support for Cloudera (CDH3).
  • Analyzed system failures, identified root causes and recommended courses of action. Documented system processes and procedures for future reference.
  • Monitored multiple Hadoop cluster environments using Ganglia and Nagios. Monitored workload, job performance and capacity planning using Cloudera Manager.
  • Participated in brainstorming sessions and proposed hypotheses, approaches and techniques.
  • Created and optimized processes in the data warehouse to import, retrieve and analyze data from the CyberLife database.
  • Analyzed data collected in stores (JCL jobs, stored procedures and queries) and provided reports to the business team, storing the data in Excel/SPSS/SAS files.
  • Configured backups and performed recovery from a NameNode failure.
  • Performed Analysis and Interpretation of the reports on various findings.
  • Performed exploratory data analysis on data provided by the client.
  • Prepared Test documents for zap before and after changes in Model, Test and Production regions.
  • Handled production support activities, including abend resolution, and compared seasonal trends in the data using Excel.
  • Used advanced Microsoft Excel functions such as pivot tables and VLOOKUP in order to analyze the data and prepare programs.
  • Successfully implemented migration of client’s requirement application from Test/DSS/Model regions to Production.
  • Prepared SQL scripts for ODBC and Teradata servers for analysis and modeling.
  • Assisted with trend analysis of financial time series data.
  • Performed various statistical tests to give the client a clear understanding of the results.
  • Implemented procedures for extracting Excel sheet data into the mainframe environment by connecting to the database using SQL.
  • Provided training to Beginners regarding the CyberLife system and other basics.
  • Provided complete support to all regions (Test/Model/System/Regression/Production).
  • Actively involved in Analysis, Development and Unit testing of the data.
  • Generated reports on more than 100 agent fraud investigation cases based on client requirements, verifying that the data was accurate.
  • Ensured complete delivery assurance for the project.

Confidential

Jr. Data Scientist

Responsibilities:

  • Converted various SQL statements into stored procedures thereby reducing the number of database accesses.
  • Responsible for architecting analytic frameworks for data mining, ETL, analysis, and reporting under the supervision of the Manager.
  • Prepared regular patient reports by collecting samples of Diagnosed Patients using Excel spreadsheets.
  • Cleaned data by analyzing and eliminating duplicate and inaccurate data (outliers) using R.
  • Worked in Agile Environment.
  • Ensured that the dataset had no missing values so that it could be used for further analysis.
  • Trained in the basics of data science and applied them to collecting and managing patient data in Excel/SPSS.
  • Assisted in performing statistical analysis of the data and storing the results in a database.
  • Worked with Quality Control Teams to develop Test Plan and Test Cases.
  • Involved in designing and implementing the data extraction (XML data stream) procedures.
  • Generated graphs and reports using ggplot in RStudio for analyzing models.
  • Generated results and evaluated prediction accuracy.
  • Prepared final documents and ensured delivery to the client before EOD.
