Data Scientist Resume
Cincinnati, OH
PROFESSIONAL SUMMARY:
- 7+ years of working experience as a Data Analyst and Data Scientist, with high proficiency in Predictive Modeling, Text Mining, and Machine Learning.
- Strong experience in the Software Development Life Cycle (SDLC), including Requirements Analysis, Design Specification, and Testing, in both Waterfall and Agile methodologies.
- Extensive experience in Text Analytics, developing Statistical Machine Learning and Data Mining solutions to various business problems and generating data visualizations using R, Python, and Tableau.
- Proficient in Python, R, C/C++, SQL, Tableau.
- Experience in univariate and multivariate analysis, model testing, problem analysis, model comparison and validation, ANOVA, and Regression Analysis.
- Expertise in writing complex SQL queries to obtain filtered data for analysis purposes.
- Working knowledge of implementing tree-based models such as Boosting and Random Forest.
- Experience in using model pipelines to automate tasks and move models into production quickly.
- Skilled in System Analysis, Dimensional data Modeling, Database Design and implementing RDBMS specific features.
- Experience in using Tableau to create dashboards and tell quality data stories.
- Worked with various Python libraries: NumPy and SciPy for mathematical calculations, Pandas for data preprocessing/wrangling, Matplotlib and Seaborn for data visualization, scikit-learn for machine learning and deep learning, and NLTK for NLP.
- Proficient in Statistical Modeling and Machine Learning techniques (Linear and Logistic Regression, Decision Trees, Random Forest, SVM, K-Nearest Neighbors, Bayesian methods, XGBoost) applied to Forecasting/Predictive Analytics, Segmentation methodologies, Regression-based models, Hypothesis testing, Factor Analysis/PCA, and Ensembles (see the model-comparison sketch after this summary).
- Experience in Text Mining and good knowledge of NLP components such as Natural Language Understanding (NLU) and Natural Language Generation (NLG).
- Knowledge of Natural Language Processing (NLP) techniques such as Tokenization, Stemming, Lemmatization, Count Vectorization, and TF-IDF.
- Strong software application skills (MS Excel, Access, Word, PowerPoint, Project)
- Deep understanding of statistical analysis, including p-values, A/B testing, Hypothesis testing, the Central Limit Theorem, Bayes' Theorem, and probability distributions.
- Highly skilled in using Hadoop, Spark, and Hive for analysis, extraction, and summarization of data in the infrastructure.
- Experience in writing SQL queries, Stored procedures, Functions and Triggers by using PL/SQL.
- Expertise in Oracle and MySQL technologies. Good exposure to planning and executing all phases of the software development life cycle, including analysis, design, development, and testing.
- Hands-on experience in implementing Forecasting/Predictive Analytics, Segmentation methodologies, Regression-based models, and Factor Analysis.
- Knowledge in Cloud services such as Microsoft Azure and Amazon AWS.
- Strong problem-solving skills, good communication skills, and a good team player.
- Practiced in clarifying business requirements, performing gap analysis between goals and existing procedures/skillsets, and designing process and system improvements to increase productivity and reduce costs.
- Strong understanding of Agile and Scrum Software Development Life Cycle Methodologies.
- Involved in issue resolution and Root Cause Analysis.
- Experience in working with different operating systems: Windows, UNIX, and Linux.
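Below is a minimal, illustrative sketch of the kind of model comparison referenced in this summary. It uses scikit-learn on a synthetic dataset, so the data and scores are hypothetical; XGBoost is omitted to keep the sketch dependency-free, and GaussianNB stands in for the Bayesian methods listed above.

```python
# Hedged sketch: cross-validated comparison of several common classifiers
# on synthetic data (a stand-in for any real project dataset).
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=42),
    "SVM": SVC(),
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "Naive Bayes": GaussianNB(),
}

# 5-fold cross-validation gives a more stable estimate than a single split
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    print(f"{name}: {scores.mean():.3f} (+/- {scores.std():.3f})")
```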
TECHNICAL SKILLS:
Programming Languages: SQL, T-SQL, PL/SQL, Java, C, C++, XML, HTML, MATLAB, DAX, Python, R
Statistical Analysis: R, Python, MATLAB, Minitab, Jupyter
RDBMS: Oracle, SQL Server, MS Access, Teradata
Data Modeling: ERwin, TOAD, MS Visio
DWH / BI Tools: Microsoft Power BI, Tableau, SSIS, SSRS, SSAS, Visual Studio, R-Studio
Big Data: Hadoop, Hive, MapReduce, Sqoop, Impala
IDEs: NetBeans, Eclipse, PyCharm, PyScripter, PyStudio
Operating System: LINUX, Windows
Methodologies: Agile, RAD, JAD, RUP, UML, System Development Life Cycle (SDLC), Kimball and Inmon data warehousing approaches, Waterfall Model
PROFESSIONAL EXPERIENCE:
Confidential - Cincinnati, OH
Data Scientist
Responsibilities:
- Participated in all phases of the project life cycle, including data collection, data mining, data cleaning, model development, validation, and report creation.
- Responsible for reporting findings, using gathered metrics to infer and draw logical conclusions about past and future behavior.
- Performed Time Series Analysis, Multinomial Logistic Regression, Random Forest, Decision Tree, and SVM modeling.
- Used Principal Component Analysis and Factor Analysis in feature engineering to analyze high-dimensional data in Python (see the PCA sketch at the end of this role).
- Worked on classification/scripting of multiple attribute models, applying SVM and Regular Expressions to product features such as title and description, and predicting product attribute values using Python.
- Used R machine learning libraries to build and evaluate different models.
- Implemented a rule-based expert system from the results of exploratory analysis and information gathered from people in different departments.
- Collected data needs and requirements by interacting with other departments.
- Created various types of data visualizations using Python and Tableau.
- Communicated the results to the operations team to support better decision-making.
- Developed new and effective analytics algorithms and wrote key pieces of mission-critical source code.
- Extracted patterns in the structured and unstructured data set and displayed them with interactive charts using ggplot2 and ggiraph packages in R.
- Built initial models using supervised classification techniques such as K-Nearest Neighbors (KNN), Logistic Regression, and Random Forests.
- Used a variety of NLP methods for information extraction, topic modeling, parsing, and relationship extraction.
- Built machine learning text classification models for NLP using Python.
- Applied NLP (text mining and analysis, topic modeling, n-grams, and emotion analysis) to extract clinical data from text.
- Used the NLTK package in Python for Natural Language Processing (NLP) tasks (see the NLTK sketch at the end of this role).
- Used R and Python programming to improve the models; upgraded the entire model suite to improve the product.
- Performed data transformations for rescaling and normalizing variables.
- Used packages such as dplyr, tidyr, and ggplot2 in RStudio for data visualization, generating scatter plots and high-low graphs to identify relations between different variables.
- Implemented advanced machine learning algorithms, including regression trees and kernel PCA, in Python and R and in other tools and languages as needed.
- Designed a machine learning pipeline to predict and prescribe, and implemented the machine learning scenario for the given data problem.
Environment: Python, R/RStudio, Tableau, PL/SQL
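The PCA bullet above refers to dimensionality reduction for feature engineering; the following minimal sketch shows one common way to do it with scikit-learn. The input matrix is synthetic and the 95% variance threshold is an illustrative assumption, not a value from the project.

```python
# Hedged sketch: PCA-based feature engineering on high-dimensional data.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 60))  # stand-in for a high-dimensional dataset

# PCA is sensitive to feature scale, so standardize first
X_scaled = StandardScaler().fit_transform(X)

# Keep as many components as needed to explain 95% of the variance
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X_scaled)

print(f"Reduced from {X.shape[1]} to {pca.n_components_} features")
print("Top explained-variance ratios:", pca.explained_variance_ratio_[:5])
```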
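Several bullets above mention NLTK-based NLP, so here is a minimal sketch of a typical preprocessing pass (tokenization, stopword removal, stemming, lemmatization). The sample sentence is invented, and the one-time corpus downloads are noted as assumptions.

```python
# Hedged sketch: basic NLP preprocessing with NLTK on an invented sentence.
import nltk
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer

# One-time corpus downloads (uncomment on first run):
# nltk.download("punkt"); nltk.download("stopwords"); nltk.download("wordnet")

text = "Patients reported mild headaches after receiving the second dose."

# Lowercase word tokens, keeping alphabetic tokens only
tokens = [t.lower() for t in word_tokenize(text) if t.isalpha()]

# Drop common English stopwords
stop_words = set(stopwords.words("english"))
tokens = [t for t in tokens if t not in stop_words]

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()
print("Stemmed:   ", [stemmer.stem(t) for t in tokens])
print("Lemmatized:", [lemmatizer.lemmatize(t) for t in tokens])
```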
Confidential - Frederick, MD
Data Scientist
Responsibilities:
- Evaluated data analytics opportunities, such as Fraud Detection, to improve the efficiency of the claims handling process.
- Utilized various data analysis and data visualization tools to accomplish data analysis, report design and report delivery.
- Created statistical models based on researched information to provide conclusions that guide the company and the industry into the future.
- Handled missing data after import and encoded categorical data when needed.
- Split the data into training and test sets, scaling both when necessary (see the preprocessing sketch at the end of this role).
- Creatively communicated and presented models to business customers and executives, utilizing a variety of formats and visualization methodologies.
- Modeled the impact of marketing tactics on sales and then forecast the impact of future sets of tactics.
- Developed SQL code to extract data from various databases
- Used R and Python for Exploratory Data Analysis and hypothesis tests to compare and identify the effectiveness of creative campaigns.
- Implemented different machine learning models, including regression, classification, and clustering.
- Used Python, R, and SQL to create statistical algorithms involving Linear Regression, Logistic Regression, Random Forest, and Decision Trees for estimating risks.
- Developed statistical models to forecast inventory and procurement cycles (see the forecasting sketch at the end of this role).
- Created and designed reports that use gathered metrics to infer and draw logical conclusions about past and future behavior.
- Implemented machine learning algorithms using scikit-learn.
- Worked with a range of proprietary, industry-standard, and open-source data stores to assemble, organize, and analyze data.
- Produced visualizations, summary reports, and presentations using R and Tableau.
Environment: R, Python, Tableau, SQL Server
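The missing-data, encoding, splitting, and scaling bullets above describe a standard preprocessing flow; the sketch below shows one way it might look with pandas and scikit-learn. The DataFrame, column names, and fraud label are hypothetical.

```python
# Hedged sketch: impute missing values, encode categoricals, split, and scale.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical claims data with a gap and a categorical column
df = pd.DataFrame({
    "claim_amount": [1200.0, None, 860.0, 4300.0, 975.0, 2100.0],
    "region": ["east", "west", "east", "south", "west", "south"],
    "is_fraud": [0, 0, 0, 1, 0, 1],
})

# Fill numeric gaps with the median, then one-hot encode the categorical column
df["claim_amount"] = df["claim_amount"].fillna(df["claim_amount"].median())
X = pd.get_dummies(df.drop(columns="is_fraud"), columns=["region"])
y = df["is_fraud"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Fit the scaler on the training set only, then apply it to both sets
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
```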
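For the inventory/procurement forecasting bullet, here is a minimal sketch using Holt-Winters exponential smoothing from statsmodels; the monthly demand series is invented, and the additive trend/seasonality settings are illustrative modeling choices rather than project parameters.

```python
# Hedged sketch: forecasting monthly inventory demand with Holt-Winters.
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Invented monthly demand: base level, upward trend, small Nov/Dec bump
idx = pd.date_range("2015-01-01", periods=36, freq="MS")
demand = pd.Series(
    [100 + i + (10 if i % 12 in (10, 11) else 0) for i in range(36)], index=idx
)

model = ExponentialSmoothing(
    demand, trend="add", seasonal="add", seasonal_periods=12
).fit()

# Forecast six months ahead, roughly one procurement cycle
print(model.forecast(6))
```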
Confidential - Lewisville, TX
Sr. Data Analyst
Responsibilities:
- Worked closely with the Data Governance Office team in assessing the source systems for project Deliverables.
- Used T-SQL queries to pull data from disparate systems and the Data Warehouse in different environments.
- Used Data Quality validation techniques to validate Critical Data Elements (CDEs) and identified various anomalies.
- Presented DQ analysis reports and scorecards on all the validated data elements to the business teams and stakeholders.
- Involved in defining source-to-target data mappings, business rules, and data definitions.
- Extensively used open-source tools - RStudio (R) and Spyder (Python) - for statistical analysis and building machine learning models.
- Interacted with the Business teams and Project Managers to clearly articulate anomalies, issues, and findings during data validation.
- Performed Data Validation/Data Reconciliation between disparate source and target systems (Salesforce, Cisco-UIC, Cognos, Data Warehouse) for various projects.
- Extracted data from different databases per business requirements using SQL Server Management Studio (SSMS).
- Wrote complex SQL queries to validate data against different kinds of reports generated by Cognos.
- Extensively used MS Excel (Pivot Tables, VLOOKUP) for data validation.
- Interacted with the ETL and BI teams to understand and support various ongoing projects.
- Generated weekly and monthly reports for various business users according to business requirements; manipulated/mined data from database tables (Redshift, Oracle, Data Warehouse).
- Created automated metrics using complex databases.
- Provided analytical network support to improve quality and standardize work results.
- Used version control (Git).
- Created data pipelines using big data technologies such as Hadoop and Spark (see the Spark sketch at the end of this role).
- Created statistical models in distributed and standalone environments to build various diagnostic, predictive, and prescriptive solutions.
- Utilized a broad variety of statistical packages and platforms, including SAS, R, MLlib, Hadoop, Spark, MapReduce, Pig, and others.
- Interfaced with other technology teams to extract, transform, and load (ETL) data from a wide variety of data sources.
- Provided input and recommendations on technical issues to BI Engineers, Business and Data Analysts, and Data Scientists.
Environment: AWS, MS Azure, Cassandra, Spark, HDFS, Hive, Pig, Linux, SPSS, MySQL, Eclipse, PL/SQL, SQL connector, Tableau
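The Hadoop/Spark pipeline bullet above can be illustrated with a small PySpark sketch; the file paths and column names here are hypothetical stand-ins for the project's actual sources.

```python
# Hedged sketch: a simple extract-transform-load pipeline in PySpark.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("claims_pipeline").getOrCreate()

# Extract: read raw data from a hypothetical landing zone
raw = spark.read.csv("/data/landing/claims.csv", header=True, inferSchema=True)

# Transform: deduplicate, filter bad rows, derive a weekly grain
clean = (
    raw.dropDuplicates()
       .filter(F.col("claim_amount") > 0)
       .withColumn("week", F.weekofyear(F.col("claim_date")))
)
weekly = clean.groupBy("week").agg(
    F.count("*").alias("claims"),
    F.sum("claim_amount").alias("total_amount"),
)

# Load: write the aggregate out as Parquet for downstream reporting
weekly.write.mode("overwrite").parquet("/data/marts/claims_weekly")
```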
Confidential - San Antonio, TX
Data Analyst
Responsibilities:
- Gathered requirements by interacting closely with business users and multiple technical teams to design and develop workflows for new functional pieces.
- Collaborated with various business stakeholders to create the Business Requirement Document (BRD) and translated gathered high-level requirements into a Functional Requirement Document (FRD) to assist implementation-side SMEs and developers, along with data flow diagrams, user stories, and use cases.
- Part of a Scrum Agile team.
- Experienced in SQL joins, subqueries, tracing, and performance tuning for faster-running queries (see the T-SQL sketch at the end of this role).
- Extensively used joins and subqueries for complex queries involving multiple tables from different databases.
- Tuned stored procedures and functions to optimize query performance.
- Successfully implemented indexes on tables for optimum performance.
- Developed complex stored procedures using T-SQL to generate ad-hoc reports within SQL Server Reporting Services.
- Strong analytical and problem-solving skills coupled with interpersonal and leadership skills.
- Interacted with clients to gather requirements and assist them with immediate workarounds for application issues.
- Developed detailed test scenarios as documented in business requirements documents, assisted the test team with UAT.
- Used Tableau in real time for analytical purposes.
Environment: MySQL, MS PowerPoint, MS Access, T-SQL, DTS, SSIS, SSRS, SSAS, ETL, MDM, Teradata
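The join/subquery and T-SQL bullets above can be illustrated with a short sketch that runs a T-SQL query from Python via pyodbc; the connection string, tables, and columns are hypothetical.

```python
# Hedged sketch: a T-SQL join with a subquery, executed through pyodbc.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myserver;DATABASE=sales;Trusted_Connection=yes;"
)

# Customers whose orders exceed the average order total (hypothetical schema)
query = """
SELECT c.customer_id, c.name, o.order_total
FROM dbo.customers AS c
JOIN dbo.orders AS o ON o.customer_id = c.customer_id
WHERE o.order_total > (SELECT AVG(order_total) FROM dbo.orders)
ORDER BY o.order_total DESC;
"""

cursor = conn.cursor()
for row in cursor.execute(query):
    print(row.customer_id, row.name, row.order_total)

cursor.close()
conn.close()
```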
Confidential
Analyst
Responsibilities:
- Assisted Product Manager in documenting business / functional requirements and defining product features
- Performed various SQL/database activities to support data review, data mapping, data quality review, data validation, and report generation
- Participated in the development of end-to-end test plan with regression strategy
- Developed QA test plans, test conditions, test cases, and test scripts to ensure complete and adequate testing, as well as coordinating/conducting user acceptance testing (UAT)
- Monitored testing progress and performance of testing including open defects
- Performed functional, UI, performance and back-end testing, in conjunction with the QA team
- Involved in analyzing functional specifications and created manual test plans and test cases
- Collaborated with the development team to solve the problems encountered in the test scenario runs
- Documented and communicated test results to the relevant stakeholders including Product Manager.
Environment: UNIX, MS Visio, MS Project, MS SharePoint, HTML, Windows, Oracle, PL/SQL