Data Scientist Resume
Celebration Place, Florida
PROFESSIONAL SUMMARY:
- Over 12 years of experience in the design, development, and implementation of analytical solutions and data analysis, with deep expertise in the banking, finance, and marketing domains.
- Professionally qualified Data Scientist with over 5.5 years of experience in Data Science and Analytics in the Banking, Insurance, and Marketing domains.
- Rich experience managing the entire data science project life cycle, involved in all phases including data extraction, data cleaning, statistical modeling, and data visualization, with large datasets of structured and unstructured data.
- Hands-on experience with Machine Learning algorithms in Python, such as Linear Regression, SVM, KNN, LDA/QDA, Naive Bayes, Random Forest, K-means clustering, Hierarchical clustering, PCA, Feature Selection, and NLP.
- Professional working experience with Python 2.X/3.X libraries including Matplotlib, NumPy, SciPy, Pandas, Beautiful Soup, Seaborn, scikit-learn, and NLTK for analysis.
- Experience implementing data analysis with various analytic tools such as Anaconda 4.0/2.X (Jupyter Notebook, Spyder), R 2.15/3.0 (reshape, ggplot2), and SAS 9.4.
- High-level experience in Base SAS, SAS/Macros, SAS/SQL, SAS/STAT, SAS/Connect, SAS/Access, SAS/GRAPH, SAS/ETS, SAS/DI STUDIO, SAS/ODS, SAS EG.
- Working experience with SAS tools such as SAS Enterprise Guide and SAS Enterprise Miner.
- Expertise in ETL processing on various data sources such as Teradata, Oracle, DB2, SQL Server, MS Excel, and text files. Experienced in using SAS ODS to create output data sets and RTF, HTML, PDF, and CSV files. Strong problem analysis and resolution skills, with the ability to work in multi-platform environments such as Windows, Unix, and MVS.
- Proficient in importing different types of external files (Excel, CSV, txt, etc.) into a SAS library and exporting data using SAS DATA steps.
- Involved in all stages of SDLC (Software Development Life Cycle).
- Thorough knowledge of database programming languages such as SQL.
PROFESSIONAL EXPERIENCE
Confidential, Celebration Place, Florida
Data Scientist
Responsibilities:
- Performed data profiling to learn about customer behavior using various features such as household size and age category.
- Applied various machine learning algorithms and statistical models, such as decision trees, random forests, regression models, SVM, and clustering, to identify volume using the scikit-learn package in Python.
- Used the K-Means clustering technique to identify outliers and classify unlabeled data.
- Evaluated models using cross-validation.
- Ensured that the model had a low false positive rate.
- Provided complex reports, including summaries, charts, and graphs to interpret findings to team and stakeholders.
- Identified process improvements that significantly reduce workloads or improve quality.
- Performed ETL processing from SAS into a Teradata database using Base SAS.
- Addressed overfitting by implementing regularization methods such as L1 and L2.
- Used Principal Component Analysis in feature engineering to analyze high-dimensional data.
- Created and designed reports that use gathered metrics to infer and draw logical conclusions about past and future behavior.
- Performed Multinomial Logistic Regression, Random Forest, Decision Tree, and SVM modeling.
- Created high-level and low-level design documents from business requirements.
- Delivered all the business requests with zero defects.
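As an illustration of the K-Means outlier step above, here is a minimal scikit-learn sketch; the synthetic data, cluster count, and percentile threshold are hypothetical stand-ins, since the original dataset is confidential:

```python
# Hypothetical sketch: flagging outliers by distance to the nearest
# K-Means cluster centre (synthetic data stands in for the real features).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(0, 1, size=(200, 2))  # stand-in for household features
X[:5] += 8                           # plant a few far-away points

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
# Distance from each point to its assigned cluster centre.
dist = np.linalg.norm(X - km.cluster_centers_[km.labels_], axis=1)
# Points beyond a chosen percentile of that distance are outlier candidates.
threshold = np.percentile(dist, 97.5)
outliers = np.where(dist > threshold)[0]
```

The percentile cutoff is a tunable design choice; in practice it would be set from the business tolerance for false positives.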
Environment: Python 3.X, Teradata 14, Hive 0.11, HDFS, Spark 1.4, Git 2.X, SAS Enterprise Guide 9.4, Windows, EditPlus, PuTTY
Confidential, Celebration Place, Florida
Data Scientist
Responsibilities:
- Participated in determining the criteria for Small Business Marketing campaigns/tactics and developed machine learning algorithms to identify the success rate of different tactics.
- Worked with end users to define analysis requirements and performed statistical analysis.
- Used Python 3.X and Spark 1.4 (PySpark, MLlib) to implement different machine learning algorithms including Generalized Linear Model, SVM, Random Forest, Boosting.
- Evaluated and optimized performance of models, tuned parameters with K-Fold Cross Validation.
- Developed an automated system for updating Canadian compliance indicators for marketing.
- Identified risk level and eligibility of new card applicants with Machine Learning algorithms.
- Provided complex reports, including summaries, charts, and graphs to interpret findings to team and stakeholders.
- Identified process improvements that significantly reduce workloads or improve quality.
- Performed ETL processing from SAS into a Teradata database using Base SAS.
- Utilized SQL and HiveQL to query and manipulate data from a variety of data sources, including Oracle 10g and HDFS, while maintaining data integrity.
- Worked on data cleaning, data preparation, and feature engineering in Python 3.X using NumPy, SciPy, Pandas, Matplotlib, Seaborn, and scikit-learn.
- Worked with version control tools such as Git 2.X to track contributions from different team members and record the project at different points in time.
- Created high-level and low-level design documents from business requirements.
- Delivered all the business requests with zero defects.
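The K-Fold parameter tuning described above can be sketched with scikit-learn's GridSearchCV; the estimator, grid, and synthetic data are illustrative assumptions, not the original confidential code:

```python
# Illustrative sketch: hyperparameter tuning with k-fold cross-validation.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, KFold

# Synthetic stand-in for the marketing-tactics dataset.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=KFold(n_splits=5, shuffle=True, random_state=0),  # 5-fold CV
    scoring="accuracy",
)
search.fit(X, y)
best_model = search.best_estimator_  # refit on all data with best params
```

GridSearchCV refits the best configuration on the full training set, so `best_model` is ready for held-out evaluation.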
Environment: Python 3.X, Teradata 14, Hive 0.11, HDFS, Spark 1.4, Git 2.X, SAS Enterprise Guide 9.4, Windows, EditPlus, PuTTY
Confidential, Columbus, OH
Sr. Data Analyst/Scientist
Responsibilities:
- Worked collaboratively to develop scientific methods to predict claims severity.
- Collected data from different data sources with Big Data tools, including Spark 1.6 and Hive 0.11.
- Participated in all phases of analysis, including data extraction, data cleaning, and feature engineering.
- Generated statistical models in production for various business needs using Python and R.
- Models implemented include Linear Regression, Random Forest, and XGBoost.
- Implemented ensemble models (bagging and boosting) to enhance model efficiency.
- Designed rich data visualizations with Tableau 9.2 to present data in human-readable form.
- Used PySpark and MLlib to implement machine learning algorithms in a big data environment.
- Evaluated and improved model performance with cross-validation.
- Used Git 2.X for version control and team coordination.
- Participated in determining the criteria for Small Business Marketing campaigns and developed SAS codes to meet those criteria.
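A minimal sketch of the bagging/boosting comparison above, assuming scikit-learn ensembles and synthetic regression data in place of the claims dataset:

```python
# Hedged sketch: comparing a bagging and a boosting ensemble by
# cross-validated R^2 on synthetic data (stand-in for claims severity).
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor, GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)

bagging = BaggingRegressor(n_estimators=50, random_state=0)
boosting = GradientBoostingRegressor(n_estimators=100, random_state=0)

# Mean R^2 across 5 folds for each ensemble.
bag_r2 = cross_val_score(bagging, X, y, cv=5, scoring="r2").mean()
boost_r2 = cross_val_score(boosting, X, y, cv=5, scoring="r2").mean()
```

Bagging averages independently trained trees to cut variance, while boosting fits trees sequentially to the residuals; which wins depends on the data.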
Confidential, Columbus, OH
Data Analyst
Responsibilities:
- Developed SAS programs to create a customer mailing list for direct mailing and telemarketing. Used SAS/ODS to produce RTF and HTML reports.
- Implemented logistic regression to classify customers and predict sales of a new product.
- Implemented Linear regression to predict the
- Worked with large customer response and sales datasets and presented results to users. Simultaneously tested the integrity of the mailings and addresses in the database.
- Used a proprietary statistics tool to test how marketing campaigns fared. This included measuring the sales lift attributable to a specific marketing campaign, comparing similar campaigns across multiple tracking windows, etc.
- Developed an automated system for CLAIM REPORTING (HCLMR). Extracted data from Oracle tables and MVS files.
- Maintained and enhanced existing SAS reporting programs for insurance campaigns. Participated in code reviews to verify that output met expectations and efficiency targets.
- Designed standardized and automated ad-hoc reports for new product development, online marketing, direct mail, reward-based marketing, etc.
- Designed and developed code with SAS/EG to support ad-hoc requests.
- Automated the standard reporting with SAS BI and Microsoft Add-in.
- Extensively used PROC PRINT, PROC REPORT, and PROC TABULATE for reporting, and communicated analysis results to the marketing team. Also used PROC SQL (queries), PROC APPEND, MERGE, macros, and Import and Export. Used the SQL pass-through facility to create and update Teradata and Oracle tables.
- Developed queries against existing Oracle and SQL Server databases to provide ad-hoc reports using SAS/SQL for the marketing department. Coordinated analyses involving customer information, campaign response data, and demographics.
- Created stored processes for the business to generate different ad-hoc reports.
- Created information maps utilized by SAS Web Report Studio to generate static reports. Performed data validation and edit checks, and developed macros, charts, graphs, and pivot tables.
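The customer-classification step above was done in SAS logistic regression; a rough Python analogue, with synthetic data and hypothetical features standing in for the real customer records, might look like:

```python
# Hypothetical Python analogue of the SAS logistic-regression step:
# scoring customers by likelihood of buying a new product.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for customer response data.
X, y = make_classification(n_samples=500, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
propensity = clf.predict_proba(X_test)[:, 1]  # P(purchase) per customer
accuracy = clf.score(X_test, y_test)
```

The propensity scores, rather than hard class labels, are what typically drive a mailing-list cutoff.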
Environment: SAS/BASE, SAS/MACROS, SAS/SQL, SAS/GRAPH, SAS/ODS, Excel, Access, UNIX (SunOS), Windows, Exceed, PuTTy, WinSCP, Oracle, SQL Server, SAS Enterprise Guide, SAS Add-In to MS Office, SAS Information Map Studio, SAS Stored Process, and SAS Web Report Studio.
Confidential, Minneapolis, MN
SAS Analyst
Responsibilities
- Performed post-ABEND analysis and suggested permanent fixes.
- Provided 24/7 production support for major applications.
- Understood business and functional specifications for maintenance and enhancement requests from end users/business users.
- Analyzed programs and documentation.
- Coded in COBOL, SAS, JCL, Assembler, and DB2, through implementation, for requests to add new features to existing functionality.
- Set up SFTP and FTP transfers to send files to actuaries.
- Reviewed unit test cases and unit test results.
- Extracted, transformed, and loaded data using SAS.
- Extensively used SAS for analyses involving logistic regression.
- Presented process flow diagrams in MS Visio.
- Participated in the design and development of databases and data models for small applications, according to user specifications.
- Extensively used PROC PRINT, PROC REPORT and PROC TABULATE for reporting.
- Merged datasets to provide users with data in the required form.
- Interacted with other team members and the lead to discuss developments needed in the code to improve functionality and effectiveness.
- Created ad-hoc reports using COBOL, SAS, and DB2 per user requirements.
- Submitted SQL queries and SAS code in the UNIX shell to validate data. Used crontab and shell scripting to start SAS processes, control jobs, and redirect input and output.
Environment: COBOL, SAS, DB2, JCL, VSAM, FILE-AID, SORT, TSO, ISPF, z/OS, HP Service Center, Lotus Notes, HP Quality Center, Clarity Timesheet, Expeditor, Debug Tool, Changeman, Mobius (Document Direct), ESP
Confidential
SAS Programmer
Responsibilities:
- Created new or modified existing SAS programs to load data from the source and create study-specific datasets, which are used as source datasets for report-generating programs.
- Cleaned, validated, and managed various Bank credit card departments’ datasets, and handled missing values.
- Provided pertinent financial and statistical data by interacting with statisticians, processors and IT staff.
- Conducted statistical analyses using SAS/STAT, including PROC MIXED, PROC FREQ, PROC PHREG, etc.
- Developed SAS Macros to generate graphs and reports based on combined datasets, and performed statistical analyses with output delivery procedures.
- Developed numerous ad hoc SAS programs to create summaries and listings.
- Created SAS macros and SAS graphics, and customized existing programs using SAS macros per the statisticians' requirements.
- Analyzed different activity metrics and generated reports and graphical representation of these sales for comparison of different drugs using SAS/GRAPH and SAS/STAT.
- Used PROC GPLOT to create graphs in SAS.
- Generated interpretive charts, tables, and reports in accordance with regulations, including patient demographics, discontinuation, and adverse events.
- Participated in producing integrated summaries of safety and efficacy.
- Collaborated with statisticians and medical researchers in preparing formal reports.
- Generated customized SAS reports using DATA _NULL_ and PROC REPORT techniques.
- Created TEMPLATES to modify the appearance of the Displayed ODS Tables using PROC TEMPLATE.
- Extensively used PROC FREQ, PROC TABULATE, PROC MEANS, PROC SUMMARY, PROC CONTENTS, PROC COMPARE and PROC UNIVARIATE.
Environment: SAS/ACCESS, SAS/BASE, SAS/STAT, SAS/ODS, SAS/SQL, SAS/GRAPH, MS-Excel.