
Data Scientist Resume


Seattle, WA

SUMMARY

  • 7+ years of experience as a Senior Advanced SAS Analytics and Statistical Modeler, building and supporting models in the banking/finance, retail, insurance, and entertainment industries.
  • Data preparation for statistical modeling, including data cleansing, descriptive statistics, missing data analysis, data validation, and preliminary data reporting.
  • Expertise in building Analytical and Statistical Models for database marketing, risk management groups using SAS, R, Python.
  • Strong expertise in analytical and quantitative techniques including predictive modeling and multivariate analysis.
  • Experience working with large data sets across digital, social and email channels to derive actionable insights.
  • Linux Korn shell scripting for batch execution of SAS programs.
  • Skilled in using Big Data technologies like Hadoop, HDFS and Hive.
  • Sound statistical knowledge to infer valid conclusions from volumes of data.
  • Extensive experience in preparation of reports, tables, listing and graphs.
  • Proficiency in Time Series and Forecasting techniques using SAS Enterprise Miner.
  • Strong knowledge of classification models and advanced statistical, mathematical modeling, and machine learning techniques.
  • Expertise in automation of SAS and R processes, models and reports.
  • Experience in quantitative analysis and research, data mining, aggregation and validation, model development, scoring and validation of predictive models including financial time series models using SAS and R.
  • Experienced in producing thorough documentation of business requirements.
  • Strong experience with databases like Oracle 9i/8i, MS SQL Server 2008, DB2, and MS Access.
  • Extensive experience handling large Teradata data for data cleansing, data profiling, and data scrubbing.
  • Involved in coding and pulling data from various Oracle tables using unions and joins.
  • Extensive knowledge of advanced procedures including multivariate analysis, regression, ANOVA, graphs, and plots
  • Hands on experience in SAS, R and Hadoop/ HIVE programming for extracting data from Flat files, Excel spreadsheets and external RDBMS (ORACLE) tables
  • Possess a strong ability to adapt and learn new technologies and new business lines rapidly
  • Effective team player with strong communication & interpersonal skills.
  • Writes and runs complex SQL scripts on ODBC, Netezza and Teradata database servers to extract records on terabyte and petabyte scale for analysis and modeling.
  • Utilizes advanced supervised and unsupervised machine learning algorithms and statistical techniques such as k-means clustering, principal component analysis (PCA), regression, and visualization techniques for pattern recognition.
  • Conducts coding and manipulation of qualitative variables, complex hypothesis testing, and statistical analysis through methodologies such as experimental designs (ANOVA with or without replication, factorial designs, ANCOVA, etc.), MANOVA, discriminant and factor analysis, and statistical inference in R and SAS.
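The k-means clustering named above can be illustrated with a minimal sketch (hypothetical two-feature customer data, invented for illustration; production work would use SAS, R, or a library implementation):

```python
def kmeans(points, k, iters=20):
    """Minimal Lloyd's algorithm over tuples of floats (assumes k >= 2)."""
    # naive deterministic init: k points spread across the input order
    centers = [points[i * (len(points) - 1) // (k - 1)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # assignment step: each point goes to its nearest center
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        # update step: move each center to the mean of its cluster
        for j, cl in enumerate(clusters):
            if cl:
                centers[j] = tuple(sum(col) / len(cl) for col in zip(*cl))
    return centers, clusters

# hypothetical (spend, visits) pairs forming two obvious customer groups
data = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (8.0, 8.2), (7.9, 8.1), (8.3, 7.7)]
centers, clusters = kmeans(data, 2)
```

The two recovered centers land near (1, 1) and (8, 8), the means of the two groups.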

TECHNICAL SKILLS

Statistical software: Base SAS, SAS/SQL, SAS/Macros, SAS/ODS, SAS/CONNECT, SAS Enterprise Miner, SAS Enterprise Guide, Interactive Matrix Language (IML)

Languages: SAS, SQL, Python, R

Tools: Hadoop, HDFS, Hive, FICO Blaze Advisor, XML, SQL Advantage, Tableau, QlikView, VBScript, MicroStrategy, Cognos, WinSCP, PuTTY

Databases: MS Access, DB2, Confidential SQL Server, Teradata, Oracle

Software Packages: Confidential Office 2010 (Word, Excel, PowerPoint)

PROFESSIONAL EXPERIENCE

Confidential, Seattle, WA

Data Scientist

Responsibilities:

  • Develop methodologies and approaches to investigate relationships between media, customer engagement & sales/purchase outcomes using R & SAS
  • Develop statistical models on marketing mix, customer lifetime value, prediction of product sales, understanding of customer purchase intent, etc.
  • Develop statistical models and machine learning techniques in R language on the email and social data
  • Import and integrate data from multiple data warehouse infrastructures to address ad hoc queries, viewing data as tables and joins on the Hadoop Distributed File System (HDFS) using HiveQL
  • Develop Python and/or R code to execute the intended models and identify the outcomes
  • Develop closed-loop approaches to develop and test recommendations
  • Create analytical templates (metrics, segments, approaches, formula, framework)
  • Develop insights and recommendations based on model outputs
  • Advance the discipline of marketing/media mix modeling
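At its simplest, the sales-prediction work above reduces to a regression of sales on media spend. A single-channel sketch with hypothetical weekly data (real marketing-mix models add multiple channels plus adstock and saturation terms):

```python
def ols_fit(x, y):
    """Single-variable ordinary least squares: returns (intercept, slope) for y ~ a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    a = my - b * mx
    return a, b

# hypothetical weekly observations: media spend (k$) vs. unit sales (k units)
spend = [10, 20, 30, 40, 50]
sales = [25, 44, 66, 85, 105]   # roughly sales = 5 + 2*spend, with noise
a, b = ols_fit(spend, sales)
```

The fitted slope `b` is the estimated incremental sales per unit of spend, which is the quantity a mix model ultimately reports per channel.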

Confidential, NYC

Data Scientist

Responsibilities:

  • Designing, developing, testing, deploying, and maintaining analytic applications and data structures needed for modeling and forecasting
  • Leveraging efficient coding techniques to optimize / improve performance
  • Implement statistical, economic, econometric, or other mathematical models for Chase’s CCAR team (consumer bank)
  • Built probability of default models in R and SAS languages
  • Use Unix operating system with Korn shell scripting
  • Used Apache Hadoop to pull and join data from the database, improving processing efficiency
  • Work with Enterprise Data Warehouse (ICDW) staff members to map variables from their source systems to tables and data sets used by the modeling team, and to create the metadata needed to document sources and field definitions
  • Responsible for all aspects of data management for the modeling team
  • Assist modeling team members with questions surrounding sources of data files and definitions of data fields
  • Process analysis and process improvement
  • Creating technical and user documentation
  • Identifying innovative techniques and/or developing utilities that increase the speed and efficiency of the modeling tool set in Python/ R/ SAS language
  • Maintaining data and code from different development environments (development to test to production)
  • Efficiently handling and managing massive volumes of data (large datasets with tens of millions of records and/or terabytes of storage)
  • Lead and/or actively participate in SAS/ Python/ R knowledge-sharing activities
  • Understand existing code and processing to troubleshoot and resolve issues as they arise
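A probability-of-default model of the kind mentioned above is typically a logistic regression. A toy sketch with a single hypothetical feature (debt-to-income ratio) fit by stochastic gradient descent; the data and feature are invented for illustration:

```python
import math

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Logistic regression via per-sample gradient ascent on the log-likelihood.
    Each row of X carries a leading 1 for the intercept."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = 1 / (1 + math.exp(-sum(wj * xj for wj, xj in zip(w, xi))))
            w = [wj + lr * (yi - p) * xj for wj, xj in zip(w, xi)]
    return w

def pd_score(w, x):
    """Predicted probability of default for one borrower."""
    return 1 / (1 + math.exp(-sum(wj * xj for wj, xj in zip(w, x))))

# hypothetical borrowers: [1, debt-to-income]; label 1 = defaulted
X = [[1, 0.1], [1, 0.2], [1, 0.3], [1, 0.7], [1, 0.8], [1, 0.9]]
y = [0, 0, 0, 1, 1, 1]
w = fit_logistic(X, y)
```

After fitting, a high debt-to-income borrower scores a much higher default probability than a low one, which is the ranking a CCAR-style PD model feeds into stress projections.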

Confidential, IL

SAS Modeler / Data Scientist

Responsibilities:

  • Big Data Mining for customer acquisition and customer retention
  • Create predictive models based on 200,000,000+ grocery store point-of-sale records to target new customers and reduce customer defection using R & SAS
  • Create a Hadoop/Hive data platform collecting daily web log files from 350+ servers, queried daily through the SQL-like HiveQL
  • Worked on the email and social data to find customer insights
  • Replaced the daily aggregation of data previously generated through MySQL with Hadoop/Hive
  • Participate in design and implementation of Data Mining procedures for both new predictive modeling techniques and improvements in processing efficiencies
  • Provide complete solutions to business problems using data analysis, data mining, and statistical techniques such as chi-square tests, correlation, logistic regression, and decision trees
  • Construct modules in SAS/IML to pass arguments to R
  • Call the functions in R using SAS/IML
  • Used Korn shell scripts for creating batch reports
  • Transfer data to and from the SAS/IML and the R Interface
  • Score full database of 200,000,000+ IDs using the desired predictor variables
  • Improved redemption rate of data mining print programs by 3–4%, or $700,000 on average
  • Communicate details of models and projects to the Sales and clients
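Of the techniques listed above, the chi-square test of independence is simple enough to show end to end. A sketch with a hypothetical 2x2 table of redemption counts for two offers (numbers invented for illustration):

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2x2 contingency table [[a, b], [c, d]]."""
    row = [sum(r) for r in table]
    col = [sum(c) for c in zip(*table)]
    total = sum(row)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / total   # expected count under independence
            stat += (table[i][j] - expected) ** 2 / expected
    return stat

# hypothetical mailing test: rows = offer A / offer B, cols = redeemed / not redeemed
observed = [[120, 880], [80, 920]]
stat = chi_square_2x2(observed)
# compare stat against the 1-df chi-square critical value (3.84 at the 5% level)
```

Here the statistic comfortably exceeds 3.84, so the redemption rates of the two offers would be judged significantly different at the 5% level.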

Confidential, MI

Data Scientist / Statistical Modeler

Responsibilities:

  • Collaborated with multiple departments for their analytical needs in multiproduct environment in areas such as Database Marketing, Segmented Pricing and Attrition Prediction.
  • Create predictive models using logistic regression or decision trees in R and Python
  • Developing frameworks in collaboration with statisticians, business teams, and programmers for strategic and tactical needs through experimental designs.
  • Build custom reports through queries in Hadoop/Hive
  • Used Hadoop for processing large amounts of unstructured or semi-structured data
  • Used XML to express the relationships between forms and use this information to control both the user interface and a validating application
  • Used Mathematical Markup Language (MathML), an XML application, to embed mathematical and scientific equations in Web pages
  • Developed R programs to create a customer mailing list for Direct Mailing and Telemarketing.
  • Extensive Experience in writing shell scripts for executing the batch files on UNIX environment to save the execution time and for automation.
  • Developed, modified, and generated daily Wells Owned Ownership and monthly Wells Owned Loan Level reports with 50+ million records, querying from Oracle using SQL pass-through to summarize business activity, and created financial data sets using DATA steps and DATA _NULL_.
  • Used Korn shell scripts for creating batch reports
  • Effectively prepared and published various performance reports and presentations
  • Created new datasets from raw data files using Import Techniques and modified existing datasets using Set, Merge, Sort, Update, and conditional statements.
  • Involved in Unix shell programming using Bash.
  • Maintained and enhanced existing R programs for marketing campaigns.
  • Conducted significance tests and study response rates for different offers.
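The significance tests on response rates mentioned above usually reduce to a two-proportion z-test. A sketch with hypothetical campaign counts (invented for illustration):

```python
import math

def two_proportion_z(success_a, n_a, success_b, n_b):
    """z statistic and two-sided p-value for comparing two response rates."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # two-sided tail probability from the standard normal CDF (via erf)
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p

# hypothetical campaign: offer A got 150/5000 responses vs. offer B's 100/5000
z, p = two_proportion_z(150, 5000, 100, 5000)
```

With these counts the p-value falls well below 0.05, so offer A's higher response rate would be reported as statistically significant.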

Confidential, IL

Data Scientist/SAS Modeler

Responsibilities:

  • Worked as a SAS programmer on the marketing targeting solutions fulfillment team.
  • Coordinated processes related to marketing campaign eligibility.
  • Participated in campaign setup team meetings.
  • Prepared audit statistics on providers and policyholders using various fraud detection techniques.
  • Wrote Korn-Shell Scripts performing various automation steps like running a SAS code, emailing the output files, removing and copying files, archiving files for historical data and zipping up files to send to external locations as attachments in emails.
  • Generated and analyzed reports on aggregate claims statistics.
  • Developed predictive models to detect anomalies in claims.
  • Extracted data from different sources like claims data mart and text files using SAS/Access, SAS SQL procedures and created SAS datasets.
  • Performed data preparation and transformation using R and SAS procedures to ensure data quality and consistency.
  • Generated reports on providers such as total amount billed, per-subject billing amounts etc. for auditors and investigators.
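Anomaly detection in claims can start as simply as flagging billed amounts far above the mean. A toy sketch with hypothetical amounts (real fraud models use far richer features and robust statistics; the threshold of 2 standard deviations is chosen only to suit this tiny sample):

```python
from statistics import mean, stdev

def flag_outliers(amounts, k=2.0):
    """Flag claim amounts more than k sample standard deviations above the mean."""
    m, s = mean(amounts), stdev(amounts)
    return [a for a in amounts if a > m + k * s]

# hypothetical billed amounts with one inflated claim
claims = [120, 135, 110, 128, 140, 118, 125, 5000]
outliers = flag_outliers(claims)
```

The inflated claim is the only amount flagged; in practice such flags would feed the audit statistics prepared for investigators rather than be acted on directly.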

Confidential, Naperville IL

Data Scientist/ SAS Pricing Strategy Analyst

Responsibilities:

  • Conduct and interpret macro- and micro-pricing modeling and scenarios; evaluate post-pricing analytics for item pricing across categories and price zones, with detailed analysis to determine current and projected profitability
  • Generate advanced-level analyses and reports to estimate the impact of price changes on item sales, volumes, and profitability
  • Wrote Korn-shell scripts to FTP files to the specific path in the UNIX environment and to backup SAS files
  • Collaborate with the Pricing Strategy and Analytics Manager in making product pricing change recommendations to merchant leadership teams
  • Provide analytical pricing support to merchants and planning teams
  • Write and understand SQL queries to mine the data and identify trends

Confidential, Chicago IL

Statistical Analyst/Modeler

Responsibilities:

  • Collecting internal and external data for segmentation development, testing, roll out, and performance monitoring
  • Developing a consumer segmentation framework via predictive modeling (e.g., k-means, factor analysis)
  • Extensively worked on Korn Shell Scripts running certain daily jobs to modify them as per new ETL standards and process flow changes
  • Application of other relevant analytical procedures with the ultimate goal of identifying unique consumer segments with respect to business objectives
  • Partner with business stakeholders to develop appropriate strategy for each consumer segment based on their value, demographics, needs/attitudes, and other relevant characteristics
  • Improving holistic understanding of consumers and members to continue enhancing the ability to provide the right products and services and to engage consumers across an array of media assets
  • Applying an off-the-shelf segmentation product with internal data, including internally developed segments, to solve business problems, implement solutions, and measure success
  • Provide performance monitoring for each segment based on marketing performance, product performance, user experience, and other applicable KPIs
