R Developer - Data Scientist Resume

SUMMARY

  • Data Scientist and R/Python Developer with 8 years of hands-on experience. Strong skills in analytics, statistical modeling, databases, management and predictive analytics, Six Sigma DMAIC methodology, R, Python, SAS, Minitab, SPSS, advanced MS Excel, OpenOffice, Java, and SQL.
  • Expertise and experience in statistical data analysis, including transforming business requirements into analytical models and designing algorithms and strategic solutions that scale across massive volumes of data.
  • Proficient in statistical methods such as regression models, time series, prediction models, hypothesis testing, confidence intervals, and interpretation of machine learning models.
  • Expert in R and Python scripting. Worked with NumPy for statistical functions, Matplotlib for visualization, and pandas for organizing data.
  • Experience using various R and Python packages, such as ggplot2, caret, dplyr, R Shiny, rjson, plyr, SciPy, and scikit-learn.
  • Extensive experience in text analytics, generating data visualizations using R and Python, and creating dashboards using tools like Tableau.
  • Experience with datasets of varying size and complexity, including both structured and unstructured data; piping and processing massive data streams in distributed computing environments such as Hadoop to facilitate analysis.
  • Experience writing code in R and Python to manipulate data for data loads, extracts, statistical analysis, and modeling.
  • Expertise in the banking, financial, software, and telecommunications industries in the areas of analysis, predictive modeling, forecasting, business intelligence, training, and quantitative project management.
  • Professional working experience with machine learning algorithms such as linear regression, logistic regression, Naive Bayes, decision trees, clustering, and Principal Component Analysis.
  • Good understanding of software development life cycle methodologies and processes.
  • Acquired knowledge of Six Sigma DMAIC methodology by working on multiple projects directly with customers.
  • Strong knowledge of CMMI process areas and auditing projects.
  • Experienced in writing complex SQL, including stored procedures, triggers, joins, and subqueries.
  • Experience interpreting business problems and providing solutions using data analysis, data mining, optimization tools, machine learning techniques, and statistics.
  • Work with teams across multiple locations.
  • Experienced in gathering and analyzing business and system requirements, and in preparing case studies that led to project improvements.
  • Strong written and oral communication skills for giving presentations to non-technical stakeholders.

TECHNICAL SKILLS

Languages: Python (NumPy, pandas, Matplotlib, scikit-learn, TensorFlow), R (data.table, dplyr, lubridate, ggplot2, caret), SQL, Transact-SQL, Excel VBA, SAS.

Database: MS SQL Server, Oracle SQL Developer, MS Access.

Data Visualization Tools: Tableau, Power BI

Big Data technologies: Hadoop, Hive, and PySpark

Agile tools: Jira and Confluence

Version control tools: Git, TortoiseGit, Bitbucket

Analytical and Reporting tools: Excel and Alteryx

PROFESSIONAL EXPERIENCE

Confidential

R Developer - Data Scientist

Responsibilities:

  • Develop and maintain complex models for PPNR, credit risk (probability of default, loss given default, and exposure at default), and NIE, leveraging R, Python, RStudio, Oracle SQL, Tableau, and ETL technologies.
  • Establish a quantitative basis for model modification and execution, leveraging mathematics and statistical programming in R and Python modules such as pandas, NumPy, Matplotlib, and scikit-learn.
  • Perform exploratory data analysis using R, and generate graphs and charts for analyzing the data using R and Python libraries.
  • Perform root cause analysis utilizing credit and market risk modeling techniques.
  • Review existing mathematical and statistical methods, and respond to IT service requests for change impact analysis and business requests for analysis.
  • Implement coding changes based on root cause analysis and resolution plans.
  • Understand client requirements and build ETL solutions using Alteryx jobs that implement all business rules.
  • Manage the stress testing platform for DFAST scenarios (adverse and severely adverse).
  • Define and evolve the 14A schedules for monthly and yearly reporting, drawing on extensive capital planning knowledge.
  • Analyze macroeconomic variables for feature selection in model development, and contribute to architectural decisions at the department and bank-wide level using mathematical techniques.
  • Collaborate with SIT, UAT, and development teams to build implementation plans and strategy.
  • Transform non-standard file formats into a standard format consumable by the model execution system using Alteryx Designer tools.
  • Perform complex data blending with Alteryx: extract data from source systems (databases, Excel files, etc.), then develop and test Alteryx workflows.
  • Work closely with customers, cross-functional teams, research scientists, software developers, and business teams in an Agile/Scrum environment to put data models and algorithms into practice.

Environment: R, Alteryx, Java, Python, GitHub, Oracle 11g, SQL, Excel

Confidential

Data Scientist - R/Python

Responsibilities:

  • Worked as part of the credit and quantitative team on credit risk analyses to determine the factors that explain the risk associated with each customer.
  • Built a binary logistic regression model to classify customers as potentially bad.
  • Derived credit scores and credit score assessments based on statistical analysis of historical data.
  • Forecasted customer transactions using time series analysis with R and Python libraries.
  • Developed a character and capacity model to predict customers' capacity to repay based on various capacity and character indicators.
  • Worked with datasets of varying size and complexity, including both structured and unstructured data; piped and processed massive data streams in distributed computing environments such as Hadoop to facilitate analysis.
  • Worked on analyzing Hadoop and was responsible for building scalable distributed data solutions using Hadoop.
  • Performed model development, model maintenance, and model validation.
  • Built time series models, including ARIMA, depending on trend, seasonality, and cyclicity.
  • Ran predictive models on banking data and reviewed quotes to support decisions based on the requested amount and quantitative information.
  • Gave presentations to management and project team members on the models, and guided the project team on model maintenance and forecasting.
  • Provided training to management and project team members on statistical concepts and models.

Environment: R, Python, Java, Hadoop, Oracle 11g, SQL, PostgreSQL, Excel

Confidential

Data Scientist - R/Python

Responsibilities:

  • Carried out specified data processing and statistical techniques, such as sampling, estimation, hypothesis testing, time series, correlation, and regression analysis, using R.
  • Applied various data mining techniques: linear and logistic regression, classification, and clustering.
  • Worked on collecting large datasets using Python scripting and Spark SQL.
  • Studied the existing ETL (extract, transform, and load) workflows and assisted in building new workflows.
  • Built scalable distributed data solutions using Hadoop.
  • Loaded data from the UNIX file system into HDFS.
  • Performed data analysis and data profiling using complex SQL queries on various source systems, including Oracle 10g/11g and SQL Server 2012.
  • Derived independent variables from financial ratios for modeling and forecasting the data using time series analysis in R and Python.
  • Developed a character and capacity model to predict customers' capacity to repay based on various capacity and character indicators.
  • Performed model development, model maintenance, and model validation.

Environment: R, Python, Java, Hadoop, UNIX, Oracle 11g, SQL, PostgreSQL, Excel