Model Validation Consultant Resume
OH
SUMMARY
- Over six (6) years of experience in quantitative research and analysis; hands-on expertise in data science, machine learning algorithms, stochastic processes and modeling, data mining, model development, validation, and scoring/projections in R, Python and SAS environments.
- Domain knowledge and experience in PPNR model development (i.e., panel, time-series and cross-sectional regression-based models for projections under various macroeconomic scenarios: baseline, adverse and severely adverse) and model validation; advanced statistical tests of model assumptions; retail/wholesale credit risk modeling (PD, LGD, EAD); VaR; stress testing, etc. under the CCAR framework; and general financial instruments and derivatives.
- Domain experience and expertise in developing and executing standard SQL queries and T-SQL stored procedures. Also has a good understanding of data architecture and databases.
- Deep domain expertise in supervised and unsupervised machine learning algorithms such as Random Forest, Gradient Boosting, Elastic Net, Support Vector Machine (SVM), clustering techniques, K-Means, K-NN, Principal Component Analysis (PCA), and logistic regression (with L1 and L2 penalties), and in statistical regression methods such as OLS, GLM, GLS, GEE, survival analysis, (nested) mixed effect models, and generalized logit models with random and fixed effects, including tests of model assumptions, in R, Python and SAS.
- Advanced knowledge and experience in time-series/panel regression modeling encompassing exponential smoothing, AR(p), MA(q), ARCH(q) and GARCH(p,q), co-integration, stationarity tests, etc.; Bayesian statistical inference and modeling encompassing parameter estimation and MCMC simulation techniques such as Gibbs sampling and Metropolis-Hastings.
- Practical experience with web development using HTML, CSS and JSON; web scraping with the Python BeautifulSoup and re modules; sentiment analysis/text mining; and data visualization with ggplot, R Shiny, d3.js, etc.
- Domain knowledge working on structured and unstructured data with Hadoop (HDFS and MapReduce) and the Hadoop ecosystem encompassing Impala, Oozie, etc., and with conventional RDBMS such as MS SQL Server. Also runs R and Python MapReduce jobs through Hadoop Streaming for predictive analytics and cloud computing.
- Experience with cloud computing infrastructure (e.g., Amazon Web Services EC2, Elastic MapReduce) and with design considerations for scalable, distributed systems.
- Draws on experience in all aspects of analytics/data warehousing solutions (database issues, data modeling, data mapping, ETL development, metadata management, data migration and reporting solutions).
TECHNICAL SKILLS
Hadoop Ecosystem: HDFS (Hadoop Distributed File System) and MapReduce, Hive, Spark, HUE
Analytical/Modeling Software: R/RStudio, Python (PyCharm/IPython), PROC SQL, SAS, SPSS, MATLAB
Cloud Computing Infrastructure: Amazon Web Services (AWS) EC2
Programming Languages and Operating Systems: Python, C++, Windows family, familiarity with Linux
Computational Tools & Management Tools: MS Excel, Macros & VBA, PowerPoint, MS Outlook, MS Project, SharePoint, XML
BI Tools: Spotfire Cloud
Relational Database Management System: MS SQL Server, Oracle, Teradata, Netezza, Aginity Workbench, MS Access
Version Control: Tortoise SVN, Git
Web Development: HTML, CSS, JSON, JavaScript
PROFESSIONAL EXPERIENCE
Confidential - OH
Model Validation Consultant
Responsibilities:
- Check the completeness, accuracy and appropriateness of the data used in model development. Validate the model's input data against primary sources and verify that all relevant drivers have been gathered and collected to meet the model's core purpose.
- Evaluate whether the selection and structure of variables are consistent with similar industry models and the modeling objective.
- Assess and test assumptions underlying Probability of Default (PD) and regression models, covering autocorrelation, multicollinearity, heteroskedasticity, stationarity, normality, linearity/model specification, etc.
- Perform independent conceptual and theoretical review, benchmarking, and independent implementation of the models employed.
- Quantify model risk and report the findings.
- Interact and collaborate closely with the model developers in the first line of defense.
- Evaluate robustness and stability of models; perform sensitivity analysis, replication and benchmarking.
- Perform analysis to assess models' performance and model uncertainty.
Environment: (SAS/BASE, SAS/MACRO, SAS/STAT, SAS/ETS, SAS/ODS, SAS/PROC SQL), MS SQL, SAS Enterprise Guide
Confidential - MA
Statistician/SAS Analyst Consultant
Responsibilities:
- Design and develop statistical and data management programs/applications in a SAS environment that support and analyze complex healthcare delivery systems (e.g., Aetna claims, MarketScan healthcare claims, CMS claims and other claims-based or electronic medical records databases), healthcare payment systems and associated administrative systems.
- Manage and analyze large administrative data sets from health insurers. Ensure integrity of data collection, data review, data compilation, and analysis techniques.
- Create analytical files to be used for analysis related to cost, utilization rates, and quality of administrative data.
- Work with Program Managers and other team members to analyze and interpret research data to help create appropriate algorithms and programs, and work independently to implement appropriate analysis and provide accurate documentation.
- Responsible for successful completion of all analytical and data-related duties; maintain timelines and determine proper summary statistics, report formats and all other analysis considerations.
- Develop complex multi-level or nested mixed effect models and generalized logit models, and perform partitioning and analysis of variance components at different levels of nested random effects in a SAS environment.
- Repurpose and execute AHRQ SAS programs on hospital discharge or inpatient claims data to generate quality measures or indicators at different levels of stratification: Tax Identification Numbers (TINs), Accountable Care Organizations (ACOs), etc. Perform geocoding and reverse geocoding in a Python environment.
Environment: (SAS/BASE, SAS/MACRO, SAS/STAT, SAS/ETS, SAS/ODS, SAS/PROC SQL), MS SQL, Unix, SAS Enterprise Guide, SAS Enterprise Miner, R/RStudio, Python
Confidential, New York City - NY
Data Scientist Consultant
Responsibilities:
- Technical documentation and validation of Credit Risk/Loss models and Pre-Provision Net Revenue (PPNR) models as part of Comprehensive Capital Analysis and Review (CCAR) implementation. This technical documentation and validation covers the wholesale Probability of Default (PD), Loss Given Default (LGD), Expected Loss (EL) and other regression models for revenue forecasts or projections.
- Review white papers on constructed PPNR regression-based models for projections and wholesale credit risk modeling to aid in productionizing these models.
- Implement models to comply with CCAR architecture and Basel I, II, III regulations.
- Implement predictive models in Python and R as atoms (executable units of code that read data from a table and output a model object to be stored in a table) for onward deployment in Model Manager for execution. Leveraged advanced econometric modeling techniques (panel and time-series regression, etc.) to build the atoms and tested model assumptions (including the Gauss-Markov assumptions) to ensure robustness.
- Build and implement the production-ready code or atoms with APIs or R packages created for in-house use in model development.
- Ensure the model implementation is aligned with the model’s purpose and its underlying analytical approach including a thorough analysis of input data, review of the underlying code used in implementation and model performance.
- Perform independent model validations. This involves assessing the model's overall suitability for its intended purpose and evaluating the model's mathematical and statistical theory.
Environment: R/RStudio, Python/PyCharm, Hadoop, SQL Developer, Tortoise SVN, Linux, SharePoint.
Confidential, Woburn MA
Data Scientist Contractor
Responsibilities:
- Develop complex SQL code or queries based on requirements of (medical) edits on claims with potential dollar savings for company clients like HIP and United Health Service (UHS). The requirements in the edits specify procedure and diagnosis codes and other NCCI regulations for which administered claims are denied or billed.
- Develop stored procedures for good edits with high dollar savings, using the code written to extract claims with relevant and potentially high dollar savings. With the appropriate input and output parameters specified, the stored procedures are run against company databases like COSMOS and HIP to obtain the claims with possible dollar savings.
- Conduct grid search, tuning of model parameters for many classifiers, and variable selection with robust methodologies as part of model development. Also perform data cleansing, transformation and data quality checks prior to model development.
- Built predictive models in an R environment to determine whether DRG claims are audited or not, and to identify whether a given provider facility is paid for some or all of its healthcare insurance claims according to DRG, case rate or per diem contract terms. Leveraged machine learning algorithms like Random Forest (RF), Neural Networks (NN), Elastic Net (EN), Stochastic Gradient Boosting, Partial Least Squares Discriminant Analysis (PLS-DA) and Regularized Logistic Regression to build predictive models on 3.5 million healthcare records with over sixty features/variables to decide whether a facility claim type has been paid or not.
- Manage and create EC2 instances on Amazon Web Services (AWS); host, maintain and scale up the memory capacity of RStudio on AWS so that RStudio can run advanced models on big data without memory limitation issues.
Environment: R, Python, SQL Server, Amazon Web Services (AWS), MS Excel.
Confidential, MA
R Statistical Consultant
Responsibilities:
- Provide deep data mining skills and techniques to generate association rules among a variety of Dunkin brands for the project, using R and Oracle. This helps the company in product placement and cross-sell marketing campaigns to boost revenue and profitability.
- Code the Apriori algorithm for market basket analysis using PL/SQL in an Oracle environment.
- Manipulate POS data and use scripts run on Oracle to extract transaction data on Dunkin brands, clean the data and transform it into a form suitable for sample market basket analysis using the arules package in R to generate support, confidence, lift and association rules among brands.
Environment: RStudio, Oracle PL/SQL, Oracle OBIEE, Oracle DB, Oracle Client tools
Confidential, CO
Data Scientist /R Programmer
Responsibilities:
- Built a response model for a direct marketing campaign using PROC LOGISTIC in SAS, with both forward and backward stepwise regression for variable selection (over hundreds of variables) at SLE = 0.001 and SLS = 0.001. The model was built on a training set (70% of the data) and validated on a validation/test set (30% of the data), and reported a 78.6% Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve, against 50% for a random classifier. This same model was replicated in an R environment for comparison with other models using advanced methods like Elastic Net with a binomial family and support vector machine.
- Scored the model on new data and used PROC RANK or a block of SAS MACRO code to partition the predicted outcomes or responses into deciles for effective targeting in direct marketing campaigns.
Environment: RStudio, Base SAS, SAS Enterprise Guide, Python (NumPy, SciPy), MS Excel, MS Word and Outlook, Hadoop, Netezza, Teradata, MS SQL
Confidential
Statistical Research Analyst
Responsibilities:
- Cleaned and entered data collected from field work into software packages like SPSS and SAS, and used statistical analytic tools to provide descriptive and analytical reports of the survey.
- Prepared quarterly progress reports and weekly status updates for the analysis of the data, relying heavily on the MS Office package and Visio.
- Utilized multivariate statistical techniques like MANOVA, factor analysis, cluster analysis and discriminant analysis to thoroughly transform the data and to conduct complex hypothesis tests, mostly at the five (5) percent level of significance, using SPSS or SAS.
- Utilized formulations of statistical methods to investigate hypotheses and conducted independent analysis requiring formal statistical tests using SAS, SPSS and R. Through this process we were able to estimate the relevant sample sizes of the respondents to be interviewed for the survey.
Environment: MS Visio, MS Office, SPSS, Excel, Base SAS, SAS Enterprise Miner
Confidential
Trainee Accountant
Responsibilities:
- Performed variance analysis, cost estimates, budget standard update, journal entries and account reconciliation.
- Completed the preparation, analysis and interpretation of financial statements, as well as reconciliation of bank statements on a monthly, quarterly and annual basis.
- Educated clients, especially those in small-scale enterprises, on basic accounting methodologies, proper bookkeeping and other accounting-related activities.
Environment: Excel, XML, MS Word, MS Visio.