Machine Learning And Data Engineer Resume
VA
SUMMARY
- Over 17 years of experience in Data Science & Machine Learning, in areas related to Risk Management & Statistical Analysis.
- Extensive working experience in areas including Statistical Analysis, Risk Management, Machine Learning, Predictive Modelling, data analysis, profiling, visualization and mining with large data sets of structured and unstructured data.
- Python programming (8+ years), with 3+ years of Machine Learning in Python (IDEs: Spyder/Jupyter Notebook), including deep learning techniques for image recognition and natural language processing (SciPy, PyTorch, TensorFlow, Theano, NLTK, Keras, etc.).
- 10+ years of experience programming & modelling in SAS Base, Enterprise Guide, Visual Analytics and SAS Miner.
- 10+ years of experience in statistical analysis, data mining, data analytics, data manipulation and structure creation for statistical modelling (with ETL processes and Data-Mart design and deployment).
- 10+ years deploying data visualizations with SAS, Python, Excel, Tableau & PowerBI.
- Creation of solutions by translating business requirements into data mining, analytical models, algorithms, ad-hoc solutions and reporting processes that scale to massive volumes of structured and unstructured data.
- Extensive experience in Data Analysis and modelling, Data Wrangling & Profiling, Data Integration, Migration & Governance, with metadata and semantic modelling.
- Broad experience in creating data structures and producing predictive analytics models.
- Developed and deployed supervised ML models, as well as unsupervised and deep learning models. High performer where critical, analytical and logical thinking is required to face situations and problems.
- Comprehensive theoretical knowledge and skills for simulating and modelling probability distributions, stochastic processes and advanced statistical models. Knowledge and working experience with design of experiments, sample design and sampling techniques.
- Thinker who explores solutions outside conventional, established approaches. Highly developed abstract thinking, useful for finding fresh solutions to posed problems and challenges. Experience analyzing and redesigning processes to reduce costs and increase productivity. Capable of working independently as well as in a teamwork environment.
- Use of different DB technologies: SQL, Oracle, MySQL, SAS, MongoDB.
- Azure Databricks
- Data Science with focus on Analytics, Statistics, and Machine Learning
- Data Analytics & Data Science (Python, Spark, Scala, & SAS)
- ML and Deep Learning with TensorFlow, PyTorch, scikit-learn, Theano, etc.
- SQL/NoSQL databases and structures
- Statistical analysis, predictive analytics, stochastic modelling
- Data Cleansing, data quality assurance.
- Data wrangling or munging.
- Data mining and ingestion; Extract, Transform & Load (ETL)
- Business Intelligence, Dashboards and Reporting Skills.
- Strategic Planning & Execution
- Requirement Gathering & Analysis
- Project Planning & Execution
- Relationship Management
- Team Leadership
TECHNICAL SKILLS
Data Analytics/Data Science: Predictive Analytics, Data Modelling, Data Cleansing, Data Visualization, Data Mining; Machine learning models for predicting the evolution (growth, sustainability or decay) of social networks or price clubs.
Business Intelligence & ETLs (10+ years): Tableau, Qlik View, PowerBI, MatplotLib, SAS Visual Analytics
Traditional Databases: MS SQL Server, SAS SQL, PostgreSQL, MySQL
Web Analytics: Microsoft Azure, Google Cloud Services
Analytics, Statistics, Statistical Modelling & ML: Python, PySpark, Scala, R, SAS Miner, STATA, SPSS, JMP, Clementine, Weka, EViews, H2O.ai
Domain Analytics: Insurance, Finance, Banking, Telecommunications, Supply Chain Management, and Public Services verticals.
PROFESSIONAL EXPERIENCE
Confidential, VA
Machine Learning and Data Engineer
Responsibilities:
- Support ML models that prevent SIM-swap and port-out fraudulent transactions.
- Provide exploratory data analysis (EDA) for new feature generation.
- Develop geolocation algorithms for fraud analytics and to tag suspicious transactions (probable fraud).
- Enhance the data collection pipeline, along with data cleansing and data wrangling, for analytics and modelling.
- Monitor features to prevent model decay.
- Retrain ML models as required with the newly generated features.
- Support diverse analyses and EDAs requested by business areas, including NLP sentiment analysis and automated report generation.
Environment: Machine Learning in Azure Databricks (TensorFlow, H2O), Python, PySpark & Scala. IDEs: Databricks Notebook / Spyder / PyCharm. BA tools: PowerBI. DB tools: Snowflake, Delta Tables, MongoDB.
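As an illustration of the feature monitoring mentioned above, a minimal sketch of a population stability index (PSI) check, a common way to flag feature drift before model decay. The data and thresholds here are illustrative, not from the actual project:

```python
import numpy as np

def population_stability_index(expected, actual, n_bins=10):
    """PSI between a feature's training-time distribution and live data.
    Values above ~0.2 conventionally flag significant drift."""
    # Bin edges from the expected (training-time) distribution.
    edges = np.percentile(expected, np.linspace(0, 100, n_bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf  # cover out-of-range live values
    exp_frac = np.histogram(expected, bins=edges)[0] / len(expected)
    act_frac = np.histogram(actual, bins=edges)[0] / len(actual)
    # A small floor avoids log(0) in empty bins.
    eps = 1e-6
    exp_frac = np.clip(exp_frac, eps, None)
    act_frac = np.clip(act_frac, eps, None)
    return float(np.sum((act_frac - exp_frac) * np.log(act_frac / exp_frac)))

rng = np.random.default_rng(0)
baseline = rng.normal(0, 1, 10_000)   # feature at training time
drifted = rng.normal(0.5, 1, 10_000)  # same feature in production, shifted
print(population_stability_index(baseline, baseline))  # near zero: stable
print(population_stability_index(baseline, drifted))   # above 0.2: decayed
```

In a production pipeline this check would run per feature on each scoring batch, with retraining triggered when the PSI crosses the chosen threshold.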
Confidential, Ashburn VA
Senior Data scientist and Machine Learning
Responsibilities:
- Built a pricing model to suggest rates and discounts to corporate customers using historical data and deep learning algorithms (Smart Pricing Project).
- Enhanced data collection procedures, along with data cleansing and data wrangling, for analytics and modelling.
- Helped build segments of clients that are likely and unlikely to upgrade services, to target sales pushes. Automated a process to gather feedback from the sales force and corporate clients.
- Presented findings and recommendations to directors about main results achieved.
- Fulfilled all data science duties as demanded by the end client.
- Data engineering: automated data storage with Python processes that collect inputs on deals for services and products of Confidential corporate customers (including financial parameters, prices, costs and other elements used to measure a deal's profitability, along with details of the infrastructure and devices deployed at the client sites). Created the Python feed processes that store the new data in structured Oracle databases. Generated Python models that assist business users' calculations and persist the results to the database.
- Supported the digitization of customer files, using open-source OCR tools such as those in the OpenCV Python library (table extraction). The CCRD project was a winning implementation.
- Used NLP (spaCy/NLTK) to correctly interpret the table mappings contained in the contracts. Researched sentiment analysis over contract terms to generate propensity models for assessing attrition and renewals.
- Supported the administration of a MongoDB instance used to store pandas DataFrames that help populate xlsx files from dictionaries stored in MongoDB.
Environment: Machine Learning in Azure Databricks (TensorFlow, H2O, PyTorch), Python, PySpark & Scala, MS Access, Excel, and Oracle SQL. IDEs: Jupyter Notebook / Spyder / PyCharm.
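A minimal sketch of the dictionaries-to-DataFrame flow described above: documents stored as plain dicts (as they would come back from a MongoDB collection) are rebuilt into a pandas DataFrame and serialized back to records. All names and figures are hypothetical, and the live PyMongo connection is omitted:

```python
import pandas as pd

# Hypothetical deal documents, as retrieved from a MongoDB collection
# (each document is a plain dict; column names are illustrative).
documents = [
    {"deal_id": 1, "customer": "ACME", "price": 1200.0, "cost": 950.0},
    {"deal_id": 2, "customer": "Globex", "price": 800.0, "cost": 610.0},
]

# Rebuild a DataFrame from the stored dictionaries and derive a column.
deals = pd.DataFrame(documents)
deals["margin"] = deals["price"] - deals["cost"]

# The reverse direction: the DataFrame serialized back to dicts, ready
# for collection.insert_many(...) or for writing to xlsx with
# deals.to_excel("deals.xlsx") when openpyxl is available.
records = deals.to_dict("records")
print(records[0]["margin"])  # -> 250.0
```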
Confidential
Senior Machine Learning Consultant
Responsibilities:
- Build optimized predictive models using machine learning algorithms.
- Enhance data collection procedures, along with data cleansing and data wrangling, for analytics and modelling.
- Simulation algorithms for complex system visualization.
- Present findings and recommendations to stakeholders to improve the overall credit risk management framework and strategy.
- Built segmentation utilizing both internal bank behavioral scores and external bureau attributes to differentiate the customer base.
- Fulfilled all data science duties for consulting clients.
- For Confidential as client:
- Implemented and designed the architecture of the Credit Risk Data-mart.
- Developed complex SAS Macros to simplify SAS code and effectively reduce coding time, cost and increase productivity.
- Built forecast models to provide simulated reserve scenarios for credit granted under changes in lending policies.
- Sensitivity analysis: if risk criteria/score cuts are lowered or increased, what is the impact on the P&L; pre/post-treatment analysis; cohort analytics.
- Employed ML techniques to segment and risk-profile the customer database using internal scores and external bureau attributes.
- Critical evaluation of the credit risk models in production; built the reporting process for monitoring them.
- Proposed and developed a reinforcement learning model of the credit line utilization experience, comparing styles of credit handling and aiming to identify stimuli that help the client keep using their credit while maintaining a highly satisfactory user experience.
- Deep-dive reviews of concepts and program code of new models to be moved into production.
- Created rich, detailed Tableau and PowerBI dashboards of retail credit products to visualize and measure credit origination and credit portfolio behaviour shifts and trends.
- Built machine learning models to modify interest rates in a risk-based-price framework (in pre-production environment).
- Built a machine learning model to make segments for scorecard assignment (in production).
- Built a machine learning model for calculating a probability of payment (still under pilot testing with H2O.ai).
- Fulfilled data analyst and scientist duties for credit risk management area.
Environment: SAS Base, ETS, SQL, STAT (HP procedures), Connect, Visual Analytics, SAS Miner, Machine Learning, statistical techniques, SQL Server, Oracle, Python (sklearn, TensorFlow Keras, TensorFlow Hub, pandas, seaborn, and others; IDEs: Jupyter Notebook / Spyder), Tableau, PowerBI, Microsoft Azure Databricks (PySpark), H2O.
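A hedged sketch of the kind of probability-of-payment / probability-of-default modelling listed above, using scikit-learn logistic regression on synthetic internal and bureau scores. The feature names, coefficients and data are illustrative, not the production model:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n = 5000
# Hypothetical features: an internal behavioral score and a bureau score.
behavior = rng.normal(600, 50, n)
bureau = rng.normal(650, 60, n)
# Synthetic default flag: worse scores imply a higher default probability.
logit = 8 - 0.008 * behavior - 0.006 * bureau
pd_true = 1 / (1 + np.exp(-logit))
default = rng.binomial(1, pd_true)

X = np.column_stack([behavior, bureau])
model = LogisticRegression(max_iter=1000).fit(X, default)

# Probability of default for a weak applicant vs a strong applicant.
weak, strong = [[500, 520]], [[700, 760]]
print(model.predict_proba(weak)[0, 1])    # higher PD
print(model.predict_proba(strong)[0, 1])  # lower PD
```

In a scorecard setting the fitted log-odds would then be scaled into points; the monotonic relation between scores and PD is what the segmentation and score-cut analyses rely on.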
Confidential
Data Scientist, Risk Management and Machine Learning Consultant
Responsibilities:
- Data wrangling for scorecard development (Python).
- Scorecard development for consumer credit (Python); the client's focus was granting student loans.
- Ingested speech data and stored it in MongoDB GridFS.
- Processed datasets for NLP aimed at speech recognition with TensorFlow & NLTK.
- Processed and analyzed the outputs to evaluate their usability for text classification of the generated captions, assessing a re-evaluation of the credit decision against the credit interviews recorded in the existing process.
- Employed several built-in techniques: RNN, LSTM, CNN, reinforcement learning.
Environment: Machine Learning, Statistical techniques, Python: PyTorch/scikit-learn (pre-processing, model.fit, cross validation, metrics) / TensorFlow, numpy, pandas, matplotlib, PyMongo.
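A small illustrative example of text classification over generated captions, as in the credit-interview work above, using a TF-IDF plus logistic regression pipeline from scikit-learn. The captions and labels are invented stand-ins for real transcripts:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical transcribed-interview captions with a favorable/unfavorable
# label; both texts and labels are illustrative, not real data.
captions = [
    "stable income permanent contract savings account",
    "steady salary long employment history",
    "owns home regular income good references",
    "missed payments unemployed no collateral",
    "irregular income recent defaults high debt",
    "no savings short employment gaps in history",
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = favorable signal, 0 = unfavorable

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(captions, labels)
print(clf.predict(["permanent contract and stable income"]))  # favorable
print(clf.predict(["unemployed with missed payments"]))       # unfavorable
```

A real pipeline would train on many transcripts and calibrate the decision threshold against the existing credit decisions it is meant to re-evaluate.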
Confidential, New York NY
Data Analytics and Data Science/ Data Engineering & Financial Risk Modelling
Responsibilities:
- Automated Investment Income reporting, moving from occasional to daily production, with projection to end of month.
- Automated the portfolio performance attribution, moving from occasional to daily production.
- Automation of reporting services for recurrent committees in Python and Tableau.
- Created an Asset Liability Management model for the portfolios of each business line, incorporating simulation of pre-trading analysis of securities.
- Implemented the market risk models for Confidential in Mexico and supported the implementation for LATAM.
- Assisted with the Basel II internal model for the FINMA regulator at Confidential in Switzerland.
- Supported the claims-forecasting process to project the cash flows and liquidity needed to cover customer payments under Confidential Life products and the Confidential-Santander JV in Mexico and LATAM, under the SLA.
- During 2015, under a special exchange program, I worked in the Corporate Centre at Confidential Switzerland, validating the Solvency II internal model implementation with Python, SAS and Matlab.
Environment: Data Visualization, Machine Learning, statistical techniques, Tableau, SAS Base, Python (pandas, connection to the Microsoft Jet engine, Excel, ETL, sklearn/scikit-learn), Matlab, Weka, EViews.
Confidential
Data analytics & Risk Management Consultant
Responsibilities:
- Designed the data architecture to feed the credit risk Data-mart; programmed the Stored Procedures in SAS and SQL.
- Improved and implemented measures and indicators ensuring adequate control and monitoring of intermediary institutions, establishing mechanisms to mitigate loan losses.
- Modeled the probabilities of default with logistic-type regression, and fit gamma distributions to estimate recoveries and calculate the expected shortfall.
- Developed in Python a deployable application, fed with different credit products' data, that used SVM & cluster analysis to characterize high-risk segments of the credit portfolio.
- Analysed the granting of new credits and used historical information to fit time series models (ARIMA) to forecast credit reserve behaviour.
Environment: Machine Learning, statistical techniques, SAS Base, ETS, SQL, STAT, Connect, Visual Analytics, SAS Miner, IBM SPSS/Clementine, SQL Server, Oracle, Python (sklearn, TensorFlow Keras, Kaggle, TensorFlow Hub, pandas, seaborn), MS Office application development with redistributable versions of Excel and Access.
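A sketch of the gamma-fit-and-expected-shortfall calculation described above, on synthetic recovery data. The exposure figure and gamma parameters are hypothetical:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
# Hypothetical recovery amounts on defaulted loans (synthetic data).
recoveries = rng.gamma(shape=2.0, scale=1500.0, size=2000)

# Fit a gamma distribution to the observed recoveries (location fixed at 0).
shape, loc, scale = stats.gamma.fit(recoveries, floc=0)

# Losses = exposure minus recovery; expected shortfall at 95% is the
# mean loss beyond the 95% loss quantile (i.e. the worst 5% of outcomes).
exposure = 10_000.0
losses = exposure - recoveries
var_95 = np.quantile(losses, 0.95)
es_95 = losses[losses >= var_95].mean()
print(shape)           # recovers a value near the true shape of 2.0
print(es_95 > var_95)  # ES is always at least as large as VaR
```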
Confidential, Cincinnati OH
Head of Statistical Modelling and Research
Responsibilities:
- Served customers to establish, from a statistical perspective, the proper attention to their information and knowledge needs.
- Advising account managers to provide accurate and timely interpretation of the statistical analysis delivered to final customers in their reports.
- Application of statistical modelling and machine learning: sampling; multivariate analysis (factorial, Fisher discriminant, cluster classifiers, CHAID and other segmentation techniques); structural equations; generalized linear models; etc.
- Python programming of models for time series forecasts, conjoint analysis, price sensitivity and customer satisfaction with diverse applications.
- Programmed in SAS a decision-tree model with bootstrapping on survey data to calculate price elasticity behaviour and forecast unobserved price sensitivity (quite similar to a random forest, a technique unknown to me at the time).
- Programmed in Python a Kalman filter model to smooth the behaviour of through-time observations and detect anomalous data fluctuation of survey collected data.
- Built designs and sampling frames that help decrease error rates and provide greater reliability to the field results.
- Designed and programmed in SAS statistical combinatorial models to calculate market reach; the idea exploited efficient use of SAS Data-step processing to generate product line combinations efficiently, beating other R, Matlab and Excel implementations and producing outputs in hours rather than days.
- Construction of a clinical data warehouse and maintenance/updates of clinical trial patient data.
- Used SAS Enterprise Guide to access clinical data, import it in SAS format, build predictive models, and produce regular reports.
- Prepared survey data to collect simple information from health patients.
- Predictive modelling using longitudinal data to monitor HIC health patients, with reporting.
- Sample size calculation, writing statistical analysis plans, writing the statistics part of protocols, performing randomization and hypothesis testing.
- Expertise in the design and analysis of all areas of clinical development, from basic R&D through translational science, and all phases of global clinical trials, dealing with global regulatory agencies such as the FDA.
- Designed in Python a data imputation model based on conditional probability and stochastic assignment to fill missing data.
Environment: Machine Learning, statistical techniques, SAS Base, ETS, STAT (HP procedures), Connect, Visual Analytics, SAS Miner, SQL Server, Oracle, Python (sklearn, TensorFlow Keras, Kaggle, TensorFlow Hub, pandas, seaborn), R, Matlab, IBM SPSS (or PASW).
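A minimal one-dimensional version of the Kalman-filter smoothing mentioned above, assuming a random-walk latent level; the noise parameters and data are illustrative, not from the original survey work:

```python
import numpy as np

def kalman_smooth(observations, q=0.01, r=4.0):
    """1-D random-walk Kalman filter: the state is the latent survey level,
    q the process noise variance, r the observation noise variance."""
    x, p = observations[0], 1.0   # initial state estimate and its variance
    filtered = []
    for z in observations:
        p = p + q                  # predict: variance grows by process noise
        k = p / (p + r)            # Kalman gain
        x = x + k * (z - x)        # update toward the new observation
        p = (1 - k) * p
        filtered.append(x)
    return np.array(filtered)

rng = np.random.default_rng(1)
true_level = np.linspace(50.0, 55.0, 200)        # slowly drifting metric
noisy = true_level + rng.normal(0, 2.0, 200)     # noisy periodic readings
smooth = kalman_smooth(noisy)                    # r matches the 2.0**2 noise
# The filtered series tracks the truth more closely than the raw data,
# which also makes anomalous fluctuations stand out against the residuals.
print(np.abs(smooth - true_level).mean() < np.abs(noisy - true_level).mean())
```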
Confidential, Columbus, GA
Statistics and Risk Modelling manager and developer
Responsibilities:
- Risk management: built knowledge of the loan portfolio and its trends.
- Developed and kept track of credit origination models.
- Monitored and devised new strategies for portfolio management models in TRIAD (a Fair Isaac Corporation product, mainly for managing credit lines, transaction authorizations and collection queues).
- Provided statistical support to other areas that needed it, such as CRM.
- Designed data-mart for risk and CRM analyses.
- Selected and programmed the models to calculate expected loss under the Basel II Accord, and ran Monte Carlo simulations for stress testing.
- Designed all champion-challenger tests for origination and portfolio management.
- Generated various reports to understand the contribution to the risk of the various sales channels, by provider and geographic location.
- Detected, using data mining techniques, various anomalies in the credit portfolio which were related to incorrect settings in the banking core system.
- Established new TRIAD strategies to contain higher-risk customers within the transaction authorization strategies, employing machine learning techniques (mostly decision trees).
- Developed, programmed and implemented batch-process strategies to increase credit limits for all portfolio segments (done with Oracle, SAS and TSYS). Set new collection queuing to secure better recovery.
Environment: Machine Learning, Statistical techniques, SAS BASE, ETS, STAT (HP procedures), Connect, Visual Analytics, SAS Miner, ORACLE, R, Matlab, IBM SPSS.
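A compact sketch of Basel II-style expected loss with Monte Carlo stress testing, as in the work above. The PD, LGD and EAD values are synthetic, not portfolio data:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical retail portfolio: per-loan PD, LGD and exposure (EAD).
n_loans = 1000
pd_ = rng.uniform(0.01, 0.10, n_loans)     # probability of default
lgd = rng.uniform(0.3, 0.6, n_loans)       # loss given default
ead = rng.uniform(1_000, 20_000, n_loans)  # exposure at default

expected_loss = float(np.sum(pd_ * lgd * ead))  # Basel II-style EL

# Monte Carlo stress testing: simulate default indicators many times and
# examine the tail of the resulting portfolio loss distribution.
n_sims = 5000
defaults = rng.random((n_sims, n_loans)) < pd_        # per-loan Bernoulli draws
losses = defaults.astype(float) @ (lgd * ead)         # portfolio loss per scenario
stress_loss_99 = float(np.quantile(losses, 0.99))

print(expected_loss)   # the mean of the simulated losses converges here
print(stress_loss_99)  # 99th-percentile stress loss, well above the mean
```

A richer version would add default correlation (e.g. a one-factor model) rather than independent draws, which fattens the loss tail further.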
Confidential, Des Moines, IA
Senior Quantitative Analyst
Responsibilities:
- Optimized in Excel-Python the weekly investment committee reporting process, reducing preparation time from 2 days to 1.5 hours.
- Reviewed and corrected existing SAS models so they would not use artificial variables as independent inputs (this use was not auto-regressive in nature).
- Programmed in Python new Bayesian projection models to predict exchange rate levels and to value exchange-rate derivatives.
- Implemented a user-friendly interface in Excel to simulate portfolio changes calculated in Python.
Environment: STATA, SPSS, Machine Learning, statistical techniques, SAS Base, ETS, STAT (HP procedures), Connect, Visual Analytics, SAS Miner, Oracle, Python (pandas, connection to the Microsoft Jet engine, Excel, ETL, sklearn/scikit-learn), Matlab.
Confidential
SAS Programmer & Risk Modelling Analyst
Responsibilities:
- Risk management: designed and programmed in SAS Base and macro language the ALM Monte Carlo model to optimize the expected net present value of shareholders' capital.
- Programmed in SAS the CreditMetrics methodology and adapted it to the local business. Analysed and corrected parametric VaR models. Programmed a new SAS Monte Carlo model to replace the parametric VaR calculation.
Environment: SAS Base, ETS, SQL, STAT, OR, Connect, SAS Miner, statistical techniques, SQL Server, Matlab, R, STATA.
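A small comparison of the parametric VaR and the Monte Carlo replacement described above, under illustrative normal-return assumptions (the original work was in SAS; this Python sketch shows only the technique):

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical daily P&L: normally distributed returns on a 1M position.
position = 1_000_000.0
mu, sigma = 0.0, 0.01   # daily return mean and volatility (illustrative)

# Parametric (variance-covariance) 99% VaR under normality.
z_99 = 2.3263           # 99% standard-normal quantile
var_parametric = position * (z_99 * sigma - mu)

# Monte Carlo VaR: simulate returns and take the 1% worst loss.
n_sims = 200_000
pnl = position * rng.normal(mu, sigma, n_sims)
var_mc = -np.quantile(pnl, 0.01)

# Under normal returns the two methods agree closely; the Monte Carlo
# version earns its keep once returns are non-normal or path-dependent.
print(abs(var_mc - var_parametric) / var_parametric < 0.02)
```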