Big Data / ETL Developer Resume
TECHNICAL SKILLS:
Big data / data lake components: Hive (HiveQL), Pig Latin, Apache Spark (Scala API), Sqoop, Flume, HDFS, HCatalog, Kafka, Oozie.
Scripting: data science, text mining, and data wrangling with Python libraries: Matplotlib, SciPy, NumPy, Pandas, Seaborn, scikit-learn, NLTK, TensorFlow, ggplot.
Analytical tools: SAS Enterprise Miner, linear programming using Excel Solver, Excel Data Analysis ToolPak.
Statistical tools/packages: R, SPSS, OpenStat, and Stata.
Machine learning/deep learning: neural networks, gradient-boosted decision trees, random forests, naïve Bayes, Bayesian networks, k-nearest neighbors (KNN), hyperparameter tuning, regularization and optimization, supervised/unsupervised learning, model and multiclass evaluation, NLTK (natural language processing), Spark MLlib.
Marketing Analytics: Google Analytics, Search engine optimization (SEO), Google AdWords, Campaign management
Data visualization: Tableau (Design, visual analytics, dashboards)
Database skills: SQL scripts, MySQL, Teradata BTEQ scripts.
Data modeling: ERD data models.
NoSQL skills: MongoDB.
Cloud Platform: Google Cloud Platform (GCP)
Operating systems: Windows, Linux, macOS.
Financial tools: TORA trading platform.
ETL tools: Informatica PowerCenter 10.2.
Version control/development environments: GitHub, IntelliJ IDEA, CRAN, Jupyter Notebook, FileZilla, PuTTY, Infoworks, Maestro, Serena Dimensions.
Domain expertise: Data science, Finance, Retail banking, Mutual funds, Human resource, Digital analytics, marketing analytics.
Machine learning algorithms: linear regression, logistic regression, neural networks, decision trees, random forests, gradient boosting, Bayesian networks, naïve Bayes, principal component analysis, model selection.
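As a minimal sketch of the data-wrangling work listed above (shown here with only the Python standard library rather than Pandas; the field names and values are hypothetical):

```python
import csv
import io
from statistics import mean

# Hypothetical raw extract with missing and badly-typed values.
raw = """id,amount
1,100.5
2,
3,abc
4,250.0
"""

def clean_amounts(text):
    """Parse a CSV extract, dropping rows whose amount is missing or non-numeric."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        try:
            rows.append({"id": int(row["id"]), "amount": float(row["amount"])})
        except (ValueError, TypeError):
            continue  # skip unusable records
    return rows

cleaned = clean_amounts(raw)          # rows 2 and 3 are dropped
avg = mean(r["amount"] for r in cleaned)   # 175.25
```

In practice the same cleaning step would be a one-liner with `pandas.read_csv` and `dropna`; the logic is the same.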
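The SQL-script skills above can be illustrated with an in-memory SQLite database (the table and column names are illustrative only, not from any actual project):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A small DDL + DML script of the kind typically run as a SQL/BTEQ script.
conn.executescript("""
CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL);
INSERT INTO accounts (id, balance) VALUES (1, 120.0), (2, 80.0), (3, 300.0);
""")

# Aggregate query over the loaded table.
total, = conn.execute("SELECT SUM(balance) FROM accounts").fetchone()  # 500.0
conn.close()
```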
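As a sketch of one of the listed algorithms, here is a minimal k-nearest-neighbors classifier in plain Python (in real work this would be scikit-learn's `KNeighborsClassifier`; the toy data below is hypothetical):

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (point, label) pairs; points are equal-length tuples.
    """
    dists = sorted((math.dist(point, query), label) for point, label in train)
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

train = [((0, 0), "a"), ((0, 1), "a"),
         ((5, 5), "b"), ((6, 5), "b"), ((5, 6), "b")]
pred = knn_predict(train, (5.5, 5.5))   # three nearest neighbors are all "b"
```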
BUSINESS EXPERIENCE:
Confidential
Big data / ETL developer
Confidential
Assistant Manager - Operations and Risk
Confidential
Financial Data Analyst Intern
Confidential Financial Analyst Intern
Responsibilities:
- Hortonworks professional services partner.