Principal Data Scientist / Manager Resume

SUMMARY

  • Self-motivated and detail-oriented Data Scientist with years of experience in modeling and real-time data analysis, working with global teams to build end-to-end solutions that extract actionable insights from IoT data.
  • 4 years of experience in Data Science and Machine Learning.
  • 10 years of experience developing networking products, including TCP/IP software for L3 routers and L2 switches supporting triple-play (voice, video, data) over Gigabit Ethernet, cloud network/infrastructure, Fiber to the Home, and FTTC.
  • 6 years of managing and leading global cross-functional teams.
  • Skilled in examining raw data, identifying its structure and trends, deriving conclusions and hypotheses, and pattern recognition.
  • Proven track record of leveraging data to drive significant business impact.
  • Presented to C-level executives and Department of Defense (DoD) executives using graphs, charts (ggplot2), and D3 visualizations.
  • Expertise in predictive analytics, statistical modeling, and machine learning with R, Python, and Stata.
  • Worked on Multivariate Regression, Logistic Regression, SVM, RF, kNN, Clustering, NaiveBayes, PCA, LDA, QDA, Lasso / Ridge, Decision Trees, NN.
  • Built real-time data pipelines for streaming data from IoT devices with Python and Scala using AWS S3, DynamoDB, Redshift, Amazon ML, EC2, and AWS VPCs.
  • Worked with SQL and NoSQL solutions and related tools, including HBase, Cassandra, Impala, Spark, Kafka, Flume, Hadoop, YARN, Hive, SQL, Pig, VoltDB, MATLAB, and Octave.
  • Worked with CentOS 6.7 and the Cloudera Virtual Machine.
  • Worked with Spark MLlib, Amazon ML, Azure ML, Python (Pandas, NumPy, SciPy, scikit-learn, NLTK), NLP, Weka, Octave, and H2O.
  • Programming in Java, Scala, R, Python, Perl, Tcl/Tk, Expect, Bash on Linux.
  • Worked across domains such as IoT, text mining, image analysis, time series, GIS, genome data, health, electrical, and financial data, in formats such as JSON, CSV, TXT, Excel, tabular, columnar, and Parquet files.
  • Worked with graph technologies such as Neo4j and GraphX.
  • Web scraping, data cleaning, data mining, and data visualization (ggplot2, rCharts, D3, seaborn, bokeh, cartoDB).
  • Built a new generation of intelligent smartphone applications and back-end services with distributed systems, VMware ESXi, vSphere, Android, NDK, Java, and JNI.
  • Worked with SDN, virtualization, and solution teams for data centers as Tech Lead.
  • Developed solutions for NTT, AT&T, Deutsche Telekom, China Telecom, and British Telecom, and reproduced production issues with the systems and solution teams.
  • Worked on JVM development for Samsung Smart TV.
  • Ability to learn and apply new concepts quickly.
  • Love the action that goes into building products and solving problems.

TECHNICAL SKILLS

  • Statistical Package: R, Python, Pandas, NumPy, SciPy, scikit-learn, Octave.
  • Technologies: Network, Routing, Switching, IPv6/IPv4, SDN, HAVi.
  • Controller: NOX, OpenDaylight, Ryu, Floodlight, Beacon.
  • Protocols: L3VPN, BGP, OSPF, ISIS, IPv6/IPv4, MPLS, LDP, RSVP, IGMP, LACP, HQoS, DHCP, DNS, NTP, AAA, TCP/IP, VPLS, L2VPN, RSTP.
  • RTOS: VxWorks, OSE 4.2, pSOS.
  • Operating Systems: Linux/Unix, Android, iOS, Solaris, Windows.
  • Cloud Technologies: AWS EC2, S3, DynamoDB, Redshift, Microsoft Azure.
  • Languages: R, Python, Scala, Java, C, C++.
  • Scripts: Python, Erlang, Tcl/Tk, Perl, Expect, JavaScript.
  • BigData Technology: Spark, Hadoop, HBase, Cassandra, Impala, Kafka, Flume, VoltDB, Neo4j, GraphX.
  • Development Tools: Eclipse, Tornado 2.0 (VxWorks), Jeode, JTRON.
  • Debugging Tools: GDB, Multi 2000, Tornado debugging tools.
  • Tools Used: Agilent N2X, Mu Dynamics, Pro Sniffer, Ethereal.
  • Traffic Generator: IXIA 1600T, Smartbits, Agilent N2X.
  • Other Tools: Git, Perforce, Mercurial, ExtraView, Code Striker, MKS, Test Director, Bugzilla, Jira, Maven, TestNG, Subversion.
  • Big Data: Spark, Hadoop, Storm, MRJob, H2O.

PROFESSIONAL EXPERIENCE

Confidential

Principal Data Scientist / Manager

Responsibilities:

  • Data Scientist at the Confidential office, responsible for developing Data Science projects. Involved in requirement analysis and development activities, and handled clients including Hawaii Electric, Oahu, Molokai, Lanai, San Diego Gas & Electric, Southern California Edison, PGE, etc. Analyzed raw IoT data from various devices and provided insights into identifying structure, deriving patterns, storing and retrieving data efficiently, and predicting the amount of electrical energy produced on any given day.
  • Worked on statistical analysis and built prediction models for various solar power data. Worked with R, Python, AWS EC2, S3, DynamoDB, Redshift, Scala, and Java. Implemented bootstrapping methods, Random Forests (classification), K-Means clustering, QDA, LDA, Lasso and Ridge regression, KNN (k-nearest neighbors), Naive Bayes, SVM (support vector machines), decision trees, and linear and logistic regression methods.
  • Worked with dimensionality reduction methods such as PCA (principal component analysis), factor analysis, etc.
  • Architected the Hawaii Electric Collaborate (HECO).
  • Designed and developed the data pipeline in Python (Pandas, NumPy, SciPy, Jupyter). Proven track record of leveraging data to drive significant business impact.
  • Presented to C-level executives using graphs, charts (ggplot2), and D3 visualizations. Demoed the PoC to clients.

Data Science Course

Confidential

Responsibilities:

  • Worked on statistical testing and learning, linear and non-linear regression, clustering and classification, support vector machines, and decision trees.
  • Worked on datasets from diverse domains such as finance, genomics, customer sales, and world health data.
  • Used dimensionality reduction methods such as PCA (principal component analysis), factor analysis, etc. Implemented bootstrapping methods, Random Forests (classification), K-Means clustering, QDA, LDA, Lasso and Ridge regression, KNN (k-nearest neighbors), Naive Bayes, SVM (support vector machines), decision trees, and linear and logistic regression methods.
  • Worked with Spark, Hadoop, Cassandra, Impala, and Kafka on the Cloudera VM and CentOS 6.7.
  • Expert in hypothesis testing, ANOVA, and linear and logistic regression analysis.
  • Experience working on text understanding, classification, pattern recognition, recommendation systems, targeting systems, and ranking systems using Python.
  • Familiar with model testing, problem analysis, model comparison, and validation.
  • Collected data from various databases and cleaned it for statistical analysis and modeling.
  • Deep understanding of statistical modeling, multivariate analysis, and standard procedures.
  • Experience with HDFS, MapReduce, Pig, and Hive.
  • Experience with Apache Spark, Spark SQL, Spark Streaming, and Spark MLlib.
  • Detail-oriented professional, ensuring a high level of quality in reports and data analysis. Experience in data modeling, data warehousing, ETL processing, business intelligence, database programming, EDW, ETL testing, and DBMS.
  • Extensive experience in working with medium to large scale data in delivering Data warehouse/BI solutions. Passionate about data.
