
Sr. Data Scientist (Consultant/Independent Contractor) Resume


Atlanta, GA

SUMMARY:

  • Senior/Lead Data Scientist with strong technical expertise, business and leadership experience, and the communication skills to drive high-impact business outcomes through data-driven innovations and decisions.
  • Strong knowledge of statistical methods (regression, time series, hypothesis testing, randomized experiments), machine learning techniques, algorithms, data structures and data infrastructure.
  • Extensive hands-on experience and high proficiency with structured, semi-structured and unstructured data, using a broad range of data science programming languages and big data tools including R, Python, Spark, PySpark, SQL, MongoDB, scikit-learn, R Shiny and ShinyDashboard, Hadoop MapReduce, MRJob, AWS, REST APIs, and Unix.
  • Solid team player, team builder, and an excellent communicator.

PROFESSIONAL EXPERIENCE:

Sr. Data Scientist (Consultant/Independent Contractor)

Confidential

Responsibilities:

  • Used a variety of robust statistical techniques to assess the significance of gene expression differences between subjects with and without cancer tumors, and between deceased and non-deceased patients.
  • Applied an association rule mining (frequent itemsets) algorithm to the gene variant data set to find the most commonly co-occurring gene variants in the subject population, both for variants within a gene and across a pool of genes.
  • Developed a Shiny application that integrated these analytical insights with the provider’s patient medical records data into a patient dashboard, giving care providers instant access to information.
  • Designed and architected data science server infrastructure for multiple data science and analytics personnel to leverage and share.

Sr. Data Scientist

Confidential

Responsibilities:

  • Started as a contractor to help develop data processing, mining and machine learning pipelines in Python, PySpark and Amazon EMR.
  • Implemented a clustering algorithm and ran it successfully over hundreds of millions of mobile geolocation data points across hundreds of thousands of individuals, using a combination of PySpark parallelism and scikit-learn machine learning libraries.
  • Implemented extensive data transformations and summarizations across large-scale streaming data using the new PySpark DataFrame API on Amazon EMR.

Data Scientist

Confidential, Atlanta, GA

Responsibilities:

  • Transformed the company’s rudimentary, ad hoc approach to data insights and reporting into a streamlined, automated, algorithmic, statistically robust and highly visual strategic function.
  • Worked extensively with many data sources, ranging in variety from structured SQL tables to unformatted text files, and in size from a few rows to tens of millions of rows and tens of gigabytes.
  • Drove unique insights into patient payment and online engagement behavior using creative data engineering, machine learning and experimentation techniques.

Sample Technical Projects:

  • Implemented a first-generation regression-based predictive model that identifies in advance the consumers likely to fail to pay their bills. The model is designed to allow for a ~30% reduction in processed bills with ~97% accuracy (3% false positive rate).
  • Developed a HIPAA-compliant record linking solution to enable billing and an online patient wallet experience for patients who have multiple encounters at the same provider. Implemented a creative and highly effective approach to linking consumer records across providers, enabling unique patient- and guarantor-level data analytics, inference and insights.
  • Deployed a suite of self-service web application portals based on R Shiny and ShinyDashboard for accessing, visualizing and consuming data analysis outputs on demand and in real time.
  • Designed and implemented a distributed shortest-path algorithm using Hadoop MapReduce on AWS (EMR and S3) to find reachability distances in a large Wikipedia web graph.
  • Developed a click-through prediction application in Apache Spark using a regression-based machine-learning algorithm.
  • Built a document classification system using text mining and machine learning (multiple algorithms) in Python with scikit-learn to classify Web documents into topics.
  • Ran a real-life randomized experiment to determine whether competition and perceived competitor performance improve K-12 student performance in math.
  • Created YelpSquare by leveraging public API data from Yelp and Foursquare to make restaurant recommendations for a variety of social dining experiences (Python, MongoDB and AWS).

Head of Big Data Cloud Services

Confidential, Atlanta, GA

Responsibilities:

  • Turned the company’s most strategic business unit into a nimble execution machine. Held responsibility for cloud products and P&L. Helped customers in the healthcare, gaming, ad tech, and mobile industries cost-effectively deploy and operate big data applications on a global scale.

Head of Product, Engineering, Data Center Operations

Confidential, Redwood City, CA

Responsibilities:

  • Led a product and technology transformation from a narrow software focus to a dynamic portfolio of high-value software, cloud and big data offerings, resulting in 6x sales growth and profitability.
