
Data Scientist Resume


TECHNICAL SKILLS

  • HDFS, MapReduce, Tez, Yarn
  • Scala, Python, Java
  • MongoDB, CouchDB, HBase, Cassandra, Redis, Neo4J
  • Spark Streaming, Storm, Flink
  • PySpark, MLlib, GraphX, Spark Streaming, Databricks, RDD
  • Drill, Zeppelin, Presto, ElasticSearch, Spark SQL
  • Real-Time Data Streaming Pipeline Tool - Kafka
  • Data Scripting/Manipulation/Data Warehousing - Pig, Hive
  • Visualization Tools: Bokeh, D3.js, GraphQL
  • Distributed Machine Learning Tool - Mahout
  • Resource Monitoring Dashboard - Ambari, Hue
  • Job Workflow Scheduler - Oozie
  • Cluster Management - Mesos, Zookeeper

PROFESSIONAL EXPERIENCE

Confidential

Data Scientist

Responsibilities:

  • Member of the Bionix Delivery Team responsible for training existing AI/machine learning models and validating their results.
  • The process uses a SageMaker stack built on AWS services such as S3, AWS Glue for ETL tasks, Lambda triggers, and Redshift data warehousing, with code written in Python in Jupyter notebooks. The team used Jira for collaboration.
  • Worked on the Incident Enrichment Design project in the PDXC Intelligence Pillar hosted in the EMMA environment.
  • The AI/ML model training process starts when AWS Glue preprocesses incoming S3 data and then invokes AWS Step Functions so that SageMaker can prepare for NTM model training.
  • AWS Batch reads the processed S3 data and applies feature engineering, then instructs SageMaker to create the NTM model and its corresponding inference endpoint.
  • AWS Batch then invokes the inference endpoint to generate group name and knowledge base (KB) predictions and stores the results in another S3 bucket.
  • AWS Athena retrieves the new S3 data and adds it to its output table. Training for both the KB and group name predictions is handled by SageMaker's supervised BlazingText algorithm.
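
As an illustration of the data preparation that supervised BlazingText training typically needs, the sketch below formats labelled records into the `__label__<label> <tokens>` line format that BlazingText expects in supervised mode. It is a minimal example under stated assumptions, not the project's actual code: the function name `to_blazingtext_line`, the record fields `group_name` and `description`, and the sample incidents are all hypothetical.

```python
import re


def to_blazingtext_line(label: str, text: str) -> str:
    """Format one labelled example as a BlazingText supervised-mode line."""
    # Labels must not contain whitespace, so collapse it to underscores.
    clean_label = re.sub(r"\s+", "_", label.strip())
    # Lowercase the text and split punctuation from words so that the
    # result is a sequence of space-separated tokens.
    tokens = re.findall(r"\w+|[^\w\s]", text.lower())
    return f"__label__{clean_label} " + " ".join(tokens)


# Hypothetical incident records; the field names are illustrative only.
incidents = [
    {"group_name": "Network Ops",
     "description": "VPN tunnel down, users cannot connect."},
    {"group_name": "Database",
     "description": "Replication lag on reporting cluster"},
]

lines = [to_blazingtext_line(i["group_name"], i["description"])
         for i in incidents]
print("\n".join(lines))
```

In a pipeline like the one described above, lines of this form would be written to a training file in S3 before the SageMaker training job is started; that upload step is omitted here.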

Confidential

Web Software Developer

Responsibilities:

  • Maintain and support existing Medical Imaging Corelab technology and infrastructure as well as design, build or debug solutions to implement web-based infrastructure for the next generation platform to support operational requirements and business needs.
  • Develop web-based applications using Java, JavaScript, JSON, XML, C#, PHP, HTML5, .NET 4.7, and the .NET Core framework, along with any new programming environment, language, or codebase needed to achieve the company's project goals.
  • Implemented Angular 7's drag-and-drop functionality with Material Design components to allow for virtual scrolling and dynamic loading/unloading of application data.
  • Serve as the in-house expert Alfresco application developer for ECM- and BPM-based applications.
  • Created Spring and Hibernate applications with JSP servlets in the backend.
  • Integrated and developed a robust, task-based workflow architecture to support the operational requirements of the Medical Imaging Corelab in a web-based platform.
  • Work closely with the operations and management teams to define essential requirements and plan the phased development of the technologies the company needs.
  • Develop systems using both SQL and NoSQL databases. Implementing a GraphQL/Apollo solution.
  • Used Kubernetes and Hadoop for processing on a cloud/big data project.
