Data Analytics Engineer Resume

PROFESSIONAL EXPERIENCE

Confidential

Data Analytics Engineer

Responsibilities:

  • Initiated the use of Liquibase in the organization, which solved numerous database development and deployment problems.
  • Developed Liquibase scripts to maintain database schema changes.
  • Used Python to develop a variety of models and algorithms for analytical purposes.
  • Developed a health-check monitoring tool for all scheduled jobs using Python.
  • Converted Perl Scripts to Python.
  • Worked on joins and subqueries to simplify complex queries involving multiple tables.
  • Developed Python scripts to process logs and consolidate jobs run on different servers.
  • Experienced in automating, configuring and deploying instances in AWS environments; familiar with EC2, EBS volumes, CloudFormation and managing security groups on AWS (see the Boto3 sketch at the end of this section).
  • Responsible for maintaining Linux- and Windows-based EC2 instances across various environments.
  • Experience in Linux/Windows environments (CentOS, Windows Server 2012 R2).
  • Assisted in designing, automating, implementing and sustaining Amazon Machine Images (AMIs) across the AWS cloud environment.
  • Working knowledge of AWS products and services such as EC2, EBS volumes and security groups.
  • Hands-on experience installing and administering CI/CD tools such as Jenkins; installed and configured Jenkins to automate deployments and provide an automation solution.
  • Set up Fortify code scans for all internal applications.
  • Well versed in automation scripting using Perl, Python and Bash.
  • Knowledgeable in writing shell automation scripts to manage AWS resources.
  • Responsible for implementing Continuous Integration (CI) and Continuous Delivery (CD) processes using Jenkins along with shell scripts to automate routine jobs.
  • Developed product marketing models (logistic regression, linear regression, decision trees and others).
  • Used Python modules such as pandas, NumPy and datetime to perform extensive data analysis.
  • Used Python to implement machine learning algorithms including linear regression, logistic regression, KNN, random forest, decision trees and SVM (see the scikit-learn sketch after this list).
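
The sketch below illustrates the kind of modeling listed above, fitting a logistic regression and a random forest with scikit-learn on a synthetic dataset; the data, features and hyperparameters are assumptions for illustration, not the production models.

```python
# Illustrative sketch: fit and compare a logistic regression and a random
# forest on a synthetic binary-classification dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; real features and labels are not reproduced here
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    preds = model.predict(X_test)
    print(f"{name}: accuracy={accuracy_score(y_test, preds):.3f}")
```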

Environment: AWS EC2, EBS, Oracle, Perl, Python, CloudFormation, XML, YAML, Jenkins, Ansible, Git, CentOS, Windows Server 2012 R2, Liquibase, AWS Security Groups
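
A minimal Boto3 sketch of the EC2/EBS automation described in this section; the region and tag names are assumed for illustration and are not the exact scripts used on the engagement.

```python
# Hypothetical sketch: list running EC2 instances and their attached EBS
# volumes with Boto3. Region and tag keys are illustrative assumptions.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # assumed region

# Describe only running instances
response = ec2.describe_instances(
    Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
)

for reservation in response["Reservations"]:
    for instance in reservation["Instances"]:
        # Pull the Name tag if present (tag key is an assumption)
        name = next(
            (t["Value"] for t in instance.get("Tags", []) if t["Key"] == "Name"),
            "unnamed",
        )
        # Collect attached EBS volume IDs
        volumes = [
            bdm["Ebs"]["VolumeId"]
            for bdm in instance.get("BlockDeviceMappings", [])
            if "Ebs" in bdm
        ]
        print(instance["InstanceId"], name, instance["InstanceType"], volumes)
```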

Confidential

Data Engineer

Responsibilities:

  • Identified patterns of behavior in customer migration to products and services.
  • Performed data cleansing, data imputation and data preparation using pandas and NumPy.
  • Experience in building end-to-end data pipelines for data transfers using Python and AWS, including Boto3.
  • Prototyped pipelines using Databricks notebooks, Snowflake and PySpark.
  • Coordinated with vendor data teams to push and validate marketing data into a Snowflake data warehouse.
  • Created ETL pipelines using Python, Amazon S3, EC2, EMR and the Snowflake database.
  • Experience in spinning up EMR and EC2 using AWS CloudFormation templates and through the console for lines of business.
  • Deployed code to GitHub.
  • Developed PySpark jobs for data transformations as part of the data extraction process.
  • Developed data pipelines for inbound and outbound data from vendor buckets using Python and the AWS SDK Boto3 (see the sketch after this list).
  • Worked with job schedulers to schedule jobs.
  • Managed schema objects such as tables, views, indexes and referential integrity constraints based on user requirements, converting them into technical specifications.
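
A minimal sketch of an inbound S3 transfer step like the Boto3 pipelines described above; bucket names and prefixes are hypothetical placeholders.

```python
# Hedged sketch: copy vendor-delivered objects from an inbound bucket into an
# internal landing bucket. All bucket names and prefixes are assumptions.
import boto3

s3 = boto3.client("s3")

VENDOR_BUCKET = "vendor-inbound-bucket"     # assumed name
LANDING_BUCKET = "internal-landing-bucket"  # assumed name
PREFIX = "marketing/2023-01-01/"            # assumed prefix

# Page through every object under the vendor prefix and copy it across
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=VENDOR_BUCKET, Prefix=PREFIX):
    for obj in page.get("Contents", []):
        key = obj["Key"]
        s3.copy_object(
            Bucket=LANDING_BUCKET,
            Key=key,
            CopySource={"Bucket": VENDOR_BUCKET, "Key": key},
        )
        print(f"copied s3://{VENDOR_BUCKET}/{key} -> s3://{LANDING_BUCKET}/{key}")
```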

Confidential

Data Engineer

Responsibilities:

  • Migrated datasets from Teradata to Snowflake on AWS using S3 as intermediate storage (see the Snowflake sketch at the end of this section).
  • Worked with the Symphony data pipeline.
  • Good working experience with Unix shell scripting and reusable scripts.
  • Used shell scripting to read variables based on the environment.
  • Built data pipelines based on Spark, AWS EMR, AWS S3 and AWS RDS.
  • Developed ETL logic using the Spark DataFrame and Dataset APIs in Python.
  • Designed and implemented risk assessments as microservices with RESTful APIs documented through Swagger, using JSON.
  • Handled importing data from various sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from Oracle into HDFS using Sqoop.
  • Performed data quality checks on datasets using Spark (see the PySpark sketch after this list).
  • Responsible for creating data quality queries to ensure key data is accounted for.
  • Responsible for delivering datasets from Snowflake to the One Lake data warehouse.
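
An illustrative PySpark sketch of the data quality checks mentioned above, counting nulls per column and flagging duplicate keys; the S3 path and the order_id key column are assumptions for the example.

```python
# Illustrative data quality checks with PySpark: null counts per column and
# duplicate-key detection. Path and key column are placeholder assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("data-quality-checks").getOrCreate()

df = spark.read.parquet("s3://example-bucket/curated/orders/")  # assumed path

# Null count per column
null_counts = df.select(
    [F.count(F.when(F.col(c).isNull(), c)).alias(c) for c in df.columns]
)
null_counts.show()

# Duplicate primary keys (assumes an 'order_id' key column)
duplicates = df.groupBy("order_id").count().filter(F.col("count") > 1)
print(f"duplicate keys: {duplicates.count()}")

spark.stop()
```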

Environment: AWS EMR, AWS EC2, AWS S3, AWS CLI, AWS Boto3, Snowflake, UNIX scripting, Bash, Control-M, Java, SQL, Python, Spark, Jira, GitHub.
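
A hedged sketch of loading staged S3 data into Snowflake with the Python connector, as in the Teradata-to-Snowflake migration above; the account settings, stage, table and file format are placeholders, not the actual migration code.

```python
# Hedged sketch: load files staged in S3 into a Snowflake table with the
# snowflake-connector-python package. All names below are placeholders.
import os

import snowflake.connector

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    warehouse="LOAD_WH",    # assumed warehouse
    database="ANALYTICS",   # assumed database
    schema="STAGING",       # assumed schema
)

try:
    cur = conn.cursor()
    # COPY files that landed in an assumed S3 external stage into a staging table
    cur.execute(
        """
        COPY INTO STAGING.ORDERS
        FROM @S3_MIGRATION_STAGE/orders/
        FILE_FORMAT = (TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY = '"' SKIP_HEADER = 1)
        """
    )
    print(cur.fetchall())  # per-file load results returned by COPY INTO
finally:
    conn.close()
```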
