Data Analytics Engineer Resume

PROFESSIONAL EXPERIENCE

Confidential

Data Analytics Engineer

Responsibilities:

  • Initiated the use of Liquibase in the organization, which solved numerous database development and deployment problems.
  • Developed Liquibase scripts to maintain database schema changes.
  • Used Python to develop a variety of models and algorithms for analytical purposes.
  • Developed a health-check monitoring tool for all scheduled jobs using Python.
  • Converted Perl Scripts to Python.
  • Worked on joins and subqueries to simplify complex queries involving multiple tables.
  • Developed Python scripts to process logs and consolidate jobs run on different servers.
  • Experienced in automating, configuring and deploying instances in AWS environments; familiar with EC2, EBS volumes, CloudFormation and managing security groups on AWS (see the Boto3 sketch at the end of this section).
  • Responsible for maintaining Linux- and Windows-based EC2 instances across various environments.
  • Experience in Linux/Windows environments (CentOS, Windows Server 2012 R2).
  • Assisted in designing, automating, implementing and sustaining Amazon Machine Images (AMIs) across the AWS cloud environment.
  • Working knowledge of AWS products and services such as EC2, EBS volumes and security groups.
  • Hands-on experience installing and administering CI/CD tools such as Jenkins; installed and configured Jenkins to automate deployments and provide an automation solution.
  • Set up Fortify code scans for all internal applications.
  • Well versed in automation scripting using Perl, Python and Bash.
  • Knowledgeable in writing shell automation scripts to manage AWS resources.
  • Responsible for implementing Continuous Integration (CI) and Continuous Delivery (CD) processes using Jenkins along with shell scripts to automate routine jobs.
  • Developed product marketing models (logistic regression, linear regression, decision trees and others).
  • Used Python modules such as pandas, NumPy and datetime to perform extensive data analysis.
  • Used Python to implement machine learning algorithms including linear regression, logistic regression, KNN, random forest, decision trees and SVM (see the scikit-learn sketch after this list).
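
The sketch below illustrates the kind of modeling listed above, fitting a logistic regression and a random forest with scikit-learn on a synthetic dataset; the data, features and hyperparameters are assumptions for illustration, not the production models.

```python
# Illustrative sketch: fit and compare a logistic regression and a random
# forest on a synthetic binary-classification dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; real features and labels are not reproduced here
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    preds = model.predict(X_test)
    print(f"{name}: accuracy={accuracy_score(y_test, preds):.3f}")
```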

Environment: AWS EC2, EBS, Oracle, Perl, Python, CloudFormation, XML, YAML, Jenkins, Ansible, Git, CentOS, Windows Server 2012 R2, Liquibase, AWS Security Groups
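
A minimal Boto3 sketch of the EC2/EBS automation described in this section; the region and tag names are assumed for illustration and are not the exact scripts used on the engagement.

```python
# Hypothetical sketch: list running EC2 instances and their attached EBS
# volumes with Boto3. Region and tag keys are illustrative assumptions.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # assumed region

# Describe only running instances
response = ec2.describe_instances(
    Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
)

for reservation in response["Reservations"]:
    for instance in reservation["Instances"]:
        # Pull the Name tag if present (tag key is an assumption)
        name = next(
            (t["Value"] for t in instance.get("Tags", []) if t["Key"] == "Name"),
            "unnamed",
        )
        # Collect attached EBS volume IDs
        volumes = [
            bdm["Ebs"]["VolumeId"]
            for bdm in instance.get("BlockDeviceMappings", [])
            if "Ebs" in bdm
        ]
        print(instance["InstanceId"], name, instance["InstanceType"], volumes)
```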

Confidential

Data Engineer

Responsibilities:

  • Identified patterns of behavior in customer migration to products and services.
  • Performed data cleansing, data imputation and data preparation using pandas and NumPy.
  • Experience in building end-to-end data pipelines for data transfers using Python and AWS, including Boto3.
  • Prototyped pipelines using Databricks notebooks, Snowflake and PySpark.
  • Coordinated with vendor data teams to push and validate marketing data into a Snowflake data warehouse.
  • Created ETL pipelines using Python, Amazon S3, EC2, EMR and the Snowflake database.
  • Experience in spinning up EMR and EC2 using AWS CloudFormation templates and through the console for lines of business.
  • Deployed code to GitHub.
  • Developed PySpark jobs for data transformations as part of the data extraction process.
  • Developed data pipelines for inbound and outbound data from vendor buckets using Python and the AWS SDK Boto3 (see the sketch after this list).
  • Worked with job schedulers to schedule jobs.
  • Managed schema objects such as tables, views, indexes and referential integrity constraints based on user requirements, converting them into technical specifications.
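
A minimal sketch of an inbound S3 transfer step like the Boto3 pipelines described above; bucket names and prefixes are hypothetical placeholders.

```python
# Hedged sketch: copy vendor-delivered objects from an inbound bucket into an
# internal landing bucket. All bucket names and prefixes are assumptions.
import boto3

s3 = boto3.client("s3")

VENDOR_BUCKET = "vendor-inbound-bucket"     # assumed name
LANDING_BUCKET = "internal-landing-bucket"  # assumed name
PREFIX = "marketing/2023-01-01/"            # assumed prefix

# Page through every object under the vendor prefix and copy it across
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=VENDOR_BUCKET, Prefix=PREFIX):
    for obj in page.get("Contents", []):
        key = obj["Key"]
        s3.copy_object(
            Bucket=LANDING_BUCKET,
            Key=key,
            CopySource={"Bucket": VENDOR_BUCKET, "Key": key},
        )
        print(f"copied s3://{VENDOR_BUCKET}/{key} -> s3://{LANDING_BUCKET}/{key}")
```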

Confidential

Data Engineer

Responsibilities:

  • Migrated datasets from Teradata to Snowflake on AWS using S3 as intermediate storage (see the Snowflake sketch at the end of this section).
  • Worked with the Symphony data pipeline.
  • Good working experience with Unix shell scripting and reusable scripts.
  • Used shell scripting to read variables based on the environment.
  • Built data pipelines based on Spark, AWS EMR, AWS S3 and AWS RDS.
  • Developed ETL logic using the Spark DataFrame and Dataset APIs in Python.
  • Designed and implemented risk assessments as microservices with RESTful APIs documented through Swagger, using JSON.
  • Handled importing data from various sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from Oracle into HDFS using Sqoop.
  • Performed data quality checks on datasets using Spark (see the PySpark sketch after this list).
  • Responsible for creating data quality queries to ensure key data is accounted for.
  • Responsible for delivering datasets from Snowflake to the One Lake data warehouse.
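
An illustrative PySpark sketch of the data quality checks mentioned above, counting nulls per column and flagging duplicate keys; the S3 path and the order_id key column are assumptions for the example.

```python
# Illustrative data quality checks with PySpark: null counts per column and
# duplicate-key detection. Path and key column are placeholder assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("data-quality-checks").getOrCreate()

df = spark.read.parquet("s3://example-bucket/curated/orders/")  # assumed path

# Null count per column
null_counts = df.select(
    [F.count(F.when(F.col(c).isNull(), c)).alias(c) for c in df.columns]
)
null_counts.show()

# Duplicate primary keys (assumes an 'order_id' key column)
duplicates = df.groupBy("order_id").count().filter(F.col("count") > 1)
print(f"duplicate keys: {duplicates.count()}")

spark.stop()
```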

Environment: AWS EMR, AWS EC2, AWS S3, AWS CLI, AWS Boto3, Snowflake, UNIX scripting, Bash, Control-M, Java, SQL, Python, Spark, Jira, GitHub.
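
A hedged sketch of loading staged S3 data into Snowflake with the Python connector, as in the Teradata-to-Snowflake migration above; the account settings, stage, table and file format are placeholders, not the actual migration code.

```python
# Hedged sketch: load files staged in S3 into a Snowflake table with the
# snowflake-connector-python package. All names below are placeholders.
import os

import snowflake.connector

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    warehouse="LOAD_WH",    # assumed warehouse
    database="ANALYTICS",   # assumed database
    schema="STAGING",       # assumed schema
)

try:
    cur = conn.cursor()
    # COPY files that landed in an assumed S3 external stage into a staging table
    cur.execute(
        """
        COPY INTO STAGING.ORDERS
        FROM @S3_MIGRATION_STAGE/orders/
        FILE_FORMAT = (TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY = '"' SKIP_HEADER = 1)
        """
    )
    print(cur.fetchall())  # per-file load results returned by COPY INTO
finally:
    conn.close()
```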
