Data Analytics Engineer Resume
PROFESSIONAL EXPERIENCE
Confidential
Data Analytics Engineer
Responsibilities:
- Initiated the use of Liquibase in the organization, which helped solve numerous database development and deployment problems.
- Developed Liquibase scripts to maintain database schema changes.
- Used Python to develop a variety of models and algorithms for analytical purposes.
- Developed a health-check monitoring tool for all scheduled jobs using Python (a minimal sketch follows this section).
- Converted Perl scripts to Python.
- Used joins and subqueries to simplify complex queries involving multiple tables.
- Developed Python scripts to process logs and consolidate jobs run on different servers.
- Automated, configured, and deployed instances in AWS environments; familiar with EC2, EBS volumes, CloudFormation, and managing security groups on AWS (see the EC2 automation sketch after this section).
- Responsible for maintaining Linux- and Windows-based EC2 instances across various environments.
- Experience in Linux/Windows environments (CentOS, Windows Server 2012 R2).
- Assisted in designing, automating, implementing, and sustaining Amazon Machine Images (AMIs) across the AWS cloud environment.
- Working knowledge of AWS products and services such as EC2, EBS volumes, and security groups.
- Hands-on experience installing and administering CI/CD tools such as Jenkins; installed and configured Jenkins to automate deployments.
- Set up Fortify code scans for all internal applications.
- Well versed in automation scripting using Perl, Python, and Bash.
- Wrote shell scripts to automate routine tasks and manage AWS resources.
- Responsible for implementing Continuous Integration (CI) and Continuous Delivery (CD) processes using Jenkins, along with shell scripts to automate routine jobs.
- Developed product marketing models (logistic regression, linear regression, decision trees, and others).
- Used Python modules such as pandas, NumPy, and datetime to perform extensive data analysis.
- Used Python to implement machine learning algorithms including linear regression, logistic regression, KNN, random forest, decision trees, and SVM (see the modeling sketch after this section).
Environment: AWS EC2, EBS, Oracle, Perl, Python, CloudFormation, XML, YAML, Jenkins, Ansible, Git, CentOS, Windows Server 2012 R2, Liquibase, AWS Security Groups
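As an illustration of the health-check monitoring mentioned above, here is a minimal Python sketch; the job names, log paths, staleness threshold, and alerting approach are hypothetical, since the resume does not describe the tool's internals.

```python
"""Minimal health-check sketch: flag scheduled jobs whose logs look stale.

All paths and thresholds are placeholders; the real tool's job list,
log layout, and alerting mechanism are not described in the resume.
"""
import os
from datetime import datetime, timedelta

# Hypothetical mapping of job name -> log file the job touches on each run.
JOB_LOGS = {
    "nightly_extract": "/var/log/jobs/nightly_extract.log",
    "hourly_load": "/var/log/jobs/hourly_load.log",
}

MAX_AGE = timedelta(hours=25)  # assumed SLA: a daily job should have run within ~25h


def check_jobs():
    """Return (job, reason) tuples for jobs that look unhealthy."""
    problems = []
    now = datetime.now()
    for job, log_path in JOB_LOGS.items():
        if not os.path.exists(log_path):
            problems.append((job, "log file missing"))
            continue
        last_modified = datetime.fromtimestamp(os.path.getmtime(log_path))
        if now - last_modified > MAX_AGE:
            problems.append((job, f"no activity since {last_modified:%Y-%m-%d %H:%M}"))
    return problems


if __name__ == "__main__":
    for job, reason in check_jobs():
        # In the real tool this might send an email or page; here we just print.
        print(f"ALERT: {job}: {reason}")
```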
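The EC2 automation bullets above could look roughly like the sketch below. Using Boto3 here is an assumption (it is only listed explicitly under a later role), and the region, AMI ID, instance type, security group, and tags are placeholders.

```python
"""Sketch of Python-based EC2 automation, assuming Boto3 and configured AWS credentials."""
import boto3

ec2 = boto3.resource("ec2", region_name="us-east-1")  # placeholder region


def launch_instance():
    """Launch a single tagged EC2 instance inside an existing security group."""
    instances = ec2.create_instances(
        ImageId="ami-0123456789abcdef0",                 # placeholder AMI
        InstanceType="t3.micro",                         # placeholder instance type
        MinCount=1,
        MaxCount=1,
        SecurityGroupIds=["sg-0123456789abcdef0"],       # placeholder security group
        TagSpecifications=[{
            "ResourceType": "instance",
            "Tags": [{"Key": "Environment", "Value": "dev"}],
        }],
    )
    return instances[0]


def stop_tagged_instances(env="dev"):
    """Stop all running instances carrying a given Environment tag."""
    filters = [
        {"Name": "tag:Environment", "Values": [env]},
        {"Name": "instance-state-name", "Values": ["running"]},
    ]
    for instance in ec2.instances.filter(Filters=filters):
        instance.stop()
```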
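A minimal sketch of the modeling work described above, assuming scikit-learn (the resume names only the algorithms); the dataset path, target column, and feature handling are placeholders, and numeric features are assumed.

```python
"""Sketch of fitting the listed classification models with pandas + scikit-learn."""
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

df = pd.read_csv("marketing_data.csv")      # placeholder dataset
X = df.drop(columns=["responded"])          # placeholder target column; features assumed numeric
y = df["responded"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "knn": KNeighborsClassifier(),
    "decision_tree": DecisionTreeClassifier(),
    "random_forest": RandomForestClassifier(n_estimators=200),
    "svm": SVC(),
}

# Fit each model and report hold-out accuracy.
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, accuracy_score(y_test, model.predict(X_test)))
```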
Confidential
Data Engineer
Responsibilities:
- Identified patterns of behavior in customer migration to products and services.
- Performed data cleansing, imputation, and preparation using pandas and NumPy (see the sketch after this list).
- Built end-to-end data pipelines for data transfers using Python and AWS, including Boto3.
- Prototyped pipelines using Databricks notebooks, Snowflake, and PySpark.
- Coordinated with vendor data teams to push and validate marketing data into a Snowflake data warehouse.
- Created ETL pipelines using Python, Amazon S3, EC2, EMR, and a Snowflake database.
- Spun up EMR clusters and EC2 instances using AWS CloudFormation templates and through the console for lines of business.
- Deployed code to GitHub.
- Developed PySpark jobs for data transformations as part of the data extraction process (see the PySpark sketch after this list).
- Developed data pipelines for inbound and outbound data from vendor buckets using Python and the AWS SDK (Boto3); a transfer sketch follows this list.
- Worked with job schedulers to schedule jobs.
- Managed schema objects such as tables, views, indexes, and referential integrity constraints based on user requirements, and converted requirements into technical specifications.
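A minimal sketch of the cleansing and imputation work described above; the file names, key column, and imputation rules are placeholders, as the actual rules depended on the project's data.

```python
"""Sketch of pandas/NumPy data cleansing, imputation, and preparation."""
import numpy as np
import pandas as pd

df = pd.read_csv("customer_migration.csv")   # placeholder input

# Normalize obvious placeholder values to NaN before imputing.
df = df.replace({"N/A": np.nan, "": np.nan})

# Drop exact duplicate rows and rows missing the key identifier.
df = df.drop_duplicates().dropna(subset=["customer_id"])   # placeholder key column

# Impute numeric columns with the median, categorical columns with the mode.
for col in df.select_dtypes(include=[np.number]).columns:
    df[col] = df[col].fillna(df[col].median())
for col in df.select_dtypes(include=["object"]).columns:
    mode = df[col].mode()
    if not mode.empty:
        df[col] = df[col].fillna(mode.iloc[0])

df.to_parquet("customer_migration_clean.parquet")   # placeholder output
```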
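One inbound-transfer step of the Boto3 pipelines described above might look like the following sketch; the bucket names and prefix are placeholders.

```python
"""Sketch of an inbound S3 transfer step using Boto3: copy vendor objects
into an internal landing bucket. Assumes configured AWS credentials."""
import boto3

s3 = boto3.client("s3")

VENDOR_BUCKET = "vendor-inbound-bucket"       # placeholder
INTERNAL_BUCKET = "internal-landing-bucket"   # placeholder


def copy_new_objects(prefix="marketing/daily/"):
    """Copy every object under a vendor prefix into the internal landing bucket."""
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=VENDOR_BUCKET, Prefix=prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            s3.copy_object(
                Bucket=INTERNAL_BUCKET,
                Key=key,
                CopySource={"Bucket": VENDOR_BUCKET, "Key": key},
            )
            print(f"copied s3://{VENDOR_BUCKET}/{key} -> s3://{INTERNAL_BUCKET}/{key}")


if __name__ == "__main__":
    copy_new_objects()
```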
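A rough PySpark sketch of the kind of transformation job mentioned above; the S3 paths, column names, and aggregation logic are placeholders.

```python
"""Sketch of a PySpark transformation job: read raw vendor data from S3,
aggregate it, and write curated Parquet output back to S3."""
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("vendor-transform").getOrCreate()

# Read raw vendor data landed in S3 (placeholder path and schema).
raw = spark.read.option("header", True).csv("s3://internal-landing-bucket/marketing/daily/")

transformed = (
    raw
    .withColumn("event_date", F.to_date("event_ts"))   # placeholder columns
    .filter(F.col("country") == "US")
    .groupBy("event_date", "campaign_id")
    .agg(F.count("*").alias("events"),
         F.countDistinct("customer_id").alias("customers"))
)

# Write curated output back to S3 as Parquet, partitioned by date.
transformed.write.mode("overwrite").partitionBy("event_date") \
    .parquet("s3://internal-curated-bucket/marketing/daily_agg/")
```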
Confidential
Data Engineer
Responsibilities:
- Migrated datasets from Teradata to Snowflake on AWS through S3 as intermediate storage (see the load sketch after this section).
- Worked with the Symphony data pipeline.
- Strong working experience with Unix shell scripting, including reusable scripts.
- Used shell scripting to read environment-specific variables.
- Built data pipelines based on Spark, AWS EMR, AWS S3, and AWS RDS.
- Developed ETL logic using the Spark DataFrame and Dataset APIs in Python.
- Designed and implemented risk assessments as microservices with RESTful APIs documented through Swagger, using JSON.
- Imported data from various sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from Oracle into HDFS using Sqoop.
- Performed data quality checks on datasets using Spark (see the sketch after this section).
- Created data quality queries to ensure key data is accounted for.
- Responsible for delivering datasets from Snowflake to One Lake Data Warehouse.
Environment: AWS EMR, AWS EC2, AWS S3, AWS CLI, AWS Boto3, Snowflake, UNIX scripting, Bash, Control-M, Java, SQL, Python, Spark, Jira, GitHub.
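The final load step of the Teradata-to-Snowflake migration above might look like this sketch, assuming snowflake-connector-python; the connection details, external stage, file format, and target table are placeholders.

```python
"""Sketch of loading S3-staged files into Snowflake with COPY INTO."""
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",        # placeholder connection details
    user="etl_user",
    password="***",
    warehouse="LOAD_WH",
    database="ANALYTICS",
    schema="STAGING",
)

COPY_SQL = """
    COPY INTO STAGING.CUSTOMER            -- placeholder target table
    FROM @MIGRATION_STAGE/customer/       -- placeholder external stage over S3
    FILE_FORMAT = (TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY = '"' SKIP_HEADER = 1)
    ON_ERROR = 'ABORT_STATEMENT'
"""

try:
    cur = conn.cursor()
    cur.execute(COPY_SQL)
    print(cur.fetchall())   # COPY INTO returns one result row per loaded file
finally:
    conn.close()
```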
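A minimal sketch of Spark-based data quality checks of the kind described above; the dataset path, column names, and business rules are placeholders.

```python
"""Sketch of data quality checks on a curated dataset using PySpark."""
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq-checks").getOrCreate()

df = spark.read.parquet("s3://curated-bucket/customer/")   # placeholder dataset

checks = {
    # Key column must never be null.
    "null_customer_id": df.filter(F.col("customer_id").isNull()).count(),
    # Key column must be unique.
    "duplicate_customer_id": df.count() - df.select("customer_id").distinct().count(),
    # Amounts should not be negative (placeholder business rule).
    "negative_amount": df.filter(F.col("amount") < 0).count(),
}

failed = {name: n for name, n in checks.items() if n > 0}
if failed:
    raise ValueError(f"Data quality checks failed: {failed}")
print("All data quality checks passed")
```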