Lead Data Engineer Resume
Schaumburg, IL
SUMMARY:
- 9.5 years of IT industry experience in Analysis, Design, Development and implementation of various Web and Big Data applications.
- 4 years of Big Data experience extracting, transforming and curating data at terabyte scale.
- Experience in design, development and implementation of Confidential solutions.
- Experienced with data architecture including data ingestion pipeline design.
- Experienced working with Jira for project management, Git for source code management, and Jenkins for continuous integration.
- Experience with Hadoop ecosystem (Apache Pig, Hive, Sqoop, Spark, HBase, Phoenix, MapReduce).
- Excellent knowledge of building and scheduling Confidential workflows with shell scripts and Autosys.
- Designed a custom Hadoop job monitoring system for real-time job status with a BO dashboard.
- Experienced in importing and exporting data between HDFS and Relational Database Systems (RDBMS) using Sqoop.
- Experience with the High-Performance Computing Cluster (HPCC) platform and Enterprise Control Language (ECL).
- Experience transforming data into JSON and Avro formats.
- Knowledge of Microsoft Azure cloud infrastructure.
- Proficient in working with various technologies like Java/J2EE, JSF, JavaBeans, Spring MVC and REST.
- Expertise in working with NoSQL databases like MongoDB and Neo4j.
- Experienced with different version management software: SVN, StarTeam, SourceTree and GitHub.
- In depth understanding of HDFS architecture and MapReduce framework.
- Experience in Object Relational Mapping Frameworks such as Hibernate.
- Experienced in web-based GUI technologies: JSF, XHTML, HTML, JavaScript, AngularJS, and CSS.
- Relational Database experience in Oracle, MS SQL Server, DB2.
- Experience in Waterfall and Agile/Scrum methodologies, including Feature-Driven Development.
- Developed Confidential visualization using D3.js.
TECHNICAL SKILLS:
Big Data: Hadoop, Hive, Pig, Sqoop, HPCC, ECL, Spark, HBase, Phoenix, NiFi, Ambari
NoSQL: MongoDB
Java Technologies: Java/JEE, Spring Web Flow, Spring REST, JSF 2.0, Struts, JUnit
Scripting Languages: JavaScript, Unix Shell Scripting
Databases: Oracle, SQL Server, DB2
Servers: IBM WebSphere 8.5, Apache Tomcat, JBoss
Frameworks: Spring MVC, JSF, Struts 2
Version Control: StarTeam, Subversion, SourceTree, Git
Tools: Visio, SQL Developer, MongoVUE, PuTTY, Jira, Maven, Autosys, Jenkins
Professional Experience:
Lead Data Engineer
Confidential, Schaumburg, IL
Responsibilities:
- Analyzed source tables to identify the best split key and optimal mapper count for Sqoop.
- Developed User Defined Functions (UDFs) to extend the functionality of Hive and Pig (a sample Hive UDF is sketched after this list).
- Developed programs to convert embedded XML into JSON for Hive table creation.
- Created external and managed Hive tables and views from JSON and Avro files.
- Designed and developed the architecture for data curation.
- Implemented tools to automate cluster-to-cluster copies.
- Implemented data curation logic to flatten data from various sources.
- Identified optimal partitioning and clustering to optimize data flow through the cluster.
- Implemented Hive queries for data analysis to meet the requirements.
- Created Pig Latin scripts to merge incremental data.
- Used HCatLoader in Pig to load data from Hive tables.
- Integrated Hadoop jobs with the workflow management system Autosys.
- Implemented storage of data into HBase tables using Apache Phoenix.
- Designed a tool to identify changes in the Hive metastore and notify users.
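A minimal sketch of the kind of Hive UDF described above, using the classic org.apache.hadoop.hive.ql.exec.UDF API; the class name and cleansing logic are hypothetical, not the production code:

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Illustrative Hive UDF: trims and upper-cases a string column.
    // Registered in Hive with:
    //   ADD JAR cleanse-udf.jar;
    //   CREATE TEMPORARY FUNCTION cleanse AS 'com.example.CleanseUDF';
    public class CleanseUDF extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;  // pass NULLs through unchanged
            }
            return new Text(input.toString().trim().toUpperCase());
        }
    }

Once registered, the function is callable from HiveQL like any built-in, e.g. SELECT cleanse(customer_name) FROM orders.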
Data Engineer
Confidential, Schaumburg, IL
Responsibilities:
- Analyzed data sources to identify the best data ingestion strategy.
- Designed and implemented incremental ingestion strategies for various sources.
- Expertise in writing Pig Latin and Hive scripts and extending their functionality using User Defined Functions (UDFs).
- Implemented Sqoop to bring data from various source databases like Netezza, DB2, and SQL Server.
- Implemented incremental data load strategy from full load to change data capture approach.
- Created a schema comparator program to identify schema changes during incremental loads (sketched after this list).
- Created Hive tables to load JSON files and Avro datasets.
- Identified partition columns for Hive table datasets.
- Implemented Pig programs to load data from the landing zone to Hive tables.
- Created data-cleansing UDFs in Pig to clean data and add metadata columns.
- Created scripts to load data into Hive tables via dynamic partitioning.
- Implemented a program to merge small lookup files and load them as JSON Hive tables.
- Developed a POC to ingest data from external websites (NOAA) via Apache NiFi.
- Developed a POC to visualize Hive data using Neo4j.
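A minimal sketch of the schema-comparator idea, assuming the tables are reachable over HiveServer2 JDBC; the connection URL and table names are placeholders:

    import java.sql.*;
    import java.util.LinkedHashMap;
    import java.util.Map;

    // Illustrative schema comparator: reads column name/type pairs for two
    // Hive tables via DESCRIBE over JDBC and reports columns added, dropped,
    // or retyped between incremental loads.
    public class HiveSchemaComparator {

        static Map<String, String> describe(Connection conn, String table) throws SQLException {
            Map<String, String> cols = new LinkedHashMap<>();
            try (Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery("DESCRIBE " + table)) {
                while (rs.next()) {
                    String name = rs.getString(1);
                    // Partition metadata follows a separator row; stop there.
                    if (name == null || name.trim().isEmpty() || name.startsWith("#")) break;
                    cols.put(name.trim(), rs.getString(2).trim());
                }
            }
            return cols;
        }

        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            try (Connection conn = DriverManager.getConnection("jdbc:hive2://hiveserver:10000/default")) {
                Map<String, String> prev = describe(conn, "staging.orders_prev");
                Map<String, String> curr = describe(conn, "staging.orders_curr");
                curr.forEach((col, type) -> {
                    if (!prev.containsKey(col)) System.out.println("ADDED:   " + col + " " + type);
                    else if (!prev.get(col).equals(type)) System.out.println("RETYPED: " + col);
                });
                prev.keySet().stream()
                    .filter(col -> !curr.containsKey(col))
                    .forEach(col -> System.out.println("DROPPED: " + col));
            }
        }
    }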
Project Lead
Confidential, SC
Responsibilities:
- Led the team in implementing DPA modernization use cases using new technologies.
- Gathered functional and usability requirements for DPA from business users.
- Designed and developed the data model for MongoDB.
- Demonstrated the DPA POC to business stakeholders, who later approved it for full-scale implementation.
- Used Spring Data MongoDB for communication between the Angular front end and MongoDB (see the sketch after this list).
- Designed and developed MongoDB collections.
- Used AngularJS and CSS to develop the HTML UI pages.
- Used Spring REST to communicate with the Angular front end.
- Coordinated with offshore UI design and development teams on business requirements.
- Performed code review and integration of offshore deliverables and deployment of the DPA application.
- Leveraged the existing SOAP based backend web services by creating a UI proxy, which takes REST requests from the Angular frontend and issues SOAP based calls to backend services.
- Developed a MongoDB layer for caching and session management.
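A minimal sketch of the Spring Data MongoDB and Spring REST wiring described above; the Policy document, collection name, and endpoint are illustrative, not the actual DPA model:

    import java.util.List;
    import org.springframework.data.annotation.Id;
    import org.springframework.data.mongodb.core.mapping.Document;
    import org.springframework.data.mongodb.repository.MongoRepository;
    import org.springframework.web.bind.annotation.*;

    // Illustrative document mapped to a MongoDB collection.
    @Document(collection = "policies")
    class Policy {
        @Id
        private String id;
        private String status;

        public String getId() { return id; }
        public String getStatus() { return status; }
    }

    // Spring Data derives the MongoDB query from the method name.
    interface PolicyRepository extends MongoRepository<Policy, String> {
        List<Policy> findByStatus(String status);
    }

    // REST endpoint consumed by the AngularJS front end.
    @RestController
    @RequestMapping("/api/policies")
    class PolicyController {
        private final PolicyRepository repository;

        PolicyController(PolicyRepository repository) { this.repository = repository; }

        @GetMapping
        public List<Policy> byStatus(@RequestParam String status) {
            return repository.findByStatus(status);
        }
    }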
Environment: MongoDB, Spring Framework, Spring Data, AngularJS, IBM WebSphere 8.5, StarTeam, MongoVUE, Jenkins
Developer
Confidential, Atlanta, GA
Responsibilities:
- ETL processing of flight data using HPCC ECL for Innovata (Flight Global - Reed Elsevier).
- Project Lead for Flight Data Analysis and HPCC Quality Center - Automation.
- Responsible for requirement gathering, analysis, and business user meetings.
- Experience in handling Confidential applications of up to 0.2 petabytes on a 400-node cluster.
- Designed and developed ECL jobs to process Motor Vehicle Records for BI reporting.
- Performed data loading from the landing zone to clusters using ECL Watch.
- Expert in writing Thor jobs with varied transformation logic such as filters, transforms, and joins.
- Implemented job statistics reporting tool in ECL for daily BI report analysis.
- Developed Data visualization (report) using D3JS by consuming ECL jobs published in Roxie.
- Responsible for Performance tuning of ECL jobs by analyzing job execution graph.
- Implemented best practices such as naming conventions and reusable shared library (ECL bundles).
- Processed and visualized the Sydney Airport passenger transit dataset.
- Built an ECL automation framework, run via HP Quality Center, to analyze data loaded in various domain clusters, saving the Quality Control team 50% of its manual testing effort.
Environment: High Performance Computing Clusters, ECL, ECL IDE, ECL Watch
Sr. Software Engineer
Confidential
Responsibilities:
- Lead developer for Login and Payment modules.
- Developed web applications for selling credibility reports and telephonic sales.
- Experience in Feature Driven Development (FDD) software development methodology.
- Responsible for converting functional requirements into Technical specifications.
- Database design for promotions and user security modules.
- Worked on the GUI with the JavaScript framework jQuery.
- Used the Log4j logging framework; log messages at various levels are written throughout the Java code (usage sketched after this list).
- Developed server-side code using Spring Web Flow, JSF, and Hibernate.
- Query optimization to improve application performance.
- Handled the full SDLC, including serving on the support team after production deployment.
- Managed the support team effectively, resolving production issues.
- Developed Stored Procedures in SQL Server 2008 to improve application performance.
- Mentored new associates to scale up for the project.
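A minimal sketch of the leveled Log4j usage described above, assuming Log4j 1.x; the service class and messages are hypothetical:

    import org.apache.log4j.Logger;

    // Illustrative leveled logging in a service class.
    public class PaymentService {
        private static final Logger LOG = Logger.getLogger(PaymentService.class);

        public void charge(String orderId, double amount) {
            LOG.debug("Charging order " + orderId + " for " + amount);
            try {
                // ... payment gateway call would go here ...
                LOG.info("Order " + orderId + " charged successfully");
            } catch (RuntimeException e) {
                LOG.error("Charge failed for order " + orderId, e);
                throw e;
            }
        }
    }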
Environment: Java, HTML, SQL Server, JavaScript, JSF, Hibernate, Spring Web Flow, Eclipse, Git, Jenkins, Sumo Logic, Jira, PuTTY.
Software Engineer
Confidential
Responsibilities:
- Designed and developed the Health Companion UI.
- Involved in all phases of the SDLC: requirement analysis, design, and development using Struts.
- Migrated the application from Struts 1.2 to Struts 2.
- Actively involved in UI design for application.
- Developed all JSP pages for the application.
- Created the struts.xml file so the action servlet dispatches requests to the specified action class instances (a sample action is sketched after this list).
- Used Hibernate to communicate with Oracle database.
- Developed the User interface screens using HTML, JSP and AJAX.
- Unit testing of the Health Companion Portal using JUnit.
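A minimal sketch of a Struts 2 action of the kind mapped in struts.xml above; the action class, mapping, and result names are illustrative:

    import com.opensymphony.xwork2.ActionSupport;

    // Illustrative Struts 2 action. A struts.xml mapping such as
    //   <action name="viewProfile" class="com.example.ProfileAction">
    //     <result name="success">/profile.jsp</result>
    //   </action>
    // routes the request here and renders the JSP named by the result.
    public class ProfileAction extends ActionSupport {
        private String memberId;  // populated from the request parameter

        @Override
        public String execute() {
            if (memberId == null || memberId.isEmpty()) {
                addActionError("Member id is required");
                return INPUT;   // redisplay the form with errors
            }
            return SUCCESS;     // forward to the success-result JSP
        }

        public String getMemberId() { return memberId; }
        public void setMemberId(String memberId) { this.memberId = memberId; }
    }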
Environment: Java, HTML, Oracle, Java script, JSP, Hibernate, Struts, Eclipse