ETL Hadoop Developer Resume
Cleveland, OH
SUMMARY
- Over 8 years of experience in Analysis, Design, Development, Testing, Implementation, Maintenance, and Enhancement of various IT projects.
- 3+ years of experience in Big Data, implementing end-to-end Hadoop solutions.
- Strong fundamentals and understanding of HDFS and the complete Hadoop ecosystem.
- Strong experience and knowledge in the Hadoop ecosystem, including MapReduce, Hive, Pig, HBase, Hue, Sqoop, Oozie, and Scala.
- Experience in importing and exporting data with Sqoop between HDFS and relational database systems (RDBMS).
- Proficient in automating data workflow scheduling with Oozie and shell scripting, large-scale data ingestion with Sqoop, and ETL tasks with Pig.
- Extensive knowledge of SQL queries for backend database analysis.
- Developed Informatica ETL mappings and workflows to move daily generated data from DB2 source systems to RDBMS and HDFS, using transformations such as Source Qualifier, Expression, and Lookup.
- Experience in the Extract, Transform, Load (ETL) process, loading client input text files using SQL Server.
- Created, tested, performance-tuned, and implemented complex stored procedures using Microsoft SQL Server 2012 (an illustrative sketch follows this summary).
- Developed custom processes for clients based on their specific requirements utilizing SQL Server.
- Designed quality assurance SQL scripts for testing the ETL process.
- Expertise in Core Java, J2EE, multithreading, and JDBC; proficient in using Java APIs for application development.
- Worked on different operating systems, including UNIX/Linux, Windows XP, and Windows 2000.
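
A minimal T-SQL sketch of the kind of stored procedure described in this summary; the procedure, table, and parameter names are hypothetical, not from a specific engagement:

    -- Illustrative SQL Server 2012 procedure: daily order totals for a client.
    -- dbo.Orders, @ClientId, and the other names are placeholders.
    CREATE PROCEDURE dbo.usp_GetDailyOrderTotals
        @ClientId INT,
        @FromDate DATE,
        @ToDate   DATE
    AS
    BEGIN
        SET NOCOUNT ON;  -- suppress row-count chatter for client calls

        SELECT CAST(o.OrderDate AS DATE) AS OrderDay,
               COUNT(*)                  AS OrderCount,
               SUM(o.Amount)             AS TotalAmount
        FROM dbo.Orders AS o
        WHERE o.ClientId = @ClientId
          AND o.OrderDate >= @FromDate
          AND o.OrderDate <  DATEADD(DAY, 1, @ToDate)  -- inclusive end date
        GROUP BY CAST(o.OrderDate AS DATE)
        ORDER BY OrderDay;
    END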
TECHNICAL SKILLS
Big Data Ecosystems: HDFS, MapReduce, HBase, Hive, Pig, Sqoop, Oozie, Scala (basics)
J2EE Technologies: Servlets, JSP, JDBC
Languages: Java, SQL, C#, C, C++, HTML, JavaScript, XML
ETL Tools: Informatica PowerCenter 9.5.1
Concepts: Core DBMS, Data mining
Tools: DBVisualizer, DQ Analyzer
Operating Systems: UNIX/Linux, Windows 98, 2000, XP, Vista, Windows 7
Office Tools: MS Word, PowerPoint, Excel, SSIS (basics)
Servers: Tomcat
PROFESSIONAL EXPERIENCE
Confidential, Cleveland, OH
Big Data Analyst
Environment: Linux, Informatica, Hadoop, Sqoop, Hive, Oozie, HDFS, Pig, Hue
Responsibilities:
- Interacted with the client directly to figure out the requirements of the project.
- Used Sqoop to perform continuous data ingestion from SQL databases and other source systems.
- Implemented the data transformations using Hive and Pig.
- Used Oozie workflows to schedule shell, Hive, and Pig actions.
- Worked on error handling in shell scripting.
- Involved in loading data from RDBMS into HDFS using Sqoop queries.
- Handled delta processing and incremental updates using Hive and processed the data in Hive tables (see the merge sketch after this list).
- Worked on Hive optimization techniques to improve the performance of long running jobs.
- Responsible for managing data coming from different sources.
- Involved in creating Pig tables, loading them with data, and writing Pig Latin queries that run internally as MapReduce jobs.
- Documented system processes and procedures for future reference.
- Involved in writing UNIX/Linux shell scripts for scheduling jobs and invoking HiveQL.
- Created Hive tables for the files moved into HDFS.
- Used Hive for transformations on the tables.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke MapReduce jobs in the backend.
- Used the Oozie scheduler to automate the pipeline workflow and orchestrate the MapReduce jobs that extract data on a schedule.
- Actively participated in software development lifecycle (scope, design, implement, deploy, test), including design and code reviews, test development, test automation.
- Involved in a story-driven agile development methodology and actively participated in daily scrum meetings with a team of six people.
- Responsible for preparing the File level and Column level metadata.
- Involved in data modeling, such as preparing source-to-target mappings of fields based on the source data.
- Extensively used the Hue browser for interacting with Hadoop components.
- Responsible for moving data from sources (Teradata, Oracle) to HDFS and unit testing the moved files.
- Actively involved with infrastructure teams in addressing the platform issues.
- Worked on transformations to create new datasets from existing ones.
- Involved in testing activities within QA environment which include System testing, Integration testing and writing test cases.
- Created ETL Processes using Informatica.
- Used Informatica to design the mappings and ETL workflows that transfer data from the source systems (DB2, Oracle) to HDFS.
- Created events and tasks in the workflows using Workflow Manager.
- Designed Informatica mappings by translating the business requirements.
- Developed reusable sessions and workflows.
- Widely used Informatica client tools: Source Analyzer, Warehouse Designer, Mapping Designer, Transformation Developer, and Informatica Workflow Manager.
- Used Workflow Manager for session management, database connection management, and job scheduling.
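
A minimal HiveQL sketch of the delta/incremental processing pattern referenced in the list above; customer_base, customer_delta, and customer_reconciled are hypothetical table names:

    -- customer_base holds the current snapshot; customer_delta holds the
    -- day's Sqoop-imported changes. The newest row per id (by last_updated)
    -- wins and is written to a reconciled copy, the usual Hive merge pattern.
    INSERT OVERWRITE TABLE customer_reconciled
    SELECT id, name, email, last_updated
    FROM (
        SELECT id, name, email, last_updated,
               ROW_NUMBER() OVER (PARTITION BY id
                                  ORDER BY last_updated DESC) AS rn
        FROM (
            SELECT id, name, email, last_updated FROM customer_base
            UNION ALL
            SELECT id, name, email, last_updated FROM customer_delta
        ) merged
    ) ranked
    WHERE rn = 1;  -- keep only the latest version of each record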
Confidential, Cleveland, OH
ETL Hadoop Developer
Environment: Windows, SQL, Hadoop, Pig, Hive, Oozie
Responsibilities:
- Provided production support for the project, resolving issues arising from daily operations and monthly processing runs driven by Oozie workflows.
- Used Hive and SQL for project enhancements.
- Used Hive, Sqoop, and Oozie to design the ETL process that loads the data provided by the client.
- Designed quality assurance scripts in SQL (an illustrative shape follows this list).
- Interacted with the client directly to gather the requirements for the enhancements.
- Provided production support for daily, weekly, and monthly activities.
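
An illustrative shape for the quality assurance SQL mentioned above; stg_source, tgt_table, and business_key are placeholder names:

    -- Reconciliation checks between a source extract and the loaded target.
    SELECT 'row_count_diff' AS check_name,
           (SELECT COUNT(*) FROM stg_source) -
           (SELECT COUNT(*) FROM tgt_table) AS result;  -- expect 0

    SELECT 'null_keys' AS check_name, COUNT(*) AS result  -- expect 0
    FROM tgt_table
    WHERE business_key IS NULL;

    SELECT 'duplicate_keys' AS check_name, COUNT(*) AS result  -- expect 0
    FROM (SELECT business_key
          FROM tgt_table
          GROUP BY business_key
          HAVING COUNT(*) > 1) dup;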
Confidential, Detroit, MI
ETL Hadoop Developer
Environment: Linux, Windows, SQL, Java, Sqoop, Hive, Oozie, Informatica
Responsibilities:
- Used SQL to design the ETL process for loading the provided input data.
- Used SQL to query and analyze the data in the RDBMS.
- Used servlets to connect to the RDBMS and display reports in the front end.
- Used JSP pages to design a dashboard for the client to view reports and other data.
- Used Java to connect to the RDBMS and dynamically display the latest record information requested by the client from the UI dashboard.
- Coordinated with subject matter experts on changes and issues.
- Designed quality assurance SQL scripts for testing the ETL process.
- Generated the monthly and daily report decks using automated SQL jobs, ensuring all reports were produced and available on the dashboard.
- Involved in the creation of SQL tables, views, stored procedures, and functions.
- Loaded data into staging tables every day, from which reporting and other related queries draw.
- Upgraded the ETL process to use HDFS for data storage and Hadoop-based tools such as Sqoop, Oozie, and Hive.
- Worked on automating the ETL processes using Oozie workflows.
- Worked on performance issues, ensuring Hive queries execute faster on the Tez execution engine (see the sketch after this list).
- Used shell scripting to automate the daily processing.
- Worked on four ETL load types: full, CDC (change data capture), incremental, and upsert.
- Experienced in the use of agile approaches, including Extreme Programming, Test-Driven Development and Scrum.
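
A hedged HiveQL sketch combining the Tez setting with the CDC/upsert-style load described above; orders_current, orders_cdc, and orders_reconciled are illustrative names:

    SET hive.execution.engine=tez;  -- run this session on Tez (Hive 0.13+)

    -- orders_cdc carries change rows tagged op = 'I'/'U'/'D'; the newest row
    -- per id wins, and deletes are dropped from the reconciled result.
    INSERT OVERWRITE TABLE orders_reconciled
    SELECT id, amount, status, op_ts
    FROM (
        SELECT id, amount, status, op_ts, op,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY op_ts DESC) AS rn
        FROM (
            SELECT id, amount, status, op_ts, 'I' AS op FROM orders_current
            UNION ALL
            SELECT id, amount, status, op_ts, op FROM orders_cdc
        ) unioned
    ) ranked
    WHERE rn = 1 AND op <> 'D';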
Confidential
Java Developer
Environment: Java, Servlets, JSP, Javascript, Tomcat.
Responsibilities:
- Involved from the project's inception phase.
- Understood and analyzed the requirements.
- Prepared screenshots for all modules.
- Coordinate with Project managers, development and QA teams during the course of the project.
- Developed Controller and Connection pool.
- Designed the database for all modules.
- Incorporated Internationalization.
- Used Eclipse IDE for developing the applications
- Involved in generating fingerprint-based attendance for management, staff, and students.
- Involved in sending Bulk SMS.
- Involved in Production Support activities for quick fixes using SQL.
- Involved in code review and testing for library management and progress report.
- Prepared documentation and participated in preparing user’s manual for the application.
- Prepared deployment plans for production deployments.
- Involved in writing client side validations using Javascript.
- Used Tomcat web server for development purpose.
- Provided daily development status updates, weekly status reports, and weekly development summaries and defect reports.
- Responsible for fixing bugs, adding new features, and enhancing application performance and scalability.
Confidential
Java Developer
Environment: Core Java, JSP, Servlets, Tomcat.
Responsibilities:
- Involved in the Enhancement project of the product, which involved development of Front-end module using JSP.
- Use case development and database design for all modules.
- Used Design Patterns like MVC and Singleton.
- Involved in Inventory Management.
- Involved in code review of Expenditure Management.
- Prepared Screenshots for all the modules.
- Gathered requirements from users for enhancements or issues and relayed them to the team.
- Worked on the design and coding of those enhancements.
- Created and managed defect reports.
- Conducted knowledge-sharing sessions to keep the team current on business rules.
- Guided the team technically in resolving issues faced during coding.
- Active participation in the team meetings.
- Involved in creating MySQL stored procedures for data/business logic (a minimal sketch follows this list).
- Performed code reviews to ensure code meets coding standards.
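
A minimal MySQL sketch of the kind of stored procedure noted above; the schema (expenditures, dept_totals) and parameters are hypothetical:

    -- Records an expenditure and updates the running department total
    -- in one transaction.
    DELIMITER //
    CREATE PROCEDURE add_expenditure(IN p_dept_id INT,
                                     IN p_amount DECIMAL(10,2))
    BEGIN
        START TRANSACTION;
        INSERT INTO expenditures (dept_id, amount, created_at)
        VALUES (p_dept_id, p_amount, NOW());
        UPDATE dept_totals
           SET total_spent = total_spent + p_amount
         WHERE dept_id = p_dept_id;
        COMMIT;
    END //
    DELIMITER ;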
Confidential
Java Developer
Environment: Java, JSP, Servlets, Tomcat server 5.0, SQL Server 2000, Data mining
Responsibilities:
- Developed a parameterizable technique to recommend indexes based on index types (see the illustrative example after this list).
- Reduced the time taken to answer queries.
- Involved in code review.
- Involved in user acceptance and use case testing.
- Developed technical designs for application development.
- Developed application code for Java programs.
- Developed logical and physical models, deployed applications, and provided thorough documentation for all processes.
- Involved in unit testing and resolving test defects.
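
An illustrative example of the index recommendation work above, in SQL Server 2000 syntax; the attendance table and its columns are hypothetical:

    -- A frequent lookup: recent attendance rows for one student.
    -- Without an index on (student_id, att_date) this scans the whole table;
    -- the recommended nonclustered index turns it into a range seek.
    CREATE NONCLUSTERED INDEX ix_attendance_student_date
        ON attendance (student_id, att_date);

    SELECT att_date, status
    FROM attendance
    WHERE student_id = 1024
      AND att_date >= '2008-01-01';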