Sr Hadoop Developer Resume
Madison, WI
SUMMARY:
- Over seven years of IT experience with multinational clients, including three years of Hadoop architecture experience developing Big Data / Hadoop applications.
- Hands-on experience with the Hadoop stack (MapReduce, HDFS, Sqoop, Pig, Hive, HBase, Flume, Oozie, and Zookeeper).
- Well versed in configuring and administering Hadoop clusters using major Hadoop distributions such as Apache Hadoop and Cloudera.
- Proven expertise in performing analytics on Big Data using MapReduce, Hive, and Pig.
- Experienced in performing real-time analytics on NoSQL databases such as HBase and Cassandra.
- Worked with the Oozie workflow engine to schedule time-based jobs that perform multiple actions.
- Hands-on experience importing and exporting data between relational databases and HDFS, Hive, and HBase using Sqoop.
- Analyzed large data sets by writing Pig scripts and Hive queries.
- Experienced in writing MapReduce programs and UDFs for both Hive and Pig in Java.
- Used Flume to channel data from different sources to HDFS.
- Experience configuring Hadoop ecosystem components: Hive, HBase, Pig, Sqoop, Mahout, Zookeeper, and Flume.
- Supported MapReduce programs running on the cluster and wrote custom MapReduce jobs for data processing in Java.
- Experience testing MapReduce programs using MRUnit, JUnit, and EasyMock.
- Experienced in implementing web-based, enterprise-level applications in Java using J2EE frameworks such as Spring, Hibernate, EJB, JMS, and JSF.
- Experienced in implementing and consuming SOAP web services using CXF with Spring, and in consuming REST web services using HTTP clients.
- Experienced in writing functions, stored procedures, and triggers using PL/SQL.
- Experienced with build tools such as Ant and Maven, and with continuous integration tools such as Jenkins.
- Experienced in all facets of the Software Development Life Cycle (analysis, design, development, testing, and maintenance) using Waterfall and Agile methodologies.
- Motivated team player with excellent communication, interpersonal, analytical, and problem-solving skills.
- Adept at quickly and thoroughly mastering new technologies, with a keen awareness of industry developments and next-generation programming solutions.
WORK EXPERIENCE:
Confidential - Madison, WI
Sr Hadoop Developer
Responsibilities:
- Involved in installing and configuring the Hadoop ecosystem and Cloudera Manager using the CDH4 distribution.
- Responsible for managing data coming from different sources; involved in HDFS maintenance and in loading structured and unstructured data.
- Integrated the Quartz scheduler with Oozie workflows to pull data from multiple data sources in parallel using fork nodes.
- Processed input from multiple data sources in the same reducer using GenericWritable and MultipleInputs.
- Created a data pipeline of MapReduce programs using chained mappers.
- Implemented optimized joins across different data sets in MapReduce to identify the top claims by state.
- Implemented complex MapReduce programs in Java to perform map-side joins using the distributed cache (see the sketch after this list).
- Responsible for importing log files from various sources into HDFS using Flume.
- Created a customized BI tool for the management team that performs query analytics using HiveQL.
- Used Sqoop to load data from MySQL into HDFS on a regular basis.
- Created partitions and buckets based on state for further processing with bucket-based Hive joins.
- Created Hive generic UDFs to process business logic that varies by policy.
- Moved relational database data into Hive dynamic-partition tables through staging tables, using Sqoop.
- Optimized Hive queries using partitioning and bucketing techniques to control data distribution.
- Worked on custom Pig loaders and storage classes to handle a variety of data formats such as JSON and XML.
- Experienced with different compression codecs such as LZO, gzip, and Snappy.
- Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of jobs, such as Java MapReduce, Hive, Pig, and Sqoop.
- Developed unit test cases using the JUnit, EasyMock, and MRUnit testing frameworks.
- Monitored the cluster using Cloudera Manager.
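The map-side join above follows the standard Hadoop distributed-cache pattern: a small lookup table is shipped to every mapper and joined in memory, so no reduce phase is needed. A minimal sketch against the Hadoop 2.x (CDH4-era) mapreduce API; the lookup file, paths, and claim field layout are hypothetical:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.net.URI;
    import java.util.HashMap;
    import java.util.Map;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class MapSideJoin {
      public static class JoinMapper extends Mapper<LongWritable, Text, Text, Text> {
        private final Map<String, String> states = new HashMap<String, String>();

        @Override
        protected void setup(Context context) throws IOException {
          // The cached file is symlinked into the task's working directory;
          // "states.txt" is a hypothetical code-to-name lookup table.
          BufferedReader in = new BufferedReader(new FileReader("states.txt"));
          String line;
          while ((line = in.readLine()) != null) {
            String[] kv = line.split("\t");
            states.put(kv[0], kv[1]);
          }
          in.close();
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
          String[] claim = value.toString().split("\t"); // assumed claim layout
          String stateName = states.get(claim[1]);
          if (stateName != null) {                       // inner-join semantics
            context.write(new Text(stateName), new Text(claim[0]));
          }
        }
      }

      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "map-side join");
        job.setJarByClass(MapSideJoin.class);
        job.setMapperClass(JoinMapper.class);
        job.setNumReduceTasks(0);                 // the join needs no reducers
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        job.addCacheFile(new URI("/lookup/states.txt#states.txt")); // small side of the join
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }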
Environment: Hadoop, HDFS, HBase, MapReduce, Java, Hive, Pig, Sqoop, Flume, Oozie, Hue, SQL, ETL, Cloudera Manager, MySQL.
Confidential - Austin, TX
Hadoop Developer
Responsibilities:
- Worked on importing data from various sources and performed transformations using MapReduce and Hive to load data into HDFS.
- Responsible for building scalable, distributed data solutions using Hadoop.
- Wrote various Hive and Pig scripts.
- Created HBase tables to store data in variable formats coming from different portfolios.
- Performed real-time analytics on HBase using the Java API and the REST API.
- Implemented HBase coprocessors to notify the support team when data is inserted into HBase tables.
- Configured Sqoop jobs to import data from an RDBMS into HDFS using Oozie workflows.
- Set up Pig, Hive, and HBase on multiple nodes and developed with Pig, Hive, HBase, and MapReduce.
- Worked on compression mechanisms to optimize MapReduce jobs.
- Analyzed customer behavior through clickstream analysis, using Flume to ingest the data.
- Worked with Avro data files using the Avro serialization system.
- Solved the small-files problem by packing small files into SequenceFiles for MapReduce processing (see the sketch after this list).
- Implemented business logic by writing UDFs in Java and used various UDFs from Piggybank and other sources.
- Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
- Worked on Oozie workflows to run multiple jobs.
- Exported the analyzed data to relational databases using Sqoop, for visualization and report generation by the BI team.
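The small-files fix above relies on SequenceFile as a container: many small HDFS files become one splittable file of (filename, bytes) records, so MapReduce no longer pays one map task per tiny file. A minimal packing sketch; the paths and the key/value layout are assumptions:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.IOUtils;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;

    public class SmallFilePacker {
      public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path inputDir = new Path(args[0]);  // directory full of small files
        Path packed = new Path(args[1]);    // single output SequenceFile

        SequenceFile.Writer writer = SequenceFile.createWriter(
            fs, conf, packed, Text.class, BytesWritable.class);
        try {
          for (FileStatus file : fs.listStatus(inputDir)) {
            byte[] bytes = new byte[(int) file.getLen()]; // small files only
            FSDataInputStream in = fs.open(file.getPath());
            try {
              in.readFully(bytes);
            } finally {
              in.close();
            }
            // key = original file name, value = raw file contents
            writer.append(new Text(file.getPath().getName()),
                          new BytesWritable(bytes));
          }
        } finally {
          IOUtils.closeStream(writer);
        }
      }
    }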
Environment: Hortonworks, MapReduce, HBase, HDFS, Hive, Pig, Java (JDK 1.6), SQL, Cloudera Manager, Sqoop, Flume, Oozie, Eclipse
Confidential - San Ramon, CA
Hadoop Developer
Responsibilities:
- Imported and exported data between HDFS and relational databases using Sqoop.
- Helped this medical group streamline business processes by developing, installing, and configuring Hadoop ecosystem components that moved data from individual servers to HDFS.
- Installed and configured MapReduce, Hive, and HDFS; implemented a CDH3 Hadoop cluster on CentOS; assisted with performance tuning and monitoring.
- Created Cassandra tables using CQL to load large sets of structured, semi-structured, and unstructured data coming from UNIX systems, NoSQL stores, and a variety of portfolios.
- Created a POC that stores server log data in Cassandra to identify system alert metrics.
- Supported code/design analysis, strategy development, and project planning.
- Created reports for the BI team, using Sqoop to load data into HDFS and Hive.
- Designed multiple MapReduce jobs in Java for data cleaning and preprocessing (see the sketch after this list).
- Assisted with data capacity planning and node forecasting.
- Collaborated with the infrastructure, network, database, application and BI teams to ensure data quality and availability.
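As a sketch of the data-cleaning jobs mentioned above: a map-only pass that drops malformed delimited records and counts what it discards. The delimiter and field count are assumptions, not the project's actual schema:

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class CleaningMapper
        extends Mapper<LongWritable, Text, Text, NullWritable> {

      private static final int EXPECTED_FIELDS = 8; // assumed record width

      enum Quality { MALFORMED }

      @Override
      protected void map(LongWritable offset, Text record, Context context)
          throws IOException, InterruptedException {
        String[] fields = record.toString().split(",", -1);
        if (fields.length != EXPECTED_FIELDS || fields[0].trim().isEmpty()) {
          context.getCounter(Quality.MALFORMED).increment(1); // visible in job counters
          return;                                             // drop the bad record
        }
        context.write(record, NullWritable.get());            // keep the clean record
      }
    }

Configured with setNumReduceTasks(0), the clean records stream straight back to HDFS with no shuffle.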
Environment: Cloudera, MapReduce, Cassandra, HDFS, Hive, Pig, Java (JDK 1.6), SQL, Cloudera Manager, Sqoop, Flume, Oozie, Eclipse
Confidential - Phoenix, AZ
Java Programmer
Responsibilities:
- Coded server-side servlets that receive requests from the client and process them by interacting with the Oracle database (see the sketch after this list).
- Developed the GUI using HTML forms and frames, validating the data with JavaScript.
- Used JDBC to connect to the backend database and developed stored procedures.
- Developed code to handle web requests involving request handlers, business objects, and data access objects.
- Developed Java servlets to control and maintain session state and handle user requests.
- Developed JSP pages, including the use of JSP custom tags and other JavaBean presentation methods, along with all HTML and graphical aspects of the site's user interface.
- Used XML to transfer data among different data sources.
- Involved in unit testing and documentation.
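A minimal sketch of the servlet-plus-JDBC pattern described above; the table, column, and connection details are placeholders, and a real deployment would use a container-managed DataSource rather than DriverManager:

    import java.io.IOException;
    import java.io.PrintWriter;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class CustomerLookupServlet extends HttpServlet {
      @Override
      protected void doGet(HttpServletRequest req, HttpServletResponse resp)
          throws ServletException, IOException {
        String customerId = req.getParameter("id");
        resp.setContentType("text/html");
        PrintWriter out = resp.getWriter();
        try {
          // Placeholder connection details for an Oracle thin-driver URL.
          Connection conn = DriverManager.getConnection(
              "jdbc:oracle:thin:@dbhost:1521:ORCL", "user", "password");
          try {
            PreparedStatement stmt = conn.prepareStatement(
                "SELECT name FROM customers WHERE id = ?");
            stmt.setString(1, customerId); // bind variable avoids SQL injection
            ResultSet rs = stmt.executeQuery();
            out.println("<html><body>");
            while (rs.next()) {
              out.println("<p>" + rs.getString("name") + "</p>");
            }
            out.println("</body></html>");
          } finally {
            conn.close(); // also closes the statement and result set
          }
        } catch (SQLException e) {
          throw new ServletException("customer lookup failed", e);
        }
      }
    }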
Environment: J2EE (Servlets, JSP, JavaScript, JavaBeans, JDBC 2.0, XML technologies), HTML, CSS, Oracle 9i, PL/SQL, Apache Tomcat, JBuilder, UNIX.
Confidential
JAVA/J2EE Developer
Responsibilities:
- Developed lightweight business components and integrated applications using Struts.
- Designed and developed front-end, middleware, and back-end applications.
- Optimized server-side and client-side validation.
- Worked with the team on the transition from Oracle to DB2.
- Developed the global logging module, used across all modules, with Log4j components (see the sketch after this list).
- Developed the presentation layer for the credit enhancement module in JSP.
- Used Struts to implement the Model-View-Controller (MVC) architecture; validations were done on the client side as well as the server side.
- Involved in configuration management using ClearCase.
- Detected and resolved errors and defects in the quality control environment.
- Used iBATIS to map Java classes to the database.
- Involved in code review and integration testing.
- Used static analysis tools such as PMD, FindBugs, and Checkstyle.
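A minimal sketch of a global Log4j logging module of the kind described above; the class names are hypothetical:

    import org.apache.log4j.Logger;

    // Single entry point for obtaining loggers, so every module acquires
    // them the same way and categories stay consistent in log4j.properties.
    public final class AppLogger {
      private AppLogger() {} // static utility, no instances

      public static Logger get(Class<?> caller) {
        return Logger.getLogger(caller); // one logger category per class
      }
    }

A module would then declare "private static final Logger LOG = AppLogger.get(CreditModule.class);" and log through LOG.info(...) as usual.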
Environment: Java 1.6, J2EE 6, Struts 1.2, iBATIS, XML, JSP, CSS, HTML, JavaScript, jQuery, Oracle 10g, DB2, UNIX, RAD, ClearCase, WebSphere V8.0 (beta)