Hadoop Developer Resume
Providence, RI
SUMMARY:
- Hadoop/Java Developer with nearly 7 years of experience as a software developer in the design, development, deployment, and support of large-scale distributed systems.
- 3+ years of experience as a Hadoop Developer and Big Data analyst.
- Excellent understanding of Hadoop architecture and underlying framework including storage management.
- Experienced in installing and configuring Hadoop v1.0 and v2.0.
- Experience with multiple Cloudera Distribution versions: CDH3, CDH4, and CDH5.
- Good knowledge of the MapR and Hortonworks distributions for Hadoop.
- Expertise in using various Hadoop Ecosystem components such as MapReduce, Pig, Hive, HBase, Sqoop, Oozie, Flume and Spark for data storage and analysis.
- Experienced in developing custom UDFs for Pig and Hive to incorporate methods and functionality of Python/Java into Pig Latin and HiveQL (a minimal sketch follows this summary).
- Highly experienced in importing and exporting data between HDFS and Relational Systems like MySQL and Teradata using Sqoop.
- Experienced in collecting log data and JSON data into HDFS using Flume and processing it with Hive/Pig.
- Expertise in Text Processing and Analysis using HiveQL.
- Experience in troubleshooting errors in HBase Shell/API, Pig, Hive and MapReduce.
- Experienced in installing and running various Oozie workflows and automating parallel job executions.
- Experience in running shell scripts using Hadoop Streaming.
- Experience in HBase cluster configuration, deployment and troubleshooting.
- Assisted Deployment team in setting up Hadoop cluster and services.
- Good experience in generating statistics, extracts, and reports from Hadoop.
- Good understanding of NoSQL databases like MongoDB and Cassandra.
- Experience in managing Hadoop clusters and services using Cloudera Manager.
- Strong experience in core Java, J2EE, SQL and RESTful web services.
- Good knowledge of cluster benchmarking and performance tuning.
- Experienced in identifying improvement areas for system stability and providing end-to-end high-availability architectural solutions.
- Extensive experience in developing applications using Core Java and multi-threading.
- Determined, committed and hardworking individual with strong communication, interpersonal and organizational skills.
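The following is a minimal sketch of the kind of custom Hive UDF described in this summary, written against the classic org.apache.hadoop.hive.ql.exec.UDF API; the package, class, and function names are hypothetical illustrations, not code from an actual engagement:

    package com.example.hive.udf;                      // hypothetical package

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Normalizes a string column (trim + lower-case) before analysis.
    // Registered from HiveQL with, for example:
    //   ADD JAR udfs.jar;
    //   CREATE TEMPORARY FUNCTION normalize AS 'com.example.hive.udf.NormalizeUDF';
    public class NormalizeUDF extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;                           // let Hive pass NULLs through
            }
            return new Text(input.toString().trim().toLowerCase());
        }
    }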
TECHNICAL SKILLS:
Hadoop Core: HDFS, MapReduce, YARN.
Hadoop Ecosystem: Hive, Pig, HBase, Impala, Zookeeper, Sqoop, Flume, Oozie, Spark, Avro, Parquet.
Web Technologies: HTML, XML, JDBC, JSP, JavaScript.
RDBMS: Oracle 10g, MySQL, SQL server, Teradata.
NoSQL: HBase, Cassandra, MongoDB.
Web/Application servers: Tomcat, LDAP.
Java frameworks: Struts, Spring, Hibernate.
Methodologies: Agile, UML, Design Patterns (Core Java and J2EE).
Data Bases: Oracle 10g, Teradata, DB2, MS-SQL Server, MySQL, MS-Access.
Programming Languages: C, C++, Java, J2EE, SQL, PL/SQL, Linux shell scripts, Python.
Tools: Eclipse, PuTTY, Cygwin, MS Office, Crystal Reports.
PROFESSIONAL EXPERIENCE:
Confidential, Providence, RI
Hadoop Developer
Responsibilities:
- Installed and configured Hadoop, MapReduce, and HDFS (Hadoop Distributed File System); developed multiple MapReduce jobs in Java for data cleaning and processing (see the sketch after this list).
- Developed data pipeline using Flume, Sqoop, Pig and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Used Pig as an ETL tool for transformations, event joins, and some pre-aggregations before storing the data in HDFS.
- Involved in launching and setting up the Hadoop/HBase cluster, including configuring the different components of Hadoop and HBase.
- Installed and configured Cloudera Hadoop on a 100-node cluster.
- Implemented the workflows using Apache Oozie framework to automate tasks.
- Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
- Developed MapReduce framework jobs in Java for data processing on the configured Hadoop/HDFS environment.
- Loaded log data into HDFS using Flume. Worked extensively on creating MapReduce jobs to power data for search and aggregation.
- Wrote the shell scripts to monitor the health check of Hadoop daemon services and respond accordingly to any warning or failure conditions.
- Created Hive external tables, loaded data into the tables, and queried the data using HiveQL.
- Implemented the Fair Scheduler on the JobTracker to share cluster resources among users' MapReduce jobs.
- Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
- Responsible for architecting Hadoop clusters with CDH3.
- Involved in writing Flume and Hive scripts to extract, transform, and load the data into the database.
- Imported and exported data into HDFS and Hive using Sqoop.
- Created HBase tables to store various data formats of PII data coming from different portfolios.
- Performed cluster coordination through ZooKeeper.
- Involved in creating Hive tables, loading them with data, and performing data analysis with Hive queries that run internally as MapReduce jobs.
- Installed and configured Hive and wrote Hive UDFs.
- Involved in HDFS maintenance and administered it through the Hadoop Java API and web UI.
- Worked on analyzing the Hadoop cluster with different big data analytic tools, including Pig, the HBase NoSQL database, and Sqoop.
- Extracted files from MongoDB, loaded them into HDFS via Sqoop, and processed them.
- Developed shell scripts to pull data from third-party systems into the Hadoop file system.
- Supported in setting up QA environment and updating configurations for implementing scripts with Pig.
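The following is a minimal sketch of a data-cleaning MapReduce job of the kind listed above, written against the org.apache.hadoop.mapreduce API; the tab delimiter and expected field count are hypothetical cleaning rules:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // Map-only job: drops malformed records and emits the rest unchanged.
    public class LogCleanJob {

        public static class CleanMapper
                extends Mapper<LongWritable, Text, NullWritable, Text> {
            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                // Hypothetical rule: keep only tab-delimited records with 12 fields.
                if (value.toString().split("\t", -1).length == 12) {
                    context.write(NullWritable.get(), value);
                }
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "log-clean");
            job.setJarByClass(LogCleanJob.class);
            job.setMapperClass(CleanMapper.class);
            job.setNumReduceTasks(0);                  // map-only: no shuffle needed
            job.setOutputKeyClass(NullWritable.class);
            job.setOutputValueClass(Text.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

Because cleaning needs no aggregation, the job sets zero reducers and avoids the shuffle entirely.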
Environment: Hadoop, MapReduce, HDFS, Hive, Java, flat files, Oracle 10g, PL/SQL, SQL*Plus, Windows NT, UNIX shell scripting.
Confidential, Atlanta, GA
Hadoop Developer
Responsibilities:
- Installed and configured Hadoop MapReduce and HDFS. Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Experienced in installing, configuring and using Hadoop Ecosystem components.
- Worked on automating importing and exporting jobs into HDFS and Hive using Sqoop from relational databases like Oracle and Teradata.
- Knowledge in performance troubleshooting and tuning Hadoop clusters.
- Participated in development/implementation of Cloudera Hadoop environment.
- Implemented partitioning, dynamic partitions, and buckets in Hive for efficient data access (see the sketch after this list).
- Experience in configuring HBase cluster and troubleshooting using Cloudera Manager.
- Performed distributed transactional queueing on HBase.
- Experienced in running queries using Impala and using BI tools to run ad-hoc queries directly on Hadoop.
- Wrote Pig and Hive UDFs and used JUnit for unit testing MapReduce programs.
- Experienced in using Zookeeper and Oozie Operational Services for coordinating the cluster and scheduling workflows.
- Implemented a Cassandra column-oriented NoSQL database and an associated RESTful web service that persists high-volume data for vertical teams.
- Experienced in managing and reviewing Hadoop log files.
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration.
- Supported MapReduce programs running on the cluster. Involved in loading data from the UNIX file system into HDFS.
- Experienced in loading and transforming large sets of structured, semi-structured, and unstructured data.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
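The following is a minimal sketch of the Hive partitioning and bucketing described above, issued through JDBC; it assumes a reachable HiveServer2 instance, and the host, credentials, table, and column names are hypothetical:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    // Creates a date-partitioned, bucketed Hive table over JDBC.
    public class CreatePartitionedTable {
        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            Connection conn = DriverManager.getConnection(
                    "jdbc:hive2://hiveserver:10000/default", "hive", "");
            Statement stmt = conn.createStatement();
            // Allow fully dynamic partitions when loading by txn_date.
            stmt.execute("SET hive.exec.dynamic.partition.mode=nonstrict");
            stmt.execute(
                    "CREATE TABLE IF NOT EXISTS txns (" +
                    "  id BIGINT, acct STRING, amount DOUBLE) " +
                    "PARTITIONED BY (txn_date STRING) " +     // partition pruning on date
                    "CLUSTERED BY (acct) INTO 32 BUCKETS " +  // bucketing for joins/sampling
                    "STORED AS TEXTFILE");
            stmt.close();
            conn.close();
        }
    }

Partitioning prunes whole directories at query time, while bucketing hashes rows by account into a fixed number of files, which helps bucketed map-side joins and sampling.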
Environment: Java, J2EE, JSP, JavaScript, MVC, Servlet, Struts, PL/SQL, XML, UML, JUnit, ANT, Perl, UNIX, Hadoop 2.2 (HDFS, MapReduce), HBase 0.98, Flume 1.5, Kafka 2.8, Oozie 4.1, Pig 0.11.
Confidential, Boston, MA
Hadoop Consultant
Responsibilities:
- Developed MapReduce programs using the HBase client API (see the sketch after this list).
- Put data into and retrieved data from HBase using Spring's HbaseTemplate.
- Configured the sources, sink groups, sinks, and channels in the flume.conf file.
- Implemented custom Kafka sinks, producers, and consumers to integrate Flume with Kafka.
- Implemented HBase filters to retrieve the required data from HBase.
- Developed Pig scripts for analyzing raw event data.
- Cleaned unwanted files from HDFS, managed permissions on specific files, and set the replication factor for specific files.
- Developed UNIX shell scripts to prepare the environment for the application and to delete temporary files.
- Configured Oozie workflows to run jobs that store minute-wise analysis results in HBase.
- Involved in admin tasks such as resolving DataNode/NameNode namespace conflicts and DataNode permission issues, kernel-level configuration, installations and configuration, and Hortonworks Ambari dashboard management.
- Performed peer code reviews.
- Implemented logging using Log4j.
- Developed Unit test cases for testing all components.
- Conducted knowledge-sharing sessions across teams.
- Assigned work to team members and coordinated with them.
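The following is a minimal sketch of the HBase client API usage listed above (Put, Get, and a filtered Scan), written against the HBase 0.98 API from the Environment line; the table, column family, and qualifier names are hypothetical:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
    import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    public class EventStore {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "events");   // 0.98-style table handle
            try {
                // Put: one row keyed by event id.
                Put put = new Put(Bytes.toBytes("evt-0001"));
                put.add(Bytes.toBytes("d"), Bytes.toBytes("type"), Bytes.toBytes("click"));
                table.put(put);

                // Get: read the row back.
                Result row = table.get(new Get(Bytes.toBytes("evt-0001")));
                System.out.println(Bytes.toString(
                        row.getValue(Bytes.toBytes("d"), Bytes.toBytes("type"))));

                // Filtered scan: only rows whose d:type equals "click".
                Scan scan = new Scan();
                scan.setFilter(new SingleColumnValueFilter(
                        Bytes.toBytes("d"), Bytes.toBytes("type"),
                        CompareOp.EQUAL, Bytes.toBytes("click")));
                ResultScanner scanner = table.getScanner(scan);
                for (Result r : scanner) {
                    System.out.println(Bytes.toString(r.getRow()));
                }
                scanner.close();
            } finally {
                table.close();
            }
        }
    }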
Environment: Java, J2EE, JSP, JavaScript, MVC, Servlet, Struts, PL/SQL, XML, UML, JUnit, ANT, Perl, UNIX, Hadoop 2.2 (HDFS, MapReduce), HBase 0.98, Flume 1.5, Kafka 2.8, Oozie 4.1, Pig 0.11, Spring 4.2 (IoC, JDBC, MVC) with HBase integration.
Confidential, PA
Java Developer
Responsibilities:
- Involved in system analysis and design as well as object-oriented analysis and design (OOA/OOD) to capture and model business requirements.
- Proficient in object-oriented design using UML with Rational Rose.
- Created Technical Design Documentation (TDD) based on the Business Specifications.
- Created JSP pages with Struts Tags and JSTL.
- Developed the UI using HTML, JavaScript, CSS, and JSP for interactive cross-browser functionality and a complex user interface.
- Implemented the web based application following the MVC II architecture using Struts framework.
- Used the XML DOM API for parsing XML (see the sketch after this list).
- Developed scripts for automating production tasks using Perl and UNIX shell scripts.
- Used ANT for compilation and building JAR, WAR and EAR files.
- Used JUnit for the unit testing of various modules.
- Coordinated with other development teams, system managers, and the webmaster to maintain a good working environment.
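The following is a minimal sketch of DOM-based XML parsing of the kind listed above, using the standard javax.xml.parsers API; the input file and element names are hypothetical:

    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;

    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.NodeList;

    public class OrderXmlParser {
        public static void main(String[] args) throws Exception {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse("orders.xml");     // hypothetical input
            NodeList orders = doc.getElementsByTagName("order");
            for (int i = 0; i < orders.getLength(); i++) {
                Element order = (Element) orders.item(i);
                // Read an attribute and a child element from each <order> node.
                System.out.println(order.getAttribute("id") + " -> "
                        + order.getElementsByTagName("status")
                               .item(0).getTextContent());
            }
        }
    }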
Environment: Java, J2EE, JSP, JavaScript, MVC, Servlet, Struts, PL/SQL, XML, UML, JUnit, ANT, Perl, UNIX.
Confidential
Associate Java Developer
Responsibilities:
- Developed graphical user interfaces using Java EE.
- Generated PL/SQL scripts and queries using SQL Developer.
- Prepared the requirement analysis documents.
- Developed the code as per the requirements in Java.
- Conducted Knowledge sharing sessions across teams.
Environment: Java, JSP, Servlets, JDBC, Oracle, HTML/DHTML, Microsoft FrontPage, JavaScript 1.3, PL/SQL, J2EE, HP-UX (production), Windows XP (development).
Confidential
Associate Java Developer
Responsibilities:
- Coordinated with the users to gather and analyze the business requirements.
- Design & Development of design specifications using design patterns and OO methodology using UML (Rational Rose)
- Involved in Use Case analysis and developing User Interface using HTML/DHTML.
- Involved in the Development and Deployment of Java beans.
- Developed dynamic pages using JSP to invoke servlets (controllers).
- Developed JDBC connection pooling to optimize database connections (see the sketch after this list).
- Wrote various stored procedures in Oracle using PL/SQL.
- Used JavaScript for client-side validations.
- Implemented session tracking and user authentication.
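The following is a minimal sketch of a hand-rolled JDBC connection pool of the kind listed above; the Oracle driver class is standard, while the pool sizes, grow-on-demand policy, and connection URL handling are assumptions:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.util.LinkedList;

    public class ConnectionPool {
        private final LinkedList<Connection> idle = new LinkedList<Connection>();
        private final String url, user, password;

        public ConnectionPool(String url, String user, String password,
                              int initialSize) throws Exception {
            this.url = url; this.user = user; this.password = password;
            Class.forName("oracle.jdbc.driver.OracleDriver");
            // Pre-open a fixed number of connections up front.
            for (int i = 0; i < initialSize; i++) {
                idle.add(DriverManager.getConnection(url, user, password));
            }
        }

        // Hand out an idle connection, opening a new one if the pool is empty.
        public synchronized Connection borrow() throws Exception {
            return idle.isEmpty()
                    ? DriverManager.getConnection(url, user, password)
                    : idle.removeFirst();
        }

        // Return a connection to the pool for reuse instead of closing it.
        public synchronized void release(Connection conn) {
            idle.addLast(conn);
        }
    }

Reusing pooled connections avoids the per-request cost of opening a new database session.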
Environment: Java, JSP, Servlets, JDBC, JavaBeans, Oracle, HTML/DHTML, Microsoft FrontPage, JavaScript 1.3, PL/SQL, Tomcat 4.0, Windows NT.