Java / Hadoop Developer Resume
Pottsville, PA
PROFESSIONAL SUMMARY:
- Highly confident and skilled professional with 8+ years of experience in the IT industry, including around 5 years of hands-on expertise in Big Data processing using Hadoop, Hadoop ecosystem implementation, maintenance, ETL, and Big Data analysis operations.
- 4+ years of comprehensive experience in Big Data processing using Apache Hadoop and its ecosystem (MapReduce, Pig, Hive, Sqoop, Flume, and HBase).
- Experience in installing, configuring, and maintaining Hadoop clusters.
- Knowledge of administrative tasks such as installing Hadoop (on Ubuntu) and its ecosystem components, including Hive, Pig, and Sqoop.
- Good working knowledge of Elasticsearch and Spark Streaming.
- Good knowledge of YARN configuration.
- Expertise in writing Hadoop jobs for analyzing data using HiveQL queries, Pig Latin (a data flow language), and custom MapReduce programs in Java.
- Wrote Hive queries for data analysis to meet business requirements.
- Created Hive tables to store data in HDFS and processed the data using HiveQL.
- Expert in the Hive data warehouse tool: creating tables, distributing data through partitioning and bucketing, and writing and optimizing HiveQL queries.
- Good knowledge of creating custom SerDes in Hive.
- Developed Pig Latin scripts using operators such as LOAD, STORE, DUMP, FILTER, DISTINCT, FOREACH, GENERATE, GROUP, COGROUP, ORDER, LIMIT, UNION, and SPLIT to extract data from data files and load it into HDFS.
- Extended Hive and Pig core functionality by writing custom UDFs (a minimal Hive UDF sketch appears at the end of this summary).
- Provided support in designing and building an end-to-end framework covering the Data Acquisition Layer, the ETL Transformer Layer for the Data Mart / Operational Data Store (OLTP & OLAP), and the Data Provisioning Layer for consumers and services.
- Experience in using the ZooKeeper distributed coordination service for high availability.
- Experience in migrating data from RDBMS to HDFS and Hive using Sqoop, converting SQL to HQL (Hive Query Language), writing UDFs, and scheduling Oozie jobs.
- Experience in writing MapReduce programs and using the Apache Hadoop API to analyze data.
- Involved in ingesting data from various databases, such as Teradata (sales data warehouse), Oracle, DB2, and SQL Server, using Sqoop.
- Expertise in designing and deploying Hadoop clusters and various Big Data analytics tools, including Spark, Impala, and Cassandra, with the Hortonworks distribution.
- Developed unit test cases using the JUnit, EasyMock, and MRUnit testing frameworks.
- Experience in working with MapReduce programs on Apache Hadoop to process Big Data.
- Good knowledge of Linux shell scripting and shell commands.
- Hands-on experience with compression codecs such as Snappy and Gzip.
- Good understanding of Data Mining and Machine Learning techniques
- Experience in importing and exporting data between HDFS and relational database systems using Sqoop.
- Hands-on experience in configuring and working with Flume to load data from multiple sources directly into HDFS.
- Good knowledge of programming Spark using Scala.
- In-depth understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce concepts.
- Extensive experience with SQL, PL/SQL and database concepts
- Used HBase alongside Pig and Hive when real-time, low-latency queries were required.
- Knowledge of job workflow scheduling and monitoring tools such as Oozie (Hive, Pig) and ZooKeeper (HBase).
- Experience in developing solutions to analyze large data sets efficiently
- Good understanding of XML methodologies (XML, XSL, XSD) including Web Services and SOAP
- Expertise in Waterfall and Agile software development models and in project planning using Microsoft Project Planner and JIRA.
- Strong experience as a senior Java developer in web/intranet and client/server technologies using Java, J2EE, Servlets, JSP, EJB, and JDBC.
- Ability to work in high-pressure environments delivering to and managing stakeholder expectations.
- Applied structured methods to project scoping and planning, risks, issues, schedules, and deliverables.
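- The following is a representative, simplified sketch of the kind of custom Hive UDF referenced above; the class name, masking rule, and registration commands are illustrative assumptions rather than project code.

    // Illustrative Hive UDF (legacy org.apache.hadoop.hive.ql.exec.UDF API) that
    // masks all but the last four characters of a string column.
    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    public final class MaskValue extends UDF {
        public Text evaluate(final Text input) {
            if (input == null) {
                return null;
            }
            String value = input.toString();
            if (value.length() <= 4) {
                return new Text(value);
            }
            StringBuilder masked = new StringBuilder();
            for (int i = 0; i < value.length() - 4; i++) {
                masked.append('*');
            }
            masked.append(value.substring(value.length() - 4));
            return new Text(masked.toString());
        }
    }

    // Registered and used from Hive roughly as:
    //   ADD JAR mask-udf.jar;
    //   CREATE TEMPORARY FUNCTION mask_value AS 'MaskValue';
    //   SELECT mask_value(account_number) FROM accounts;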
TECHNICAL SKILLS:
Hadoop Technologies: Apache Hadoop, Cloudera Hadoop Distribution (HDFS and MapReduce)
Hadoop Ecosystem: Hive, Pig, Sqoop, Flume, ZooKeeper, Oozie, Scala
NoSQL Databases: HBase, MongoDB
Programming Languages: Java, C, C++, Linux shell scripting
Web Technologies: HTML, J2EE, CSS, JavaScript, AJAX, Servlet, JSP, DOM, XML
Databases: MySQL, SQL, Oracle, SQL Server
Software Engineering: UML, Object Oriented Methodologies, Scrum, Agile methodologies
Operating Systems: Linux, Windows 7, Windows 8, XP
IDE Tools: Eclipse, Rational Rose
PROFESSIONAL EXPERIENCE:
Confidential, Pottsville, PA
Java / Hadoop Developer
Responsibilities:
- Involved in analyzing data coming from various sources and creating meta-files and control files to ingest the data into the data lake.
- Involved in configuring batch jobs to ingest the source files into the data lake.
- Created several jobs in the Talend ETL tool to perform transformations on source files.
- Used Pig to transform the data in HDFS to fit the requirements.
- Developed Java code to transform incoming files into the required file formats.
- Created several Pig UDFs for the enrichment engine that were used to enrich the data.
- Worked extensively with Hive to create, alter, and drop tables, and was involved in writing Hive queries.
- Created and altered HBase tables on top of data residing in the data lake.
- Extracted and loaded data into HDFS using the Sqoop import and export command-line utilities.
- Designed and deployed the Hadoop cluster and various Big Data analytics tools, including Spark, Pig, Hive, HBase, Oozie, ZooKeeper, Sqoop, Flume, Impala, and Cassandra, with the Hortonworks distribution.
- Involved in creating Hive tables, loading them with data, and writing Hive queries.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Used Pig as an ETL tool for transformations, event joins, and some pre-aggregations before storing the data in HDFS.
- Imported and exported data between MySQL/Oracle and Hive using Sqoop.
- Designed and developed table engine frameworks in Talend using Hadoop tools such as HDFS, Hive, HBase, and MapReduce.
- Hands-on experience with HTML, CSS, JavaScript, Java, and AJAX.
- Extensively used Pig scripts for data cleansing and optimization.
- Used different components in Talend such as tMap, tFileOutputDelimited, tFlowToIterate, tLogCatcher, tNormalize, tFileList, tHDFSInput, tUnique, tFilterRow, tHiveLoad, and tFlowMeterCatcher.
- Experience with Talend components for transformation, file processing, Java, UNIX, database-related tasks, and the logging framework.
- Used Pig for transformations, event joins, filtering bot traffic, and some pre-aggregations before storing the data in HDFS.
- Created custom Java UDF routines in Talend using Core Java.
- Worked with the HBase API to connect to HBase and apply filters on tables from Talend (see the sketch at the end of this section).
- Involved in data acquisition, data pre-processing, and data exploration for a telecommunications project in Scala.
- Used AngularJS for client-side validation.
- Expertise in the Talend Big Data Integration suite for designing and developing ETL/Big Data solutions for enterprise Talend projects.
- Performed benchmarks on the table engine with data sets of various sizes.
- Worked closely with system analysts and architects to design and develop Talend jobs that fit the business requirements.
- Involved in loading data from UNIX file system to HDFS.
- Used Sqoop to import data from Oracle database to HDFS cluster using custom scripts.
- Worked closely with scrum master and various scrum teams to gather information and perform daily activities.
- Was involved in two PI Planning sessions that involved 8 scrum teams.
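- A representative sketch of the kind of HBase filter scan mentioned above is shown below; it assumes the HBase 1.x client API, and the table, column family, qualifier, and value are hypothetical.

    // Illustrative HBase scan that keeps only rows whose "status" column equals "ACTIVE".
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
    import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HbaseFilterScan {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Table table = connection.getTable(TableName.valueOf("customer_events"))) {
                Scan scan = new Scan();
                scan.setFilter(new SingleColumnValueFilter(
                        Bytes.toBytes("cf"), Bytes.toBytes("status"),
                        CompareOp.EQUAL, Bytes.toBytes("ACTIVE")));
                try (ResultScanner scanner = table.getScanner(scan)) {
                    for (Result result : scanner) {
                        System.out.println(Bytes.toString(result.getRow()));
                    }
                }
            }
        }
    }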
ENVIRONMENT: Hadoop, MapReduce, YARN, Hive, Pig, HBase, Sqoop, MapR, Talend, ETL, Core Java, Eclipse, SQL Server, MySQL, Linux
Confidential, Atlanta, GA
Senior Hadoop Developer
Responsibilities:
- Developed a data pipeline using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Involved in writing MapReduce jobs.
- Involved in using Sqoop and the HDFS put / copyFromLocal commands to ingest data.
- Used Pig for transformations, event joins, filtering bot traffic, and some pre-aggregations before storing the data in HDFS.
- Performed performance tuning and troubleshooting of MapReduce jobs by analyzing and reviewing Hadoop log files.
- Experience in web application development using JAVA, J2EE technologies.
- Experienced in migrating HiveQL into Impala to minimize query response time.
- Worked on SequenceFiles, RCFiles, map-side joins, bucketing, and partitioning for Hive performance enhancement and storage improvement.
- Exported the result set from Hive to MySQL using Shell scripts.
- Configured Hive using shared meta-store in MySQL and used Sqoop to migrate data into External Hive Tables from different RDBMS sources (Oracle, Teradata and DB2) for Data warehousing.
- Provided the necessary support to the ETL team when required.
- Performed extensive data mining applications using Hive.
- Involved in unit testing and delivered unit test plans and results documents using JUnit and MRUnit.
- Involved in developing Pig UDFs for the needed functionality that is not out of the box available from Apache Pig.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Involved in developing Hive DDLs to create, alter, and drop Hive tables.
- Involved in developing Hive UDFs for needed functionality that is not available out of the box in Apache Hive.
- Involved in using HCatalog to access Hive table metadata from MapReduce or Pig code.
- Used Java MapReduce to compute metrics that quantify user experience, revenue, etc. (see the sketch at the end of this section).
- Responsible for developing a data pipeline using Flume, Sqoop, and Pig to extract data from web logs and store it in HDFS. Designed and implemented various metrics that can statistically signify the success of an experiment.
- Used Eclipse and Ant to build the application.
- Involved in using Sqoop for importing and exporting data into HDFS and Hive.
- Involved in processing ingested raw data using MapReduce, Apache Pig, and Hive.
- Involved in developing Pig scripts for change data capture and delta record processing between newly arrived data and data already existing in HDFS.
- Involved in pivoting HDFS data from rows to columns and columns to rows.
- Involved in exporting processed data from Hadoop to relational databases or external file systems using Sqoop and the HDFS get / copyToLocal commands.
- Involved in developing shell scripts to orchestrate the execution of all other scripts (Pig, Hive, and MapReduce) and to move data files within and outside of HDFS.
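- A simplified sketch of the kind of metrics-oriented Java MapReduce job described above is shown below; the input layout (tab-delimited, user ID in column 0, revenue in column 3), class names, and paths are illustrative assumptions.

    // Illustrative MapReduce job that sums revenue per user from tab-delimited input.
    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.DoubleWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class RevenuePerUser {

        public static class RevenueMapper extends Mapper<LongWritable, Text, Text, DoubleWritable> {
            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split("\t");
                if (fields.length > 3) {
                    // Assumed layout: user_id in column 0, revenue in column 3.
                    context.write(new Text(fields[0]),
                                  new DoubleWritable(Double.parseDouble(fields[3])));
                }
            }
        }

        public static class SumReducer extends Reducer<Text, DoubleWritable, Text, DoubleWritable> {
            @Override
            protected void reduce(Text key, Iterable<DoubleWritable> values, Context context)
                    throws IOException, InterruptedException {
                double total = 0;
                for (DoubleWritable v : values) {
                    total += v.get();
                }
                context.write(key, new DoubleWritable(total));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "revenue-per-user");
            job.setJarByClass(RevenuePerUser.class);
            job.setMapperClass(RevenueMapper.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(DoubleWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }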
ENVIRONMENT: Hadoop, MapReduce, YARN, Hive, Pig, HBase, Oozie, Sqoop, Flume, Talend, ETL, Oracle 11g, Core Java, Cloudera HDFS, Eclipse.
Confidential, Boca Raton, FL
Hadoop Developer
Responsibilities:
- Responsible for coding MapReduce programs and Hive queries, and for testing and debugging the MapReduce programs.
- Responsible for installing, configuring, and managing a Hadoop cluster spanning multiple racks.
- Developed Pig Latin scripts in the areas where extensive coding needs to be reduced to analyze large data sets.
- Used Sqoop tool to extract data from a relational database into Hadoop.
- Involved in code performance enhancements and optimization by writing custom comparators and combiner logic (a minimal comparator sketch appears at the end of this section).
- Worked closely with data warehouse architect and business intelligence analyst to develop solutions.
- Good understanding of job schedulers such as the Fair Scheduler, which assigns resources so that all jobs receive, on average, an equal share of resources over time, along with a working knowledge of the Capacity Scheduler.
- Developed the presentation layer using JSP, HTML, CSS and client side validations using JavaScript.
- Collaborated with the ETL/ Informatica team to determine the necessary data models and UI designs to support Cognos reports.
- Used Eclipse for Java/J2EE application development, JBoss as the application server, Node.js for standalone UI testing, Oracle as the backend, Git for version control, and Ant for build scripts.
- Involved in coding, code reviews, and JUnit testing; prepared and executed unit test cases.
- Responsible for performing peer code reviews, troubleshooting issues, and maintaining status reports.
- Involved in creating Hive tables, loading them with data, and writing Hive queries, which invoke and run MapReduce jobs in the backend.
- Involved in identifying possible ways to improve the efficiency of the system. Involved in requirement analysis, design, development, and unit testing using MRUnit and JUnit.
- Prepared daily and weekly project status reports and shared them with the client.
- Supported in setting up QA environment and updating configurations for implementing scripts with Pig, Hive and Sqoop.
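- A minimal sketch of the kind of custom comparator described above is shown below; the class name and the descending-order use case are illustrative assumptions.

    // Illustrative custom sort comparator that orders Text keys in descending order,
    // wired into a job with job.setSortComparatorClass(DescendingTextComparator.class).
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.WritableComparable;
    import org.apache.hadoop.io.WritableComparator;

    public class DescendingTextComparator extends WritableComparator {

        public DescendingTextComparator() {
            super(Text.class, true);
        }

        @Override
        @SuppressWarnings({"rawtypes", "unchecked"})
        public int compare(WritableComparable a, WritableComparable b) {
            // Reverse the natural ordering so reducers see keys largest-first.
            return -1 * a.compareTo(b);
        }
    }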
ENVIRONMENT: Apache Hadoop, Java (JDK 1.7), Oracle, MySQL, Hive, Pig, Sqoop, Linux, CentOS, JUnit, MRUnit, Cloudera
Confidential, Denver, CO
Java Developer / Hadoop Developer
Responsibilities:
- Experience in administering, installing, upgrading, and managing CDH3, Pig, Hive, and HBase.
- Architected and implemented the product platform as well as all data transfer, storage, and processing from the data center to Hadoop file systems.
- Experienced in defining job flows.
- Implemented a CDH3 Hadoop cluster on CentOS.
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration.
- Wrote custom MapReduce programs for data processing in Java.
- Imported and exported data into HDFS and Hive using Sqoop, and used Flume to extract data from multiple sources.
- Responsible for managing data coming from different sources.
- Supported MapReduce programs running on the cluster.
- Involved in loading data from UNIX file system to HDFS.
- Created Hive tables to store data in HDFS, loaded data, and wrote Hive queries that run internally as MapReduce jobs.
- Used Flume to channel data from different sources into HDFS.
- Created HBase tables to store variable data formats of PII data coming from different portfolios.
- Implemented best income logic using Pig scripts and wrote custom Pig UDFs to analyze data (see the sketch at the end of this section).
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Provided cluster coordination services through ZooKeeper.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
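- A simplified sketch of the kind of custom Pig UDF referenced above is shown below; the class name, the normalization rule, and the Pig Latin usage are illustrative assumptions.

    // Illustrative Pig EvalFunc UDF that trims and upper-cases a field so downstream
    // grouping is consistent.
    import java.io.IOException;

    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    public class NormalizeState extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;
            }
            return input.get(0).toString().trim().toUpperCase();
        }
    }

    // Used from Pig Latin roughly as:
    //   REGISTER normalize-udf.jar;
    //   cleaned = FOREACH raw GENERATE NormalizeState(state) AS state;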
ENVIRONMENT: Hadoop, MapReduce, Hive, HBase, Flume, Pig, ZooKeeper, Java, ETL, SQL, CentOS, Eclipse.
Confidential, Rochester, MN
Java Developer
Responsibilities:
- Involved in analysis, design, and coding in a J2EE environment.
- Implemented MVC architecture using Struts, JSP, and EJBs.
- Used Core Java concepts in the application, such as multithreaded programming and thread synchronization using the wait, notify, and join methods (see the sketch at the end of this section).
- Designed and programmed the presentation layer using HTML, XML, XSL, JSP, JSTL, and Ajax.
- Creating cross-browser compatible and standards-compliant CSS-based page layouts.
- Worked on Hibernate object/relational mapping according to database schema.
- Designed, developed and implemented the business logic required for Security presentation controller.
- Used JSP, Servlet coding under J2EE Environment.
- Designed XML files to implement most of the wiring need for Hibernate annotations and Struts configurations.
- Responsible for developing the forms, which contains the details of the employees, and generating the reports and bills.
- Developed Web Services for data transfer from client to server and vice versa using Apache Axis, SOAP and WSDL.
- Involved in designing of class and dataflow diagrams using UML Rational Rose.
- Created and modified Stored Procedures, Functions, Triggers and Complex SQL Commands using PL/SQL.
- Involved in the Design of ERD (Entity Relationship Diagrams) for Relational database.
- Developed Shell scripts in UNIX and procedures using SQL and PL/SQL to process the data from the input file and load into the database.
- Used CVS for maintaining the source code. Designed, developed, and deployed the application on WebLogic Server.
- Performed Unit Testing on the applications that are developed.
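- A minimal sketch of the wait/notify/join coordination described above is shown below; the class name and the hand-off scenario are illustrative assumptions.

    // Illustrative hand-off between two threads using synchronized blocks, wait,
    // notify, and join.
    public class HandOff {
        private final Object lock = new Object();
        private boolean ready = false;

        public static void main(String[] args) throws InterruptedException {
            final HandOff handOff = new HandOff();

            Thread consumer = new Thread(new Runnable() {
                public void run() {
                    synchronized (handOff.lock) {
                        while (!handOff.ready) {
                            try {
                                handOff.lock.wait();   // release the lock until notified
                            } catch (InterruptedException e) {
                                Thread.currentThread().interrupt();
                                return;
                            }
                        }
                        System.out.println("Signal received, continuing work");
                    }
                }
            });

            Thread producer = new Thread(new Runnable() {
                public void run() {
                    synchronized (handOff.lock) {
                        handOff.ready = true;
                        handOff.lock.notify();   // wake the waiting consumer
                    }
                }
            });

            consumer.start();
            producer.start();
            producer.join();   // wait for both threads to finish
            consumer.join();
        }
    }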
ENVIRONMENT: Java (JDK 1.6), J2EE, JSP, Servlets, Hibernate, JavaScript, JDBC, Oracle 10g, UML, Rational Rose, SOAP, WebLogic Server, JUnit, PL/SQL, CSS, HTML, XML, Eclipse
Confidential, New York, NY
Java Developer
Responsibilities:
- Actively participated in requirements gathering, analysis, design, and testing phases.
- Designed use case diagrams, class diagrams, and sequence diagrams as a part of Design Phase.
- Developed the entire application implementing MVC architecture, integrating JSF with the Hibernate and Spring frameworks.
- Implemented various J2EE Design patterns like Singleton, Service Locator, DAO, and SOA.
- Worked on AJAX to develop an interactive Web Application and JavaScript for Data Validations.
- Designed and developed web services using Apache Axis; wrote numerous session and message-driven beans for operation on JBoss and WebLogic.
- Developed the Enterprise Java Beans (Stateless Session beans) to handle different transactions such as online funds transfer, bill payments to the service providers.
- Worked with various types of controllers such as SimpleFormController, AbstractController, and the Controller interface.
- Implemented Service Oriented Architecture (SOA) using JMS for sending and receiving messages while creating web services.
- Developed XML documents and generated XSL files for Payment Transaction and Reserve Transaction systems.
- Developed, coded, tested, debugged, and deployed JSPs and Servlets for the input and output forms in web browsers.
- Performed database modifications using SQL, PL/SQL, stored procedures, triggers, and views in Oracle.
- Used the JUnit framework for unit testing of all the Java classes (a minimal test sketch appears at the end of this section).
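- A minimal sketch of the JUnit-style unit tests mentioned above is shown below; the class under test is a hypothetical example inlined to keep the sketch self-contained.

    // Illustrative JUnit 4 test for a simple validation rule.
    import static org.junit.Assert.assertFalse;
    import static org.junit.Assert.assertTrue;

    import org.junit.Test;

    public class PaymentValidatorTest {

        // Hypothetical class under test, inlined for the example.
        static class PaymentValidator {
            boolean isValidAmount(double amount) {
                return amount > 0 && amount <= 10000;
            }
        }

        @Test
        public void acceptsAmountWithinLimit() {
            assertTrue(new PaymentValidator().isValidAmount(250.00));
        }

        @Test
        public void rejectsZeroNegativeAndOversizedAmounts() {
            PaymentValidator validator = new PaymentValidator();
            assertFalse(validator.isValidAmount(0));
            assertFalse(validator.isValidAmount(-5));
            assertFalse(validator.isValidAmount(25000));
        }
    }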
ENVIRONMENT: J2EE, JDBC, Servlets, JSP, Struts, Hibernate, Web services, MVC, HTML, JavaScript, WebLogic, XML, JUnit, Oracle, WebSphere, Eclipse
Confidential
Java Developer
Responsibilities:
- Designed use cases for different scenarios.
- Involved in acquiring requirements from the clients.
- Developed functional code and met expected requirements.
- Wrote product technical documentation as necessary.
- Designed the presentation layer in JSP (dynamic content) and HTML (static pages).
- Designed business logic in EJBs and business facades.
- Used Resource Manager to schedule jobs on the UNIX server.
- Wrote numerous session and message-driven beans for operation on JBoss and WebLogic.
- Apache Tomcat Server was used to deploy the application.
- Involved in building the modules in a Linux environment with Ant scripts.
- Used MDBs (JMS) and MQSeries for account information exchange between the current and legacy systems (see the sketch at the end of this section).
- Attached an SMTP server to the system, which handles dynamic e-mail dispatches.
- Created connection pools and data sources.
- Involved in the enhancement of database tables and procedures.
- Deployed this application, which uses the J2EE architecture model and the Struts framework, first on WebLogic, and helped migrate it to the JBoss application server.
- Participated in code reviews and optimization of code.
- Followed Change Control Process by utilizing CVS Version Manager.
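- A simplified sketch of the kind of message-driven bean used for account information exchange is shown below; the destination name, message format, and handling logic are illustrative assumptions.

    // Illustrative message-driven bean that consumes account messages from a JMS queue.
    import javax.ejb.ActivationConfigProperty;
    import javax.ejb.MessageDriven;
    import javax.jms.JMSException;
    import javax.jms.Message;
    import javax.jms.MessageListener;
    import javax.jms.TextMessage;

    @MessageDriven(activationConfig = {
        @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue"),
        @ActivationConfigProperty(propertyName = "destination", propertyValue = "queue/AccountInfo")
    })
    public class AccountInfoListener implements MessageListener {

        public void onMessage(Message message) {
            try {
                if (message instanceof TextMessage) {
                    // In the real system this payload would be handed to the
                    // legacy-system integration layer.
                    String payload = ((TextMessage) message).getText();
                    System.out.println("Received account message: " + payload);
                }
            } catch (JMSException e) {
                throw new RuntimeException("Failed to read JMS message", e);
            }
        }
    }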
ENVIRONMENT: J2EE, JSP, HTML, Struts framework, EJB, JMS, WebLogic Server, JBoss Server, PL/SQL, CVS, MS PowerPoint, MS Outlook