
Java/Big Data Hadoop Engineer Resume


Canton, MI

PROFESSIONAL SUMMARY:

  • Hadoop Developer with around 8 years of experience in Information Technology, including 4+ years in the Hadoop ecosystem and 4 years in Java/J2EE.
  • Expertise in Hadoop ecosystem components HDFS, MapReduce, Hive, Pig, Sqoop, HBase, and Flume for data analytics.
  • Capable of processing large sets of structured, semi-structured, and unstructured data.
  • Experience in job workflow scheduling and monitoring tools like Oozie and ZooKeeper.
  • Expertise in writing MapReduce jobs in Java that process large sets of structured, semi-structured, and unstructured data and store the results in HDFS.
  • Experience in developing custom UDFs for datasets in Pig and Hive (see the Hive UDF sketch after this list).
  • Proficient in designing and querying NoSQL databases like HBase.
  • Experience in designing and developing tables in HBase and storing aggregated data from Hive tables.
  • Knowledge of integrating different ecosystem components, such as HBase with Hive and HBase with Pig.
  • Experience in streaming data using Apache Flume.
  • Good knowledge of Apache Spark and Spark SQL.
  • Skilled in migrating data from different databases to HDFS and Hive using Sqoop.
  • Deep knowledge of the core concepts of the MapReduce framework and the Hadoop ecosystem.
  • Knowledge of Confidential Manager and Ambari.
  • Analyzed large structured datasets using Hive's data warehousing infrastructure.
  • Extensive knowledge of creating managed and external tables in the Hive ecosystem.
  • Worked extensively on the design and development of business processes using Sqoop, Pig, Hive, and HBase.
  • Knowledge of the Spark framework for batch and real-time data processing.
  • Knowledge of the Scala programming language.
  • Good knowledge of the Software Development Life Cycle (SDLC) and Software Testing Life Cycle (STLC).
  • Extensively used shell scripting.
  • Excellent communication and interpersonal skills; a detail-oriented, analytical, responsible team player with the ability to coordinate in a team environment, a high degree of self-motivation, and the capacity to learn quickly.
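
A minimal sketch of the kind of custom Hive UDF referenced above, using the classic org.apache.hadoop.hive.ql.exec.UDF base class; the class name and the string-normalization use case are illustrative assumptions, not details from any specific project.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical UDF that trims and lower-cases a string column before analysis.
    public class NormalizeString extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            return new Text(input.toString().trim().toLowerCase());
        }
    }

Packaged into a JAR, a function like this would typically be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in queries.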

TECHNICAL SKILLS:

Big Data: Hadoop architecture, HDFS, MapReduce, HBase, Hive, Pig, Oozie, Sqoop, StreamSets, Spark, Spark Streaming, Kafka, Storm.

Java technologies: JDK, Hibernate, Struts, Servlets, JSP, JSTL, JDBC, Applets, Multi-Threading, Log4j.

Web Technologies: HTML, CSS, AJAX, JavaScript, AngularJS, jQuery, XML.

Web services: SOAP, REST, WSDL.

Web Servers: BEA WebLogic, Tomcat, WebSphere

SDLC Methodologies: Agile, Waterfall, Scrum, TDD, OOAD.

Modeling Tools: UML, VISIO.

Testing Tools: JUnit

IDEs and Tools: Eclipse, NetBeans, JBoss Developer Studio.

Operating Systems and other: Windows, Linux, UNIX, ClearCase, PuTTY, WinSCP, and FileZilla.

Databases: Oracle 11g/10g/9i, MySQL, MS-SQL Server, DB2.

Database Tools: SQL Developer, DbVisualizer, SSIS

WORK EXPERIENCE:

Confidential, Canton, MI

Java/Big Data Hadoop Engineer

Responsibilities:

  • Responsible for installing and configuring Hadoop MapReduce and HDFS; also developed various MapReduce jobs for data cleaning (a mapper sketch appears after this list).
  • Aggregated complex, high-volume data on the Apache Hadoop platform using the Spark execution engine.
  • Installed and configured Hive to create tables for the unstructured data in HDFS.
  • Gained expertise in major components of the Hadoop ecosystem, including Hive, Pig, HBase, HBase-Hive integration, Sqoop, and Flume.
  • Involved in loading data from the UNIX file system to HDFS.
  • Used Informatica for Hadoop ETL jobs.
  • Responsible for managing and scheduling jobs on the Hadoop cluster.
  • Responsible for importing and exporting data into HDFS and Hive using Sqoop.
  • Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
  • Experienced in managing Hadoop log files.
  • Worked on managing data coming from different sources.
  • Wrote HQL queries to create tables and loaded data from HDFS to give it structure.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Worked extensively with Hive to transform files from different analytical formats into plain-text (.txt) files so the data could be viewed for further analysis.
  • Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
  • Used Kafka as the messaging system.
  • Used Apache Storm for data streaming.
  • Wrote and modified stored procedures to load and modify data according to project requirements.
  • Used Confidential Manager to manage the Hadoop cluster.
  • Responsible for developing Pig Latin scripts to extract data from web server output files and load it into HDFS.
  • Extensively used Flume to collect log files from the web servers and integrate them into HDFS.
  • Responsible for implementing schedulers on the JobTracker so MapReduce jobs make effective use of the resources available in the cluster.
  • Continuously tuned the performance of Hive and Pig queries to make data processing and retrieval more efficient.
  • Supported MapReduce programs running on the cluster.
  • Created external tables in Hive and loaded data into these tables.
  • Hands-on experience in database performance tuning and data modeling.
  • Monitored cluster coordination using ZooKeeper.
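
A minimal sketch of the kind of data-cleaning MapReduce mapper mentioned above; the pipe delimiter, expected field count, and class name are illustrative assumptions rather than details from the actual project.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Hypothetical map-only cleaning step: keep well-formed, pipe-delimited records
    // that have the expected number of fields and silently drop everything else.
    public class RecordCleanMapper extends Mapper<LongWritable, Text, Text, NullWritable> {

        private static final int EXPECTED_FIELDS = 7;   // assumed record width

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString().trim();
            if (line.isEmpty()) {
                return;                                   // skip blank lines
            }
            String[] fields = line.split("\\|", -1);
            if (fields.length == EXPECTED_FIELDS) {
                context.write(new Text(line), NullWritable.get());
            }
            // malformed records are dropped; a Hadoop counter could track how many
        }
    }

A job like this is typically configured with zero reducers so the cleaned records are written straight back to HDFS.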

Environment: Hadoop, HDFS, MapReduce, Hortonworks, Hive, Java, Apache Storm, Apache Kafka, flat files, UNIX shell scripting, Spark, Oracle 11g/10g, PL/SQL.

Confidential, New York

Java/Hadoop Developer

Responsibilities:

  • Performed development and ETL design in Hadoop.
  • Developed a custom MapReduce InputFormat to read a specific data format.
  • Developed Hive queries and UDFs per requirements.
  • Used Kafka as the messaging system and Apache Storm as the streaming framework (a producer sketch follows this list).
  • Involved in extracting the customer's big data from various data sources into Hadoop HDFS, including data from mainframes, databases, and log data from servers.
  • Used Sqoop to efficiently transfer data between databases and HDFS, and used Flume to stream log data from servers.
  • Used Informatica for data transformation.
  • Developed MapReduce programs to cleanse the data in HDFS obtained from heterogeneous data sources and make it suitable for ingestion into the Hive schema for analysis.
  • Defined the Hive tables created per requirement as managed or external tables with appropriate static and dynamic partitions for efficiency.
  • Implemented partitioning and bucketing in Hive for better organization of the data.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive, and Sqoop, as well as system-specific jobs.
  • Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
  • Implemented automatic failover using ZooKeeper and the ZooKeeper Failover Controller.
  • Used Sqoop to transfer data from external sources to HDFS.
  • Designed the ETL flow for several Hadoop applications.
  • Designed and developed Oozie workflows, including integration with Pig.
  • Documented ETL best practices to be implemented with Hadoop.
  • Monitored and debugged Hadoop jobs and applications running in production.
  • Worked on the Hadoop Confidential upgrade from CDH3 to CDH4.x.
  • Worked on providing user support and application support on the Hadoop infrastructure.
  • Worked on evaluating and comparing different tools for test data management with Hadoop.
  • Helped the testing team with Hadoop application testing.
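
A minimal sketch of publishing server log events to Kafka for downstream Storm processing, as described above; it uses the newer Kafka Java producer API, and the broker address, topic name, and sample record are placeholders rather than project specifics.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    // Hypothetical producer that publishes one log line per Kafka record,
    // keyed by host so events from the same server stay in order.
    public class LogEventProducer {

        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");   // placeholder broker
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");

            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                String logLine = "host1 GET /index.html 200";  // placeholder event
                producer.send(new ProducerRecord<>("server-logs", "host1", logLine));
            }
        }
    }

On the consuming side, a Storm Kafka spout would read the same topic and feed bolts that parse and aggregate the events.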

Environment: Hadoop v1.2.1, HDFS, MapReduce, Hive, Kafka, Apache Storm, Sqoop, Pig, DB2, Oracle, XML, CDH4.x

Confidential, New York

Hadoop Developer

Responsibilities:

  • Developed Java MapReduce jobs for aggregation and for calculating the interest matrix for users (a reducer sketch appears after this list).
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Experienced in managing and reviewing application log files.
  • Ingested application logs into HDFS and processed them using MapReduce jobs.
  • Created and maintained the Hive warehouse for Hive analysis.
  • Generated test cases for the new MapReduce jobs.
  • Ran various Hive queries on the data dumps and generated aggregated datasets for downstream systems for further analysis.
  • Developed dynamically partitioned Hive tables to store data partitioned by date and workflow ID.
  • Used Apache Sqoop to load incremental user data into HDFS on a daily basis.
  • Ran clustering and user-recommendation agents on the weblogs and profiles of users to generate the interest matrix.
  • Installed and configured Hive and wrote Hive UDFs in Java and Python.
  • Prepared data for consumption by formatting it for upload to the UDB system.
  • Led and programmed the recommendation logic for various clustering and classification algorithms in Java.
  • Involved in migrating Hadoop jobs into higher environments such as SIT, UAT, and production.
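
A minimal sketch of the aggregation side of an interest-matrix job like the one described above; the composite "userId:interestCategory" key format is an illustrative assumption about how the mapper might emit counts.

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    // Hypothetical reducer: keys look like "userId:interestCategory" and values are
    // per-event counts; the reducer sums them into a single interest-matrix cell.
    public class InterestMatrixReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int total = 0;
            for (IntWritable count : values) {
                total += count.get();
            }
            context.write(key, new IntWritable(total));
        }
    }

The corresponding mapper would emit one (userId:interestCategory, 1) pair for each parsed weblog event.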

Environment: Hadoop, MapReduce, HDFS, Hive, Java, SQL, Confidential Manager, Scala, Cassandra, Pig, Sqoop, Oozie, ZooKeeper, PL/SQL, MySQL, HBase

Confidential, San Jose, CA

Hadoop Developer

Responsibilities:

  • Involved in the installation and configuration of the Apache Hadoop, Hive, Pig, and HBase environments.
  • Developed MapReduce jobs in Java for data cleansing and preprocessing, and implemented Combiners and Partitioners to optimize the MapReduce algorithms and improve application performance (a partitioner sketch appears after this list).
  • Evaluated business requirements and prepared detailed specifications that follow the project guidelines required to develop written programs.
  • Devised procedures that solve complex business problems with due consideration for hardware/software capacity and limitations, operating times, and desired results.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
  • Designed and developed Hive internal and external tables, and developed Hive queries and UDFs to process data and generate data cubes for visualization.
  • Used Pig and Hive extensively for ad hoc querying and for creating scripts for analysts.
  • Used the Grunt shell to execute Pig scripts.
  • Loaded large amounts of data into HBase using Sqoop.
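
A minimal sketch of the kind of custom Partitioner mentioned above; partitioning by the first character of a Text key is an illustrative assumption, not the actual partitioning rule used on the project.

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Partitioner;

    // Hypothetical partitioner that routes records by the first character of the key,
    // so keys starting with the same letter are processed by the same reducer.
    public class FirstLetterPartitioner extends Partitioner<Text, IntWritable> {

        @Override
        public int getPartition(Text key, IntWritable value, int numPartitions) {
            if (key.getLength() == 0) {
                return 0;
            }
            char first = Character.toLowerCase(key.toString().charAt(0));
            return first % numPartitions;
        }
    }

In the driver, such a class is registered with job.setPartitionerClass(FirstLetterPartitioner.class); when the aggregation is associative, the reducer class can also be set as the combiner via job.setCombinerClass(...).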

Environment: Java, Hadoop, HDFS, Hive, Pig, MapReduce, HBase, Sqoop.

Confidential, San Francisco, CA

Java Developer

Responsibilities:

  • Involved in Requirements gathering, Requirement analysis, Design, Development, Integration and Deployment.
  • Involved in Tax module and Order Placement / Order Processing module.
  • Responsible for the design and development of the application framework
  • Designed and developed UIs using JSP, following the MVC architecture.
  • Developed the application using the Struts framework: views were programmed as JSP pages with the Struts tag library, the model was a combination of EJBs and Java classes, and the controllers were servlets (a sample action class appears after this list).
  • Used EJB as middleware in designing and developing a three-tier distributed application.
  • Used the Java Message Service (JMS) API to allow application components to create, send, receive, and read messages.
  • Used JUnit for unit testing of the system and Log4J for logging.
  • Created and maintained data using Oracle database and used JDBC for database connectivity.
  • Created and implemented Oracle stored procedures and triggers.
  • Installed WebLogic Server for handling HTTP requests and responses; requests and responses from the client were controlled using session tracking in JSP.
  • Worked on the front-end technologies like HTML, JavaScript, CSS and JSP pages using JSTL tags.
  • Reported daily about the team progress to the Project Manager and Team Lead.
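
A minimal sketch of a Struts 1.x action for the order-placement flow described above; the class name, request parameter, and forward name are illustrative assumptions, and the real business logic would be delegated to EJBs or service classes.

    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import org.apache.struts.action.Action;
    import org.apache.struts.action.ActionForm;
    import org.apache.struts.action.ActionForward;
    import org.apache.struts.action.ActionMapping;

    // Hypothetical Struts 1.x controller action: reads request data, delegates to the
    // business tier, and forwards to a JSP view configured in struts-config.xml.
    public class PlaceOrderAction extends Action {

        public ActionForward execute(ActionMapping mapping, ActionForm form,
                                     HttpServletRequest request, HttpServletResponse response)
                throws Exception {
            String orderId = request.getParameter("orderId");   // placeholder parameter
            // order processing and tax calculation would be delegated to EJBs here
            request.setAttribute("orderId", orderId);
            return mapping.findForward("success");
        }
    }

The "success" forward maps to a JSP page in struts-config.xml, keeping the view fully separated from the controller.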

Environment: Core Java, J2EE 1.3, JSP 1.2, Servlets 2.3, EJB 2.0, Struts 1.1, JNDI 1.2, JDBC 2.1, Oracle 8i, UML, DAO, JMS, XML, WebLogic 7.0, MVC Design Pattern, Eclipse 2.1, Log4j and JUnit.

Confidential, Montvale, NJ

Java Developer

Responsibilities:

  • Worked with the business community to define business requirements and analyze the possible technical solutions.
  • Requirement gathering, Business Process flow, Business Process Modeling and Business Analysis.
  • Extensively used UML and Rational Rose for designing to develop various use cases, class diagrams and sequence diagrams.
  • Used JavaScript for client-side validations, and AJAX to create interactive front-end GUI.
  • Developed the application using the Spring MVC architecture.
  • Developed custom tags for a table utility component.
  • Used various Java and J2EE APIs including JDBC, XML, Servlets, and JSP.
  • Designed and implemented the UI using Java, HTML, JSP, and JavaScript.
  • Designed and developed web pages using Servlets and JSPs, and used XML/XSL/XSLT for the repository.
  • Involved in Java application testing and maintenance in development and production.
  • Involved in developing the customer form data tables and maintaining customer support and customer data in MySQL database tables (a JDBC sketch appears after this list).
  • Involved in mentoring specific projects in application of the new SDLC based on the Agile Unified Process, especially from the project management, requirements and architecture perspectives.
  • Designed and developed Views, Model and Controller components implementing MVC Framework.
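
A minimal sketch of JDBC access to the customer tables mentioned above; the connection URL, credentials, table, and column names are placeholders, and try-with-resources is used here for brevity even though code of that era would have closed resources in finally blocks.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    // Hypothetical DAO method that looks up a customer's email in MySQL.
    public class CustomerDao {

        private static final String URL = "jdbc:mysql://localhost:3306/support";   // placeholder

        public String findCustomerEmail(int customerId) throws SQLException {
            String sql = "SELECT email FROM customer WHERE customer_id = ?";
            try (Connection conn = DriverManager.getConnection(URL, "appuser", "secret");
                 PreparedStatement stmt = conn.prepareStatement(sql)) {
                stmt.setInt(1, customerId);
                try (ResultSet rs = stmt.executeQuery()) {
                    return rs.next() ? rs.getString("email") : null;
                }
            }
        }
    }

Using a PreparedStatement with a bound parameter also keeps the customer-facing form data safe from SQL injection.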

Environment: JDK 1.3, J2EE, JDBC, Servlets, JSP, XML, XSL, CSS, HTML, DHTML, JavaScript, UML, Eclipse 3.0, Tomcat 4.1, MySQL.

Confidential

Java Developer

Responsibilities:

  • Implemented the project according to the Software Development Life Cycle (SDLC)
  • Developed the web layer using the Spring MVC framework.
  • Implemented JDBC for mapping an object-oriented domain model to a traditional relational database.
  • Created stored procedures to manipulate the database and apply business logic according to the user's specifications (a CallableStatement sketch appears after this list).
  • Involved in analyzing, designing, implementing and testing of the project.
  • Developed UML diagrams like Use cases and Sequence diagrams as per requirement.
  • Developed generic classes encapsulating frequently used functionality so that it could be reused.
  • Implemented an exception management mechanism using exception handling application blocks.
  • Designed and developed user interfaces using JSP, JavaScript, HTML, and the Struts framework.
  • Involved in Database design and developing SQL Queries, stored procedures on MySQL.
  • Developed Action Forms and Action Classes in Struts framework.
  • Programmed session and entity EJBs to handle user information tracking and profile-based transactions.
  • Involved in writing JUnit test cases, unit and integration testing of the application.
  • Developed user and technical documentation.
  • Used CVS for maintaining the Source Code.
  • Logging was done through Log4j.
  • Monitored site failures.
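
A minimal sketch of invoking a stored procedure of the kind described above through JDBC; the procedure name, parameters, and Oracle connection details are illustrative assumptions.

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import java.sql.Types;

    // Hypothetical call to a stored procedure that applies business logic on the
    // database side and returns a status message through an OUT parameter.
    public class OrderStatusUpdater {

        public String updateStatus(int orderId, String newStatus) throws SQLException {
            try (Connection conn = DriverManager.getConnection(
                         "jdbc:oracle:thin:@localhost:1521:orcl", "appuser", "secret");
                 CallableStatement call = conn.prepareCall("{call update_order_status(?, ?, ?)}")) {
                call.setInt(1, orderId);
                call.setString(2, newStatus);
                call.registerOutParameter(3, Types.VARCHAR);
                call.execute();
                return call.getString(3);   // status message from the procedure
            }
        }
    }

Keeping the status update inside a stored procedure lets the database enforce the business rules in one place regardless of which application calls it.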

Environment: Core Java, J2EE 1.3, JSP 1.2, Servlets 2.3, EJB 2.0, Struts 1.1, JNDI 1.2, JDBC 2.1, Oracle 8i, UML, DAO, JMS, XML, WebLogic 7.0, MVC Design Pattern, Eclipse 2.1, Log4j and JUnit.
