Sr Hadoop Developer Resume
Cary, NC
SUMMARY:
- 7+ years of experience in Analysis, Architectural Design, Prototyping, Development, Integration and Testing of applications using Java/J2EE Technologies
- 3+ years of experience with the Hadoop Ecosystem, including HDFS, MapReduce, Hive, Pig, Storm, Kafka, YARN, HBase, Oozie, ZooKeeper, Flume and Sqoop based Big Data Platforms
- Expertise in design and implementation of Big Data solutions in Banking, Retail and E-commerce domains
- Experienced with NoSQL databases like HBase, Cassandra and MongoDB
- Comprehensive experience in building Web-based applications using J2EE frameworks like Spring, Hibernate, EJB, Struts and JMS
- Excellent ability to use analytical tools to mine data and evaluate the underlying patterns
- Assisted in Cluster maintenance, Cluster Monitoring and Troubleshooting, and Managing and Reviewing data backups and log files
MapReduce
- Hands-on experience in developing MapReduce programs using Apache Hadoop for analyzing Big Data
- Expertise in optimizing traffic across the network using Combiners, joining multiple schema datasets using Joins and organizing data using Partitioners
- Experience in writing Custom Counters for analysing the data, and testing using the MRUnit framework (a counter sketch follows this list)
- Experienced in writing complex MapReduce programs that work with different file formats like Text, Sequence, XML and Avro
- Expertise in composing MapReduce Pipelines with many user-defined functions using Apache Crunch
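A minimal sketch of the custom-counter pattern mentioned above; the ClaimsMapper class name, the comma-delimited field layout and the counter names are illustrative assumptions, not actual project code:

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical mapper that tallies record quality in custom counters
// while emitting (accountType, 1) pairs for downstream aggregation.
public class ClaimsMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    // Custom counter group, visible in the job's counter report
    public enum RecordQuality { VALID, MALFORMED }

    private static final IntWritable ONE = new IntWritable(1);
    private final Text accountType = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split(",");
        if (fields.length < 2 || fields[1].isEmpty()) {
            context.getCounter(RecordQuality.MALFORMED).increment(1);
            return; // drop the bad record, but account for it in the counter
        }
        context.getCounter(RecordQuality.VALID).increment(1);
        accountType.set(fields[1]);
        context.write(accountType, ONE);
    }
}
```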
PIG
- Expertise in writing ad-hoc MapReduce programs using Pig Scripts
- Used Pig as an ETL tool to do transformations, event joins, filtering and some pre-aggregations
- Implemented business logic by writing Pig Latin UDFs in Java and used various UDFs from Piggybank and other sources (a UDF sketch follows this list)
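A minimal example of a Pig UDF written in Java, as described above; the class name and the upper-casing logic are illustrative:

```java
import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical Pig UDF that upper-cases its first chararray argument.
public class ToUpper extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null; // Pig treats a null return as a null field
        }
        return input.get(0).toString().toUpperCase();
    }
}
```

In Pig Latin, such a UDF is made available with REGISTER and invoked by its fully qualified class name (or an alias created with DEFINE).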
HIVE
- Expertise in Hive Query Language (HiveQL), Hive Security and debugging Hive issues
- Responsible for performing extensive data validation using HIVE Dynamic Partitioning and Bucketing
- Experience in developing custom UDFs for Pig and Hive to incorporate methods and functionality of Python/Java into Pig Latin and HiveQL (a Hive UDF sketch follows this list)
- Worked on different set of tables like External Tables and Managed Tables
- Experienced in working with different Hive SerDes to handle file formats like Avro and XML
- Analyzed the data by performing Hive queries and used Hive UDFs for complex querying
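A minimal Hive UDF sketch matching the custom-UDF work described above; the date layout and the class name are assumptions for illustration:

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical Hive UDF that converts "MM/DD/YYYY" strings to "YYYY-MM-DD".
public final class NormalizeDate extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        String[] parts = input.toString().split("/");
        if (parts.length != 3) {
            return null; // pass malformed dates through as null rather than guessing
        }
        return new Text(parts[2] + "-" + parts[0] + "-" + parts[1]);
    }
}
```

Such a UDF is registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before it can be used in HiveQL.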
NoSQL
- Expert database engineer; NoSQL and relational data modeling
- Responsible for building scalable distributed data solutions using DataStax Cassandra
- Expertise in HBase Cluster Setup, Configurations, HBase Implementation and HBase Client API
- Worked on importing data into HBase using the HBase Shell and the HBase Client API (a client sketch follows this list)
- Expertise in performing large-scale web crawling with Apache Nutch using a Hadoop/HBase cluster
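A minimal sketch of the classic (pre-1.0) HBase Client API usage referred to above; the table name, column family and values are illustrative:

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical round trip: write a cell, then read it back.
public class HBaseClientExample {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "customers"); // table name is illustrative
        try {
            Put put = new Put(Bytes.toBytes("row-001"));
            put.add(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("Alice"));
            table.put(put);

            Get get = new Get(Bytes.toBytes("row-001"));
            Result result = table.get(get);
            System.out.println(Bytes.toString(
                    result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"))));
        } finally {
            table.close();
        }
    }
}
```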
Java/J2EE
- Expertise in several J2EE technologies like JDBC, Servlets, JSP, Struts, Spring, Hibernate, JPA, JSF, EJB, JMS, JAX-WS, SOAP, jQuery, AJAX, XML, JSON, HTML5/HTML, XHTML, Maven, and Ant
- Expert knowledge of J2EE Design Patterns like MVC Architecture, Front Controller, Session Facade, Business Delegate and Data Access Object for building J2EE Applications
- Thorough knowledge of JAX-WS for accessing external Web Services, getting the XML response and converting it back to Java objects (a client sketch follows this list)
- Experience in using Jenkins for Continuous Integration and Sonar jobs for Java code quality
- Extensive experience in developing Internet and Intranet applications using J2EE, Servlets, JSP, JBoss, WebLogic, Tomcat, and the Struts framework
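A minimal JAX-WS client sketch for the pattern described above; the WSDL URL, namespace, service name and the QuotePort interface are hypothetical stand-ins for a real service contract:

```java
import java.net.URL;
import javax.jws.WebMethod;
import javax.jws.WebService;
import javax.xml.namespace.QName;
import javax.xml.ws.Service;

// Hypothetical service endpoint interface, normally generated from the WSDL with wsimport.
@WebService
interface QuotePort {
    @WebMethod
    String getQuote(String symbol);
}

public class QuoteClient {
    public static void main(String[] args) throws Exception {
        // All names and URLs below are illustrative.
        URL wsdl = new URL("http://example.com/quote?wsdl");
        QName serviceName = new QName("http://example.com/quote", "QuoteService");
        Service service = Service.create(wsdl, serviceName);
        QuotePort port = service.getPort(QuotePort.class);
        // JAX-WS unmarshals the XML response back into Java objects
        System.out.println(port.getQuote("IBM"));
    }
}
```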
SQL, Script & Oracle Database
- Extensive experience with DB2 and Oracle 9i/10g/11g databases (Database Design and SQL Queries)
- Good experience in SQL, PL/SQL, Perl Scripting, Shell Scripting, Partitioning, Data modeling, OLAP, Logical and Physical Database Design, Backup and Recovery procedures
- Experienced with the build tools Maven and Ant and continuous integration tools like Jenkins
- Developed unit test cases using the JUnit, EasyMock and MRUnit testing frameworks
- Experienced in Agile SCRUM, RUP (Rational Unified Process) and TDD (Test Driven Development) software development methodologies
TECHNICAL SKILLS:
Hadoop/Big Data/NoSQL Technologies: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, Avro, Hadoop Streaming, Storm, Kafka, YARN, Crunch, ZooKeeper, HBase, Cassandra
Programming Languages: Java (JDK 5/JDK 6), Python, C, SQL, PL/SQL, Shell Script
IDE Tools: Eclipse, Rational Team Concert, NetBeans
Frameworks: Hibernate, Spring, Struts, JMS, EJB, JUnit, MRUnit, JAXB
Web Technologies: HTML5, CSS3, JavaScript, jQuery, AJAX, Servlets, JSP, JSON, XML, XHTML, REST Web Services
Application Servers: JBoss, Tomcat, WebLogic, WebSphere
Databases: Oracle 11g/10g/9i, MySQL, DB2, Derby, MS-SQL Server
Operating Systems: UNIX, Windows, LINUX
Build Tools: Jenkins, Maven, ANT
Reporting Tools: Jasper Reports, iReport
PROFESSIONAL EXPERIENCE:
Confidential, Cary, NC
Sr Hadoop Developer
Responsibilities:
- Developed simple and complex MapReduce programs in Java for Data Analysis on different data formats
- Developed MapReduce programs that filter out bad and unnecessary claim records and find unique records based on account type
- Processed semi-structured and unstructured data using MapReduce programs
- Implemented daily cron jobs that automate the parallel tasks of loading data into HDFS and pre-processing it with Pig, using Oozie coordinator jobs
- Implemented custom DataTypes, InputFormat, RecordReader, OutputFormat and RecordWriter for MapReduce computations
- Successfully migrated a legacy application to a Big Data application using Hive/Pig/HBase at the production level
- Transformed date-related data into an application-compatible format by developing Apache Pig UDFs
- Developed a MapReduce pipeline for feature extraction and tested the modules using MRUnit
- Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms
- Created Hive tables, loaded them with data and wrote Hive queries that run internally as MapReduce jobs
- Responsible for performing extensive data validation using Hive
- Implemented Partitioning, Dynamic Partitions and Bucketing in Hive for efficient data access
- Worked on different set of tables like External Tables and Managed Tables
- Used Oozie workflow engine to run multiple Hive and Pig jobs
- Involved in installing and configuring Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster
- Involved in designing and developing non-trivial ETL processes within Hadoop using tools like Pig, Sqoop, Flume, and Oozie
- Used DML statements to perform different operations on Hive Tables
- Developed Hive queries for creating foundation tables from stage data
- Used Pig as ETL tool to do transformations, event joins, filter and some pre-aggregations
- Analyzed the data by performing Hive queries and running Pig scripts to study customer behavior
- Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and other sources
- Worked with the Apache Crunch library to write, test and run Hadoop MapReduce pipeline jobs (a pipeline sketch follows this list)
- Involved in joining and data aggregation using Apache Crunch
- Worked with Sqoop to export analyzed data from HDFS environment into RDBMS for report generation and visualization purpose
- Queried and analyzed data from DataStax Cassandra for quick searching, sorting and grouping
- Developed Mapping document for reporting tools
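A minimal Apache Crunch pipeline of the kind referenced above, essentially the canonical word count; the input and output paths are taken from the command line:

```java
import org.apache.crunch.DoFn;
import org.apache.crunch.Emitter;
import org.apache.crunch.PCollection;
import org.apache.crunch.PTable;
import org.apache.crunch.Pipeline;
import org.apache.crunch.impl.mr.MRPipeline;
import org.apache.crunch.types.writable.Writables;

// Hypothetical Crunch pipeline: tokenize lines, then count each word.
public class CrunchWordCount {
    public static void main(String[] args) {
        Pipeline pipeline = new MRPipeline(CrunchWordCount.class);
        PCollection<String> lines = pipeline.readTextFile(args[0]);

        PCollection<String> words = lines.parallelDo(new DoFn<String, String>() {
            @Override
            public void process(String line, Emitter<String> emitter) {
                for (String word : line.split("\\s+")) {
                    emitter.emit(word);
                }
            }
        }, Writables.strings());

        PTable<String, Long> counts = words.count(); // runs as MapReduce jobs
        pipeline.writeTextFile(counts, args[1]);
        pipeline.done();
    }
}
```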
Environment: Apache Hadoop, HDFS, MapReduce, Apache Crunch, Java (JDK 1.6), MySQL, DbVisualizer, Linux, Sqoop, Apache Hive, Apache Pig
Confidential, Northbrook, IL
Hadoop Developer
Responsibilities:
- Installed, configured, and maintained Apache Hadoop clusters for application development, along with major components of the Hadoop Ecosystem: Hive, Pig, HBase, Sqoop, Flume, Oozie and ZooKeeper
- Implemented a six-node CDH4 Hadoop cluster on CentOS
- Imported and exported data into HDFS and Hive from different RDBMSs using Sqoop
- Experienced in defining job flows to run multiple MapReduce and Pig jobs using Oozie
- Imported log files into HDFS using Flume and loaded them into Hive tables to query the data
- Used HBase-Hive integration and wrote multiple Hive UDFs for complex queries
- Involved in writing APIs to read HBase tables, cleanse data and write to another HBase table
- Created multiple Hive tables, implemented Partitioning, Dynamic Partitioning and Buckets in Hive for efficient data access
- Responsible for architecting Hadoop clusters with CDH4 on CentOS, managing with Cloudera Manager
- Wrote multiple MapReduce programs in Java for data extraction, transformation and aggregation from multiple file formats including XML, JSON, CSV and other compressed file formats
- Experience working with Apache SOLR for indexing and querying
- Knowledge of ZooKeeper internals
- Experienced in running batch processes using Pig Scripts and developed Pig UDFs for data manipulation according to Business Requirements
- Experienced in writing programs using HBase Client API
- Involved in loading data into HBase using HBase Shell, HBase Client API, Pig and Sqoop
- Experienced in design, development, tuning and maintenance of NoSQL database
- Wrote MapReduce programs in Python using the Hadoop Streaming API
- Developed unit test cases for Hadoop MapReduce jobs with MRUnit (a test sketch follows this list)
- Excellent experience in ETL analysis, design, development, testing and implementation of ETL processes, including performance tuning and database query optimization
- Experience in using the Pentaho Data Integration tool for data integration, OLAP analysis and ETL processes
- Experience integrating R with Hadoop using RHadoop for statistical analysis and predictive modelling
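A minimal MRUnit test sketch matching the unit-testing bullet above; it assumes a mapper like the hypothetical ClaimsMapper sketched in the summary, with its comma-delimited input layout:

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Before;
import org.junit.Test;

// Hypothetical MRUnit test: feed the mapper one record, assert its output pair.
public class ClaimsMapperTest {

    private MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;

    @Before
    public void setUp() {
        mapDriver = MapDriver.newMapDriver(new ClaimsMapper());
    }

    @Test
    public void emitsAccountTypeForValidRecord() throws IOException {
        mapDriver.withInput(new LongWritable(0), new Text("C1001,RETAIL"))
                 .withOutput(new Text("RETAIL"), new IntWritable(1))
                 .runTest();
    }
}
```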
Environment: Apache Hadoop 1.0, Hive, Pig, HBase, Sqoop, Flume, Java, Linux, MySQL Server 5.155, MS SQL Server 2012, SQL, PL/SQL, SQL Server Data Tools, SQL Server Business Intelligence Development Studio (SSAS, SSIS, SSRS), R
Confidential, Woodcliff Lake, NJ
Hadoop Developer
Responsibilities:
- An initiative from Citibank to move its disconnected legacy billing and monitoring systems to a consolidated platform
- Consolidated customer data from Lending, Insurance, Trading and Billing systems into a warehouse, and subsequently a mart, for business intelligence reporting
- Provided improved revenue capture through leakage elimination, more accurate risk scoring of customer portfolios, and better exposure analysis, in order to offer each customer better products and advice
- Experience working with JIRA for project management, Git for source code management, Jenkins for continuous integration and Crucible for code reviews
- Transferred and exported data into HDFS and Hive from MySQL and DB2 using Sqoop
- Implemented MapReduce programs to analyze large datasets in the warehouse for business intelligence purposes
- Used default MapReduce Input and Output Formats
- Developed HQL queries to implement select, insert and update operations on the database by creating HQL named queries
- Identified customer potential from usage patterns using MapReduce programs (an aggregation sketch follows this list)
- Used Oozie to schedule Hadoop jobs and to call Sqoop from within existing workflows
- Automated regular imports of data into Hive partitions using Sqoop, orchestrated with Apache Oozie
- Clustered customers into categories and provided offers based on those categories using Apache Hive
- Experienced in managing and reviewing Hadoop log files
- Loaded and transformed large sets of data
- Performed grouping, aggregation and sorting using Pig and Hive, which are higher-level abstractions of MapReduce
- Conducted data extraction, including analyzing, reviewing and modeling based on requirements, using higher-level tools such as Hive and Pig
- Supported MapReduce programs running on the cluster
- Involved in creating Hive tables, loading with data and writing Hive queries
- Created data models for customer data using the Cassandra Query Language
- Ran many performance tests using the cassandra-stress tool in order to measure and improve the read and write performance of the cluster
- Queried and analyzed data from DataStax Cassandra for quick searching, sorting and grouping
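A minimal reducer sketch for the usage-based customer analysis described above; the key/value layout (customer id mapped to event counts) is an assumption for illustration:

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Hypothetical reducer that totals usage events per customer id.
public class UsageSumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {

    private final LongWritable total = new LongWritable();

    @Override
    protected void reduce(Text customerId, Iterable<LongWritable> counts, Context context)
            throws IOException, InterruptedException {
        long sum = 0;
        for (LongWritable count : counts) {
            sum += count.get();
        }
        total.set(sum);
        context.write(customerId, total); // one total per customer
    }
}
```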
Environment: Apache Hadoop (Cloudera), Java (JDK 1.6), Teradata, Red Hat Linux, Sqoop, Hive, DbVisualizer, Oozie
Confidential, San Antonio, TX
J2EE Software Developer
Responsibilities:
- Application was developed using the Struts MVC architecture
- Developed action and form classes based on Struts framework to handle the pages
- Developed a web-based reporting for credit monitoring system with HTML5, XHTML, JSTL, custom tags and Tiles using Struts framework
- Developed Servlets and JSPs based on MVC pattern using Struts framework and Spring Framework
- Developed web-based customer management software using Facelets, Icefaces and JSF
- Implemented Ajax frameworks and jQuery tools, for example autocompleters, tab modules, calendars and floating windows
- Configured the struts-config file for form-beans, global forwards, error forwards and action forwards
- Designed and implemented the Report Module (using the JasperReports framework)
- Created several JSPs and populated them with data from the database
- Developed Message-Driven Beans in collaboration with the Java Message Service (JMS) (a bean sketch follows this list)
- Developed Web Services using Apache Axis2 to retrieve data from legacy systems
- Developed Servlets, Action classes, Action Form classes and configured the struts-config.xml file
- Used XML parsing and binding APIs such as JAXP and JAXB in the web service's request/response data marshalling and unmarshalling
- Developed UI components for email and link sharing of documents and files for a Content Management System using Backbone.js and jQuery
- Planned and implemented various SQL queries, stored procedures, and triggers
- Used Hibernate to access the MySQL database and implemented connection pooling
- Developed JavaScript components using the Ext JS framework, such as Grid and Tree Panel, with client reports customized according to user requirements
- Performed building and deployment of WAR and JAR files on test, stage, and production systems on the Apache Tomcat application server
- Used ANT for the build process
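A minimal EJB 3 message-driven bean sketch of the kind described above; the queue name is illustrative and the destination property follows the JBoss-era convention:

```java
import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.TextMessage;

// Hypothetical MDB that consumes order messages from a JMS queue.
@MessageDriven(activationConfig = {
    @ActivationConfigProperty(propertyName = "destinationType",
                              propertyValue = "javax.jms.Queue"),
    @ActivationConfigProperty(propertyName = "destination",
                              propertyValue = "queue/orders")
})
public class OrderMessageBean implements MessageListener {
    @Override
    public void onMessage(Message message) {
        try {
            if (message instanceof TextMessage) {
                String body = ((TextMessage) message).getText();
                // process the incoming payload here
                System.out.println("Received order: " + body);
            }
        } catch (JMSException e) {
            throw new RuntimeException(e);
        }
    }
}
```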
Environment: J2EE, Java 1.4.2, Servlets, JSP, JDBC, EJB 3, JMS, jQuery, Backbone.js, HTML5, JSTL, Icefaces, XML, Spring, Struts, Hibernate, Web Services, Apache Tomcat Server, JSF, Ext JS, JAXB, JasperReports, JUnit, SOAP, SOAPUI, JavaScript, UML, Apache Axis 2, ANT, SVN, MySQL
Confidential
Java Developer
Responsibilities:
- Worked on requirement analysis; gathered all possible business requirements from end users and business analysts
- Involved in creating UML diagrams like Class, Activity, and Sequence diagrams using IBM Rational Rose modeling tools
- Worked extensively with core Java code using interfaces and multi-threading techniques
- Involved in production support and documented the application to provide knowledge transfer to the user
- Used Log4j for the logging mechanism and developed wrapper classes to configure the logs (a wrapper sketch follows this list)
- Used JUnit test cases for testing the application modules
- Developed and configured the Java beans using the Spring MVC framework
- Developed the application using Rational Team Concert and worked in an Agile environment
- Developed SQL stored procedures and prepared statements for updating and accessing data from the database
- Conducted SQL performance analysis on Oracle 9i database tables and improved performance through SQL tuning
- Also used C++ to create some libraries used in the application
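A minimal sketch of a Log4j 1.x wrapper class of the kind described above; the class name and the exposed methods are illustrative:

```java
import org.apache.log4j.Logger;

// Hypothetical thin wrapper that centralizes logger creation and common log calls.
public final class AppLogger {

    private final Logger logger;

    private AppLogger(Class<?> clazz) {
        this.logger = Logger.getLogger(clazz);
    }

    public static AppLogger getLogger(Class<?> clazz) {
        return new AppLogger(clazz);
    }

    public void info(String message) {
        logger.info(message);
    }

    public void error(String message, Throwable t) {
        logger.error(message, t);
    }
}
```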
Environment: C++, Java, JDBC, Servlets, JSP, Struts, Eclipse, Oracle 9i, Apache Tomcat, CVS, JavaScript, Log4j