
Java Developer Resume


Raleigh, NC

SUMMARY

  • 9+ years of experience across Hadoop, RDBMS, and ETL, including extensive experience in HDFS, Pig, Hive, MapReduce, Java, Sqoop, Python, Redis, KDB+/Q, Teradata, Sybase, and Ab Initio.
  • 4+ years of experience in Big Data analytics, with hands-on experience writing MapReduce jobs on the Hadoop ecosystem, including Hive and Pig.
  • In-depth understanding of Hadoop architecture and its components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce.
  • Hands-on experience writing Pig Latin scripts, working with the Grunt shell, and scheduling jobs with Oozie.
  • Experience in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
  • Good experience with the SDLC (Software Development Life Cycle).
  • Experience in installing and maintaining Cassandra by configuring the cassandra.yaml file as per requirements, and performed reads and writes using Java JDBC connectivity.
  • Hands-on experience with SequenceFiles, RCFiles, combiners, counters, dynamic partitioning, and bucketing for best practices and performance improvement.
  • Extended Hive and Pig core functionality with custom User Defined Functions (UDFs), User Defined Table-Generating Functions (UDTFs), and User Defined Aggregating Functions (UDAFs); a minimal UDF sketch follows this list.
  • Experience in developing Kafka consumers and producers by extending the low-level and high-level consumer and producer APIs.
  • Involved in upgrading existing MongoDB instances from version 2.4 to version 2.6 by upgrading the security roles and implementing newer features.
  • Expertise in writing Spark Streaming applications using Scala higher order functions.
  • Expertise in writing iterative algorithms using Spark MLlib for machine learning applications in the Spark shell (Scala REPL).
  • Experience in configuring various Storm topologies to ingest and process data on the fly from multiple sources and aggregate it into a central Hadoop repository.
  • Expertise in commissioning and decommissioning nodes in clusters, backup configuration, and recovery from a NameNode failure.
  • Assisted in Cluster maintenance, Cluster Monitoring and Troubleshooting, Managing and Reviewing data backups and log files.
  • Worked on disaster recovery for Hadoop clusters.
  • Experience in database design, entity relationships, database analysis, SQL programming, PL/SQL stored procedures, packages, and triggers in Oracle and SQL Server on Windows and Linux.
  • Good knowledge in tuning the performance of SQL queries and ETL process.
  • Experienced in working with tools such as TOAD, SQL Server Management Studio, and SQL*Plus for development and customization.
  • Excellent working knowledge of HBase and data pre-processing using Flume.
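As an illustration of the custom UDF work noted above, the following is a minimal sketch of a Hive UDF in Java that normalizes a key column so rows can be matched or de-duplicated on it. The package and class names are hypothetical, and the exact normalization would depend on the data.

```java
package com.example.hive.udf; // hypothetical package

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

/**
 * Minimal Hive UDF sketch: trims and lower-cases a column value so that
 * rows can be matched or de-duplicated on a normalized key.
 */
public final class NormalizeKey extends UDF {

    public Text evaluate(final Text input) {
        if (input == null) {
            return null; // pass NULLs through unchanged
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}
```

Packaged into a JAR, a function like this would typically be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being called from HiveQL.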

TECHNICAL SKILLS

Big Data/Hadoop Ecosystem: HDFS, MapReduce, Crunch, Hive, Pig, Sqoop, Flume, Oozie, Spark, Kafka, Redis, Storm, HAWQ, and Avro

NoSQL Databases: MongoDB, Cassandra, HBase

Java / J2EE Technologies: JSE, Servlets, JSP, JDBC, XML, AJAX, SOAP, WSDL

Programming Languages: C, C++, Java, Scala, SQL, PL/SQL, Linux shell scripts, Python.

Database: Oracle 11g/10g, DB2, Vertica, MySQL, Teradata

Web Technologies: HTML, XML, JDBC, JSP, JavaScript, AJAX, SOAP, KDB+/Q

Frameworks: MVC, Struts 2/1, Hibernate 3, Spring 3/2.5/2.

Tools Used: Eclipse, IntelliJ, Git, PuTTY, WinSCP

Monitoring and Reporting tools: Ganglia, Nagios, Custom Shell scripts.

ETL Tools: Informatica, SAP Business Objects, Pentaho.

Testing: Hadoop MRUnit testing, Hive testing, Quality Center (QC)

Operating System: Ubuntu (Linux), Win 95/98/2000/XP, Red Hat

PROFESSIONAL EXPERIENCE

Confidential - Auburn Hills, MI

Sr. Hadoop Developer

Responsibilities:

  • Interacted with each area's Subject Matter Experts (SMEs), Business Analysts, and key business users to gather analytics and reporting requirements.
  • Involved in installing and configuring Hadoop components using the CDH 5.2 distribution.
  • Involved in cluster setup, including adding, decommissioning, and balancing nodes without any effect on running jobs.
  • Responsible for analyzing large data sets and deriving customer usage patterns by developing new MapReduce programs.
  • Wrote MapReduce code to parse data from various sources and store the parsed data in HBase and Hive.
  • Worked on creating combiners, partitioners, and the distributed cache to improve the performance of MapReduce jobs (a minimal combiner sketch follows this list).
  • Skilled in developing Python applications for multiple platforms.
  • Familiar with Python software development processes.
  • Designed and maintained databases using Python.
  • Tested and implemented applications built using Python.
  • Developed Shell Script to perform data profiling on the ingested data with the help of HIVE Bucketing.
  • Responsible for debugging and optimizing Hive scripts, and implemented de-duplication logic in Hive using a rank-key function (UDF).
  • Loaded different types of data from Oracle and MySQL into Hive, HBase, and HDFS using Sqoop.
  • Developed Spark scripts by using Scala shell commands as per the requirement.
  • Used Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
  • Developed Scala scripts and UDFs using both DataFrames/Spark SQL and RDDs/MapReduce in Spark for data aggregation, queries, and writing data back into the RDBMS through Sqoop.
  • Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Developed Spark code and Spark SQL/Streaming jobs for faster testing and processing of data, including real-time streaming data using Spark with Kafka.
  • Set up a data ingestion pipeline using Spark on CDH 5.2 to profile, collate, cleanse, and persist data.
  • Implemented various machine learning techniques such as Random Forest, K-Means, and Logistic Regression for prediction and pattern identification using Spark MLlib.
  • Used Hive scripts to compute aggregates and store them in HBase and HDFS for low-latency applications.
  • Extensive experience with AngularJS, creating custom directives, decorators, and services to interface with both RESTful and legacy network services, as well as DOM manipulation.
  • Extensive experience with modern front-end JavaScript frameworks, including Bootstrap, jQuery, and AngularJS.
  • Deployed Puppet for automated management of machine configurations.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Developed workflow in Oozie to automate the tasks of loading data into HDFS and pre-processing with Pig and Hive.
  • Used Pig as ETL tool to do transformations, joins and some pre-aggregations before storing the data into HDFS.
  • Imported customer-specific personal data into Hadoop using Sqoop from a massively parallel processing (MPP) database, Netezza.
  • Used Impala to read, write, and query Hadoop data in HDFS and HBase.
  • Experienced in running queries using Impala and used BI tools to run ad-hoc queries directly on Hadoop.
  • Worked with BI teams to generate reports and design ETL workflows in Tableau.
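The combiner work called out above can be illustrated with a short MapReduce job in Java. This is a minimal sketch rather than the project's actual code: it assumes comma-separated usage records with the customer id in the first field, and the class names are hypothetical.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class UsageCountJob {

    /** Emits (customerId, 1) for every usage record in the input. */
    public static class UsageMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        private static final LongWritable ONE = new LongWritable(1);
        private final Text customerId = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Assumes comma-separated records with the customer id in the first field.
            String[] fields = value.toString().split(",");
            if (fields.length > 0 && !fields[0].isEmpty()) {
                customerId.set(fields[0]);
                context.write(customerId, ONE);
            }
        }
    }

    /** Sums counts per customer; also registered as the combiner to cut shuffle volume. */
    public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text key, Iterable<LongWritable> values, Context context)
                throws IOException, InterruptedException {
            long sum = 0;
            for (LongWritable v : values) {
                sum += v.get();
            }
            context.write(key, new LongWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "usage-count");
        job.setJarByClass(UsageCountJob.class);
        job.setMapperClass(UsageMapper.class);
        job.setCombinerClass(SumReducer.class); // combiner reuses the reducer logic
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```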

Environment: JDK 1.6, Red Hat Linux, CDH 5.2, Hive, Pig, Cassandra, Sqoop, Flume, ZooKeeper, Spark, Scala, Kafka, Spark SQL, Spark Streaming, Spark MLlib, Oozie, Storm, DB2, EC2, Oracle 11g, SQL Server 2008, HBase, Cloudera Manager, Git, KDB+/Q.

Confidential - Raleigh, NC

Sr. Hadoop developer

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Experience with Terraform and CloudFormation.
  • Wrote MapReduce programs.
  • Defined workflows using the Oozie framework for automation.
  • Implemented Flume (multiplexing) to stream data from upstream pipes into HDFS.
  • Responsible for reviewing Hadoop log files.
  • Loading and transforming large sets of unstructured and semi-structured data.
  • Performed data completeness, correctness, data transformation and data quality testing using SQL.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Worked on platforms like SSIS, SSAS, SSRS.
  • Implemented Hive partitioning (static and dynamic) and bucketing.
  • Handled importing of data from various data sources, performed transformations using Hive, Map Reduce and loaded data into HDFS.
  • Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN (a Spark SQL sketch in Java follows this list).
  • Developed Spark code and Spark SQL/Streaming jobs for faster testing and processing of data, including real-time streaming data using Spark with Kafka.
  • Set up a data ingestion pipeline using Spark on CDH 5.2 to profile, collate, cleanse, and persist data.
  • Assisted in creation of ETL processes for transformation of data sources from existing RDBMS systems.
  • Upgraded Python 2.3 to Python 2.5 on a RHEL 4 server, which required recompiling mod_python against Python 2.5.
  • The upgrade was necessary because inlined models with UTF-8 characters were causing unexpected errors.
  • Developed profile/log interceptors for the struts action classes using Struts Action Invocation Framework (SAIF).
  • Wrote Apache Pig scripts to process HDFS data.
  • Wrote Hive queries for data analysis to meet the business requirements.
  • Involved in installing Hadoop Ecosystem components.
  • Involved in HDFS maintenance and loading of structured and unstructured data.
  • Installed and configured Hadoop, Map Reduce, HDFS.
  • Used Hive QL to do analysis on the data and identify different correlations.
  • Developed multiple Map Reduce jobs in Java for data cleaning and preprocessing.
  • Installed and configured Pig and wrote Pig Latin scripts.
  • Wrote MapReduce jobs using Scala.
  • Strong understanding of the REST architectural style and its application to well-performing web sites for global usage.
  • Developer on the Big Data team; worked with Hadoop on the AWS cloud and its ecosystem.
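The Spark SQL work noted above can be sketched with Spark's Java API, the language used for the other examples in this document. This is a minimal, hypothetical example rather than the project's code: the HDFS paths, view name, and aggregation query are assumptions.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

/** Minimal Spark SQL sketch (Java API): aggregate ingested events and persist the result. */
public class EventAggregation {

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("event-aggregation") // hypothetical application name
                .getOrCreate();

        // Read previously ingested data; the HDFS path and header layout are assumptions.
        Dataset<Row> events = spark.read()
                .option("header", "true")
                .csv("hdfs:///data/ingested/events");

        events.createOrReplaceTempView("events");

        // Compute a simple per-customer aggregate with Spark SQL.
        Dataset<Row> aggregated = spark.sql(
                "SELECT customer_id, COUNT(*) AS event_count "
                + "FROM events GROUP BY customer_id");

        // Persist the result for downstream reporting.
        aggregated.write().mode(SaveMode.Overwrite)
                .parquet("hdfs:///data/aggregates/events_by_customer");

        spark.stop();
    }
}
```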

Environment: Apache Hadoop, HDFS, Cloudera Manager, CentOS, Java, MapReduce, Eclipse, Hive, Pig, Sqoop, Oozie, SQL, Scala, Terraform, CloudFormation, Hadoop on AWS, SSIS, SSRS, SSAS

Confidential - Pittsburgh

Hadoop Developer

Responsibilities:

  • Worked on a live 60-node Hadoop cluster running CDH.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Used Apache Flume to ingest log data from multiple resources directly into HBase and HDFS.
  • Extracted data from Oracle into HDFS using Sqoop.
  • Experience in Hive-HBase integration by defining external tables in Hive pointing to HBase as the data store for better performance and lower I/O.
  • Created and worked Sqoop jobs with incremental load to populate Hive External Tables.
  • Supported MapReduce programs running on the cluster.
  • Developed simple to complex MapReduce jobs using Hive and Pig.
  • Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms such as Snappy and Lzip (a compression-configuration sketch in Java follows this list).
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Extensively used Pig for data cleansing.
  • Created partitioned tables in Hive.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Load and transform large sets of structured, semi-structured, and unstructured data.
  • Developed Oozie workflow for scheduling and orchestrating the ETL Process.
  • Good experience in monitoring and managing the Hadoop cluster using Cloudera Manager.
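To make the compression bullet above concrete, here is a minimal sketch of enabling Snappy compression for map output and final job output in a MapReduce driver. The job name is hypothetical, and the codec choice would depend on what the cluster has installed.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.SnappyCodec;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CompressionConfigExample {

    public static Job buildJob() throws Exception {
        Configuration conf = new Configuration();

        // Compress intermediate map output to reduce shuffle I/O.
        conf.setBoolean("mapreduce.map.output.compress", true);
        conf.setClass("mapreduce.map.output.compress.codec",
                SnappyCodec.class, CompressionCodec.class);

        Job job = Job.getInstance(conf, "compressed-job"); // hypothetical job name

        // Compress the final job output written to HDFS as well.
        FileOutputFormat.setCompressOutput(job, true);
        FileOutputFormat.setOutputCompressorClass(job, SnappyCodec.class);

        return job;
    }
}
```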

Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Sqoop, AWS, Flume, HBase, ZooKeeper, Oozie, Spark, Scala, SQL, Java (JDK 1.6), Eclipse

Confidential

Java/Hadoop Developer

Responsibilities:

  • Installed the NameNode, Secondary NameNode, YARN (ResourceManager, NodeManager, ApplicationMaster), and DataNodes using Cloudera.
  • Installed and configured Hortonworks Ambari for easy management of the existing Hadoop cluster; installed and configured HDP.
  • Installed and configured a multi-node, fully distributed Hadoop cluster with a large number of nodes.
  • Provided Hadoop, OS, Hardware optimizations.
  • Regular Commissioning and Decommissioning of nodes depending upon the amount of data.
  • Installed and configured Hadoop components HDFS, Hive, and HBase.
  • Worked with data delivery teams to set up new Hadoop users, including setting up Linux users, setting up Kerberos principals, and testing HDFS and Hive access.
  • Cluster maintenance as well as creation and removal of nodes.
  • Monitor Hadoop cluster connectivity and security.
  • Manage and review Hadoop log files.
  • Configured the cluster to achieve the optimal results by fine tuning the cluster.
  • Dumped data from one cluster to another using DistCp, and automated the dumping procedure using shell scripts.
  • Designed the shell script for backing up of important metadata and rotating the logs on a monthly basis.
  • Implemented the open-source monitoring tool Ganglia for monitoring the various services across the cluster.
  • Implemented commissioning and decommissioning of DataNodes, killing unresponsive TaskTrackers, and dealing with blacklisted TaskTrackers.
  • Moved data from HDFS to a MySQL database and vice versa using Sqoop.
  • Provided the necessary support to the ETL team when required.
  • Integrated Nagios in the Hadoop cluster for alerts.

Environment: Java, J2EE, JDBC, Linux, HDFS, MapReduce, KDC, Nagios, Ganglia, Oozie, Sqoop, Cloudera Manager.

Confidential

Java Developer

Responsibilities:

  • Technical responsibilities included high level architecture and rapid development.
  • Design architecture following J2EE MVC framework.
  • Developed interfaces using HTML, JSP pages, and the Struts presentation view.
  • Involved in designing & developing web-services using SOAP and WSDL.
  • Developed and implemented Servlets running under JBoss.
  • Used J2EE design Patterns for the Middle Tier development.
  • Used J2EE design patterns and Data Access Object (DAO) for the business tier and integration Tier layer of the project.
  • Created UML class diagrams that depict the code's design and its compliance with the functional requirements.
  • Developed various EJBs for handling business logic and data manipulations from database.
  • Designed and developed the UI using Struts view component, JSP, HTML, CSS and JavaScript.
  • Implemented CMP entity beans for persistence of business logic implementation.
  • Developed database interaction code using the JDBC API, making extensive use of SQL query statements and prepared statements (a minimal DAO sketch follows this list).
  • Involved in writing Spring configuration XML files that contain bean declarations and other dependent object declarations.
  • Inspection/Review of quality deliverables such as Design Documents.
  • Involved in creating and running test cases for JUnit testing.
  • Experience in implementing Web Services using SOAP, REST and XML/HTTP technologies.
  • Used Log4j to print logging, debugging, warning, and info messages on the server console.
  • Wrote SQL Scripts, Stored procedures and SQL Loader to load reference data.
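The JDBC/DAO work above can be illustrated with a minimal sketch of a data access object built around a prepared statement. The table, column, and class names are hypothetical; the project's actual schema and DAO layering are not shown here.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

/** Minimal DAO sketch: looks up a customer name by id with a prepared statement. */
public class CustomerDao {

    private final DataSource dataSource;

    public CustomerDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public String findNameById(long customerId) throws SQLException {
        String sql = "SELECT name FROM customer WHERE id = ?"; // hypothetical table and column
        try (Connection connection = dataSource.getConnection();
             PreparedStatement statement = connection.prepareStatement(sql)) {
            statement.setLong(1, customerId);
            try (ResultSet resultSet = statement.executeQuery()) {
                return resultSet.next() ? resultSet.getString("name") : null;
            }
        }
    }
}
```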

Environment: Java, J2EE, Spring, JSP, Hibernate, Java Script, CSS, JDBC, IntelliJ, LDAP, REST, Active Directory, SAML, Web Services, Microsoft SQL Server, HTML.
