
Java Developer Resume


Raleigh, NC

SUMMARY

  • 9+ years of experience across Hadoop, RDBMS, and ETL, including extensive experience in HDFS, Pig, Hive, MapReduce, Java, Sqoop, Python, Redis, KDB+/Q, Teradata, Sybase, and Ab Initio.
  • 4+ years of experience in Big Data analytics, with hands-on experience writing MapReduce jobs on the Hadoop ecosystem, including Hive and Pig.
  • In-depth understanding of Hadoop architecture and its components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce.
  • Hands-on experience writing Pig Latin scripts, working with the Grunt shell, and scheduling jobs with Oozie.
  • Experience in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
  • Good experience with the SDLC (Software Development Life Cycle).
  • Experience in installing and maintaining Cassandra by configuring the cassandra.yaml file as per requirements, and performed reads and writes using Java JDBC connectivity.
  • Hands-on experience with SequenceFiles, RCFiles, combiners, counters, dynamic partitioning, and bucketing for best practices and performance improvement.
  • Extended Hive and Pig core functionality with custom User Defined Functions (UDFs), User Defined Table-Generating Functions (UDTFs), and User Defined Aggregating Functions (UDAFs); a minimal UDF sketch follows this list.
  • Experience in developing Kafka consumers and producers by extending the low-level and high-level consumer and producer APIs.
  • Involved in upgrading existing MongoDB instances from version 2.4 to version 2.6 by upgrading the security roles and implementing newer features.
  • Expertise in writing Spark Streaming applications using Scala higher order functions.
  • Expertise in writing iterative algorithms using Spark MLlib for machine learning applications in the Spark shell (Scala REPL).
  • Experience in configuring various Storm topologies to ingest and process data on the fly from multiple sources and aggregate it into a central Hadoop repository.
  • Expertise in commissioning and decommissioning nodes in clusters, backup configuration, and recovery from a NameNode failure.
  • Assisted in Cluster maintenance, Cluster Monitoring and Troubleshooting, Managing and Reviewing data backups and log files.
  • Worked on disaster recovery for Hadoop clusters.
  • Experience in database design, entity relationships, database analysis, SQL programming, PL/SQL stored procedures, packages, and triggers in Oracle and SQL Server on Windows and Linux.
  • Good knowledge in tuning the performance of SQL queries and ETL process.
  • Experienced in working with tools such as TOAD, SQL Server Management Studio, and SQL*Plus for development and customization.
  • Excellent working knowledge of HBase and data pre-processing using Flume.
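As an illustration of the custom UDF work noted above, the following is a minimal sketch of a Hive UDF in Java that normalizes a key column so rows can be matched or de-duplicated on it. The package and class names are hypothetical, and the exact normalization would depend on the data.

```java
package com.example.hive.udf; // hypothetical package

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

/**
 * Minimal Hive UDF sketch: trims and lower-cases a column value so that
 * rows can be matched or de-duplicated on a normalized key.
 */
public final class NormalizeKey extends UDF {

    public Text evaluate(final Text input) {
        if (input == null) {
            return null; // pass NULLs through unchanged
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}
```

Packaged into a JAR, a function like this would typically be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being called from HiveQL.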

TECHNICAL SKILLS

Big Data/Hadoop Ecosystem: HDFS, MapReduce, Crunch, Hive, Pig, Sqoop, Flume, Oozie, Spark, Kafka, Redis, Storm, HAWQ, and Avro

NoSQL Databases: MongoDB, Cassandra, HBase

Java / J2EE Technologies: JSE, Servlets, JSP, JDBC, XML, AJAX, SOAP, WSDL

Programming Languages: C, C++, Java, Scala, SQL, PL/SQL, Linux shell scripts, Python.

Database: Oracle 11g/10g, DB2, Vertica, MySQL, Teradata

Web Technologies: HTML, XML, JDBC, JSP, JavaScript, AJAX, SOAP, KDB+/Q

Frameworks: MVC, Struts 2/1, Hibernate 3, Spring 3/2.5/2.

Tools Used: Eclipse, IntelliJ, Git, PuTTY, WinSCP

Monitoring and Reporting tools: Ganglia, Nagios, Custom Shell scripts.

ETL Tools: Informatica, SAP Business Objects, Pentaho.

Testing: Hadoop MRUnit testing, Hive testing, Quality Center (QC)

Operating System: Ubuntu (Linux), Win 95/98/2000/XP, Red Hat

PROFESSIONAL EXPERIENCE

Confidential - Auburn Hills, MI

Sr. Hadoop Developer

Responsibilities:

  • Interacted with each area's Subject Matter Experts (SMEs), Business Analysts, and key business users to gather analytics and reporting requirements.
  • Involved in installing and configuring Hadoop components using the CDH 5.2 distribution.
  • Involved in cluster setup, including adding, decommissioning, and balancing nodes without any effect on running jobs.
  • Responsible for analyzing large data sets and deriving customer usage patterns by developing new MapReduce programs.
  • Wrote MapReduce code to parse data from various sources and store the parsed data in HBase and Hive.
  • Worked on creating combiners, partitioners, and the distributed cache to improve the performance of MapReduce jobs (a minimal combiner sketch follows this list).
  • Skilled in developing Python applications for multiple platforms.
  • Familiar with Python software development processes.
  • Designed and maintained databases using Python.
  • Tested and implemented applications built using Python.
  • Developed Shell Script to perform data profiling on the ingested data with the help of HIVE Bucketing.
  • Responsible for debugging and optimizing Hive scripts, and implemented de-duplication logic in Hive using a rank-key function (UDF).
  • Loaded different types of data from Oracle and MySQL into Hive, HBase, and HDFS using Sqoop.
  • Developed Spark scripts by using Scala shell commands as per the requirement.
  • Used Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
  • Developed Scala scripts and UDFs using both DataFrames/Spark SQL and RDDs/MapReduce in Spark for data aggregation, queries, and writing data back into the RDBMS through Sqoop.
  • Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Developed Spark code and Spark SQL/Streaming jobs for faster testing and processing of data, including real-time streaming data using Spark with Kafka.
  • Set up a data ingestion pipeline using Spark on CDH 5.2 to profile, collate, cleanse, and persist data.
  • Implemented various machine learning techniques such as Random Forest, K-Means, and Logistic Regression for prediction and pattern identification using Spark MLlib.
  • Used Hive scripts to compute aggregates and store them in HBase and HDFS for low-latency applications.
  • Extensive experience with AngularJS, creating custom directives, decorators, and services to interface with both RESTful and legacy network services, as well as DOM manipulation.
  • Extensive experience with modern front-end JavaScript frameworks, including Bootstrap, jQuery, and AngularJS.
  • Deployed Puppet for automated management of machine configurations.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Developed workflow in Oozie to automate the tasks of loading data into HDFS and pre-processing with Pig and Hive.
  • Used Pig as ETL tool to do transformations, joins and some pre-aggregations before storing the data into HDFS.
  • Imported customer-specific personal data into Hadoop using Sqoop from a massively parallel processing (MPP) database, Netezza.
  • Used Impala to read, write, and query Hadoop data in HDFS and HBase.
  • Experienced in running queries using Impala and used BI tools to run ad-hoc queries directly on Hadoop.
  • Worked with BI teams to generate reports and design ETL workflows in Tableau.
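The combiner work called out above can be illustrated with a short MapReduce job in Java. This is a minimal sketch rather than the project's actual code: it assumes comma-separated usage records with the customer id in the first field, and the class names are hypothetical.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class UsageCountJob {

    /** Emits (customerId, 1) for every usage record in the input. */
    public static class UsageMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        private static final LongWritable ONE = new LongWritable(1);
        private final Text customerId = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Assumes comma-separated records with the customer id in the first field.
            String[] fields = value.toString().split(",");
            if (fields.length > 0 && !fields[0].isEmpty()) {
                customerId.set(fields[0]);
                context.write(customerId, ONE);
            }
        }
    }

    /** Sums counts per customer; also registered as the combiner to cut shuffle volume. */
    public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text key, Iterable<LongWritable> values, Context context)
                throws IOException, InterruptedException {
            long sum = 0;
            for (LongWritable v : values) {
                sum += v.get();
            }
            context.write(key, new LongWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "usage-count");
        job.setJarByClass(UsageCountJob.class);
        job.setMapperClass(UsageMapper.class);
        job.setCombinerClass(SumReducer.class); // combiner reuses the reducer logic
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```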

Environment: JDK 1.6, Red Hat Linux, CDH 5.2, Hive, Pig, Cassandra, Sqoop, Flume, ZooKeeper, Spark, Scala, Kafka, Spark SQL, Spark Streaming, Spark MLlib, Oozie, Storm, DB2, EC2, Oracle 11g, SQL Server 2008, HBase, Cloudera Manager, Git, KDB+/Q.

Confidential - Raleigh, NC

Sr. Hadoop developer

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Experience with Terraform and CloudFormation.
  • Wrote MapReduce programs.
  • Defined workflows using the Oozie framework for automation.
  • Implemented Flume (multiplexing) to stream data from upstream pipes into HDFS.
  • Responsible for reviewing Hadoop log files.
  • Loading and transforming large sets of unstructured and semi-structured data.
  • Performed data completeness, correctness, data transformation and data quality testing using SQL.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Worked on platforms like SSIS, SSAS, SSRS.
  • Implemented Hive partitioning (static and dynamic) and bucketing.
  • Handled importing of data from various data sources, performed transformations using Hive, Map Reduce and loaded data into HDFS.
  • Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN (a Spark SQL sketch in Java follows this list).
  • Developed Spark code and Spark SQL/Streaming jobs for faster testing and processing of data, including real-time streaming data using Spark with Kafka.
  • Set up a data ingestion pipeline using Spark on CDH 5.2 to profile, collate, cleanse, and persist data.
  • Assisted in creation of ETL processes for transformation of data sources from existing RDBMS systems.
  • Upgraded Python 2.3 to Python 2.5 on a RHEL 4 server, which required recompiling mod_python against Python 2.5.
  • The upgrade was necessary because inlined models with UTF-8 characters were causing unexpected errors.
  • Developed profile/log interceptors for the struts action classes using Struts Action Invocation Framework (SAIF).
  • Wrote Apache Pig scripts to process HDFS data.
  • Wrote Hive queries for data analysis to meet the business requirements.
  • Involved in installing Hadoop Ecosystem components.
  • Involved in HDFS maintenance and loading of structured and unstructured data.
  • Installed and configured Hadoop, Map Reduce, HDFS.
  • Used Hive QL to do analysis on the data and identify different correlations.
  • Developed multiple Map Reduce jobs in Java for data cleaning and preprocessing.
  • Installed and configured Pig and wrote Pig Latin scripts.
  • Wrote MapReduce jobs using Scala.
  • Strong understanding of the REST architectural style and its application to well-performing web sites for global usage.
  • Developer on the Big Data team; worked with Hadoop on the AWS cloud and its ecosystem.
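The Spark SQL work noted above can be sketched with Spark's Java API, the language used for the other examples in this document. This is a minimal, hypothetical example rather than the project's code: the HDFS paths, view name, and aggregation query are assumptions.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

/** Minimal Spark SQL sketch (Java API): aggregate ingested events and persist the result. */
public class EventAggregation {

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("event-aggregation") // hypothetical application name
                .getOrCreate();

        // Read previously ingested data; the HDFS path and header layout are assumptions.
        Dataset<Row> events = spark.read()
                .option("header", "true")
                .csv("hdfs:///data/ingested/events");

        events.createOrReplaceTempView("events");

        // Compute a simple per-customer aggregate with Spark SQL.
        Dataset<Row> aggregated = spark.sql(
                "SELECT customer_id, COUNT(*) AS event_count "
                + "FROM events GROUP BY customer_id");

        // Persist the result for downstream reporting.
        aggregated.write().mode(SaveMode.Overwrite)
                .parquet("hdfs:///data/aggregates/events_by_customer");

        spark.stop();
    }
}
```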

Environment: Apache Hadoop, HDFS, Cloudera Manager, CentOS, Java, MapReduce, Eclipse, Hive, Pig, Sqoop, Oozie, SQL, Scala, Terraform, CloudFormation, Hadoop on AWS, SSIS, SSRS, SSAS

Confidential - Pittsburgh

Hadoop Developer

Responsibilities:

  • Worked on a live 60-node Hadoop cluster running CDH.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Used Apache Flume to ingest log data from multiple resources directly into HBase and HDFS.
  • Extracted data from Oracle into HDFS using Sqoop.
  • Experience in Hive-HBase integration by defining external tables in Hive pointing to HBase as the data store for better performance and lower I/O.
  • Created and worked Sqoop jobs with incremental load to populate Hive External Tables.
  • Supported MapReduce programs running on the cluster.
  • Developed simple to complex MapReduce jobs using Hive and Pig.
  • Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms such as Snappy and Lzip (a compression-configuration sketch in Java follows this list).
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Extensively used Pig for data cleansing.
  • Created partitioned tables in Hive.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Load and transform large sets of structured, semi-structured, and unstructured data.
  • Developed Oozie workflow for scheduling and orchestrating the ETL Process.
  • Good experience in monitoring and managing the Hadoop cluster using Cloudera Manager.
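To make the compression bullet above concrete, here is a minimal sketch of enabling Snappy compression for map output and final job output in a MapReduce driver. The job name is hypothetical, and the codec choice would depend on what the cluster has installed.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.SnappyCodec;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CompressionConfigExample {

    public static Job buildJob() throws Exception {
        Configuration conf = new Configuration();

        // Compress intermediate map output to reduce shuffle I/O.
        conf.setBoolean("mapreduce.map.output.compress", true);
        conf.setClass("mapreduce.map.output.compress.codec",
                SnappyCodec.class, CompressionCodec.class);

        Job job = Job.getInstance(conf, "compressed-job"); // hypothetical job name

        // Compress the final job output written to HDFS as well.
        FileOutputFormat.setCompressOutput(job, true);
        FileOutputFormat.setOutputCompressorClass(job, SnappyCodec.class);

        return job;
    }
}
```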

Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Sqoop, AWS, Flume, HBase, ZooKeeper, Oozie, Spark, Scala, SQL, Java (JDK 1.6), Eclipse

Confidential

Java/Hadoop Developer

Responsibilities:

  • Installed the NameNode, Secondary NameNode, YARN (ResourceManager, NodeManager, ApplicationMaster), and DataNodes using Cloudera.
  • Installed and configured Hortonworks Ambari for easy management of the existing Hadoop cluster; installed and configured HDP.
  • Installed and configured a multi-node, fully distributed Hadoop cluster with a large number of nodes.
  • Provided Hadoop, OS, Hardware optimizations.
  • Regular Commissioning and Decommissioning of nodes depending upon the amount of data.
  • Installed and configured Hadoop components HDFS, Hive, and HBase.
  • Worked with data delivery teams to set up new Hadoop users, including setting up Linux users, setting up Kerberos principals, and testing HDFS and Hive access.
  • Cluster maintenance as well as creation and removal of nodes.
  • Monitor Hadoop cluster connectivity and security.
  • Manage and review Hadoop log files.
  • Configured the cluster to achieve the optimal results by fine tuning the cluster.
  • Dumped data from one cluster to another using DistCp, and automated the dumping procedure using shell scripts.
  • Designed the shell script for backing up of important metadata and rotating the logs on a monthly basis.
  • Implemented the open-source monitoring tool Ganglia for monitoring the various services across the cluster.
  • Implemented commissioning and decommissioning of DataNodes, killing unresponsive TaskTrackers, and dealing with blacklisted TaskTrackers.
  • Moved data from HDFS to a MySQL database and vice versa using Sqoop.
  • Provided the necessary support to the ETL team when required.
  • Integrated Nagios in the Hadoop cluster for alerts.

Environment: Java, J2EE, JDBC, Linux, HDFS, MapReduce, KDC, Nagios, Ganglia, Oozie, Sqoop, Cloudera Manager.

Confidential

Java Developer

Responsibilities:

  • Technical responsibilities included high level architecture and rapid development.
  • Design architecture following J2EE MVC framework.
  • Developed interfaces using HTML, JSP pages, and the Struts presentation view.
  • Involved in designing & developing web-services using SOAP and WSDL.
  • Developed and implemented Servlets running under JBoss.
  • Used J2EE design Patterns for the Middle Tier development.
  • Used J2EE design patterns and Data Access Object (DAO) for the business tier and integration Tier layer of the project.
  • Created UML class diagrams that depict the code's design and its compliance with the functional requirements.
  • Developed various EJBs for handling business logic and data manipulations from database.
  • Designed and developed the UI using Struts view component, JSP, HTML, CSS and JavaScript.
  • Implemented CMP entity beans for persistence of business logic implementation.
  • Developed database interaction code using the JDBC API, making extensive use of SQL query statements and prepared statements (a minimal DAO sketch follows this list).
  • Involved in writing Spring configuration XML files that contain bean declarations and other dependent object declarations.
  • Inspection/Review of quality deliverables such as Design Documents.
  • Involved in creating and running test cases for JUnit testing.
  • Experience in implementing Web Services using SOAP, REST and XML/HTTP technologies.
  • Used Log4j to print logging, debugging, warning, and info messages on the server console.
  • Wrote SQL Scripts, Stored procedures and SQL Loader to load reference data.
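The JDBC/DAO work above can be illustrated with a minimal sketch of a data access object built around a prepared statement. The table, column, and class names are hypothetical; the project's actual schema and DAO layering are not shown here.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

/** Minimal DAO sketch: looks up a customer name by id with a prepared statement. */
public class CustomerDao {

    private final DataSource dataSource;

    public CustomerDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public String findNameById(long customerId) throws SQLException {
        String sql = "SELECT name FROM customer WHERE id = ?"; // hypothetical table and column
        try (Connection connection = dataSource.getConnection();
             PreparedStatement statement = connection.prepareStatement(sql)) {
            statement.setLong(1, customerId);
            try (ResultSet resultSet = statement.executeQuery()) {
                return resultSet.next() ? resultSet.getString("name") : null;
            }
        }
    }
}
```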

Environment: Java, J2EE, Spring, JSP, Hibernate, Java Script, CSS, JDBC, IntelliJ, LDAP, REST, Active Directory, SAML, Web Services, Microsoft SQL Server, HTML.
