Sr. Hadoop Developer Resume
Durham, NC
SUMMARY:
- 8+ years of overall experience in the IT industry, including Java, Big Data technologies and web applications in multi-tiered environments using Java, Hadoop, Hive, HBase, Pig, Sqoop, J2EE (Spring, JSP, Servlets), JDBC, HTML and JavaScript (AngularJS).
- 3 years of comprehensive experience in Big Data Analytics.
- Good knowledge of Hadoop architecture and components such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node and MapReduce concepts.
- Well versed in installing, configuring, supporting and managing Big Data infrastructure and Hadoop clusters, including CDH3 and CDH4.
- Experience with NoSQL databases including HBase and Cassandra.
- Designed and implemented a Cassandra-based database and an associated RESTful web service that persists high-volume user profile data for vertical teams.
- Experience in building large-scale, highly available web applications; working knowledge of web services and other integration patterns.
- Experience in managing and reviewing Hadoop log files.
- Experience in using Pig, Hive, Sqoop and Cloudera Manager.
- Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.
- Hands-on experience in RDBMS and Linux shell scripting.
- Extending Hive and Pig core functionality by writing custom UDFs (a minimal UDF sketch follows this summary).
- Experience in analyzing data using HiveQL, Pig Latin and Map Reduce.
- Developed MapReduce jobs to automate transfer of data from HBase.
- Knowledge of job workflow scheduling and monitoring tools such as Oozie and ZooKeeper.
- Knowledge of data warehousing and ETL tools like Informatica and Pentaho.
- Experience with Eclipse/RSA.
- Testing code with JUnit and SoapUI.
- Knowledge of Hadoop administrative tasks such as installing Hadoop and its ecosystem components (Flume, Oozie, Hive, Pig) and commissioning and decommissioning nodes.
- Extensive experience using MVC architecture, Struts and Hibernate for developing web applications with Java, JSPs, JavaScript, HTML, jQuery, AJAX, XML and JSON.
- Excellent Java development skills using J2EE, Spring, J2SE, Servlets, JUnit, MRUnit, JSP, JDBC.
- Techno-functional responsibilities include interfacing with users, identifying functional and technical gaps, estimates, designing custom solutions, development, leading developers, producing documentation, and production support.
- Excellent interpersonal and communication skills, creative, research-minded, technically competent and result-oriented with problem solving and leadership skills.
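The custom UDF work noted above follows the standard Hive pattern of extending org.apache.hadoop.hive.ql.exec.UDF and exposing an evaluate() method. The sketch below is a hypothetical illustration only; the class name and masking logic are invented for the example, not taken from any project code.

```java
// Minimal sketch of a Hive UDF (hypothetical example, not a project artifact):
// masks all but the last four characters of a string column.
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public final class MaskUDF extends UDF {
    // Hive resolves this evaluate() signature by reflection.
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        String value = input.toString();
        int visible = Math.min(4, value.length());
        StringBuilder masked = new StringBuilder();
        for (int i = 0; i < value.length() - visible; i++) {
            masked.append('*');
        }
        masked.append(value.substring(value.length() - visible));
        return new Text(masked.toString());
    }
}
```

Packaged into a JAR, such a function would be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in HiveQL.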
TECHNICAL SKILLS:
Programming Languages: C, C++, Java, Shell Scripting, PL/SQL
J2EE Technologies: Spring, Servlets, JSP, JDBC, Hibernate.
BigData Ecosystem: HDFS, HBase, MapReduce, Hive, Pig, Sqoop, Impala, Cassandra, Oozie, Zookeeper, Flume.
DBMS: Oracle 11g, SQL Server, MySQL.
Modeling Tools: UML on Rational Rose 4.0.
Web Technologies: HTML, JavaScript, XML, jQuery, Ajax, CSS.
Web/Application Servers: WebLogic, WebSphere, Apache Tomcat, Apache Cassandra.
IDEs: Eclipse, NetBeans, WinSCP.
Operating Systems: Windows, Unix, Linux (Ubuntu), Solaris, CentOS.
Version and Source Control: CVS, SVN.
Servers: Apache Tomcat.
Frameworks: MVC, Struts, Log4j, JUnit, Maven, Ant, Web Services.
PROFESSIONAL EXPERIENCE:
Confidential, Durham, NC
Sr. Hadoop Developer
Responsibilities:
- Designed, deployed and managed Pivotal HD nodes for data platform operations (racking/stacking).
- Involved in data center capacity planning and deployment.
- Installed and configured Pivotal HD 2.0.1; set up Puppet for centralized configuration management.
- Set up Ganglia and Nagios for Pivotal HD monitoring and alerting; monitored HBase and ZooKeeper on the cluster with these tools.
- Performed Pivotal HD cluster tasks such as adding and removing nodes without affecting running jobs or data.
- Wrote scripts to automate application deployments and configurations, monitored YARN applications (a minimal monitoring sketch follows this entry), and troubleshot and resolved Pivotal HD cluster-related system problems.
- Implemented HAWQ to serve queries faster than the other Hadoop-based query interfaces in use.
- Installed, maintained, upgraded and supported Apache and JBoss application servers on Red-Hat Linux systems.
- Implemented test scripts to support test driven development and continuous integration.
- Responsible for managing data coming from different sources.
- Loaded and transformed large sets of structured, semi-structured and unstructured data.
- Exported the analyzed data to relational databases using Sqoop to generate reports for the BI team.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Participated in the requirement gathering and analysis phase of the project, documenting business requirements through workshops and meetings with business users.
- Applied a thorough understanding of ETL tools and how they can be used in a Big Data environment.
Environment: Pivotal HD, MapReduce, HDFS, Hive, Pig, Hue, Oozie, Core Java, Eclipse, HBase, Flume, Cloudera Manager, Oracle 10g, DB2, IDMS, VSAM, SQL*Plus, Toad, Putty, Windows NT, UNIX Shell Scripting, Pentaho Big Data, YARN, HAWQ, Spring XD
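As an illustration of the YARN application monitoring mentioned in this entry, the sketch below lists running applications through the YarnClient Java API. It is a minimal, assumption-based example (class name and output format are illustrative); day-to-day monitoring in this role relied on Ganglia and Nagios as noted above.

```java
// Hypothetical sketch of YARN application monitoring via the YarnClient API
// (class name and output format are illustrative only).
import java.util.List;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class YarnAppMonitor {
    public static void main(String[] args) throws Exception {
        YarnClient client = YarnClient.createYarnClient();
        client.init(new YarnConfiguration()); // picks up yarn-site.xml from the classpath
        client.start();
        try {
            List<ApplicationReport> apps = client.getApplications();
            for (ApplicationReport app : apps) {
                System.out.printf("%s %s %s %.0f%%%n",
                        app.getApplicationId(),
                        app.getName(),
                        app.getYarnApplicationState(),
                        app.getProgress() * 100);
            }
        } finally {
            client.stop();
        }
    }
}
```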
Confidential, Omaha, NE
Sr. Hadoop Developer
Responsibilities:
- Developed the application using Struts Framework that leverages classical Model View Controller (MVC) architecture.
- Extensively worked on the user interface for a few modules using JSPs, JavaScript and Ajax.
- Created business logic using Servlets and POJOs and deployed them on a WebLogic server.
- Installed and Configured Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
- Responsible for Cluster maintenance, adding and removing cluster nodes, Cluster Monitoring and Troubleshooting, manage and review data backups and log files.
- Worked with systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop Clusters.
- Monitored multiple Hadoop clusters environments using Ganglia.
- Managing and scheduling Jobs on a Hadoop cluster.
- Involved in defining job flows, managing and reviewing log files.
- Monitored workload, job performance and capacity planning using Cloudera Manager.
- Installed Oozie workflow engine to run multiple Map Reduce, Hive and Pig jobs.
- Implemented MapReduce programs to transform raw log data into a structured form and extract user information (a minimal sketch follows this entry).
- Responsible for loading and transforming large sets of structured, semi structured and unstructured data.
- Collected the log data from web servers and integrated into HDFS using Flume.
- Responsible to manage data coming from different sources.
- Extracted data from CouchDB, placed it into HDFS using Sqoop and pre-processed it for analysis.
- Gained experience with NoSQL databases.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
Environment: JDK 1.5, J2EE 1.4, Struts 1.3, JSP, Spring, Servlets 2.5, WebSphere 6.1, HTML, XML, JavaScript, Hadoop, HDFS, Pig, Hive, MapReduce, HBase, Sqoop, Oozie, Ganglia and Flume.
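The MapReduce work on log data described in this entry can be illustrated with a minimal job of the kind sketched below: a mapper that extracts a user identifier from each log line and a reducer that sums requests per user. The log layout (whitespace-separated fields with the user id in the third column) and all names are assumptions for the example only.

```java
// Minimal MapReduce sketch: counts requests per user from web server log lines.
// The field positions are assumed for illustration, not taken from project data.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class UserRequestCount {

    public static class LogMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text user = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\s+");
            if (fields.length > 2) {          // skip malformed lines
                user.set(fields[2]);          // assumed position of the user id
                context.write(user, ONE);
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "user request count");
        job.setJarByClass(UserRequestCount.class);
        job.setMapperClass(LogMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```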
Confidential, CA
Sr. Hadoop Developer
Responsibilities:
- Analyzed the functional specifications based on project requirements.
- Worked with technology and business user groups for Hadoop migration strategy.
- Installed & configured multi-node Hadoop Cluster and performed troubleshooting and monitoring of Hadoop Cluster.
- Loaded data from various data sources into Hadoop HDFS/Hive Tables.
- Used Datameer for integration with Hadoop and other sources such as RDBMS (Oracle), SAS, Teradata and Flat files.
- Sqooped data from Teradata, DB2, Oracle to Hadoop and vice-versa.
- Wrote Hive and Pig Scripts to analyze customer satisfaction index, sales patterns etc.
- Extended Hive and Pig core functionality by writing custom UDFs using Java.
- Orchestrated Sqoop scripts, Pig scripts and Hive queries using Oozie workflows.
- Worked on performance tuning of Hadoop jobs by applying techniques such as map-side joins, partitioning and bucketing, and by using file formats such as SequenceFile, Parquet, RCFile and MapFile.
- Worked on Data Lake architecture to build a reliable, scalable, analytics platform to meet batch, interactive and on-line analytics requirements.
- Integrated Tableau with Hadoop data source for building dashboard to provide various insights on sales of the organization.
- Worked on Spark in building BI reports using Tableau. Tableau was integrated with Spark using Shark & Spark SQL.
- Worked on loading data from the Hadoop cluster into different stores such as HBase and Cassandra (a minimal HBase write sketch follows this entry).
- Developed MapReduce programs in Java to perform various transformation, cleaning and scrubbing tasks.
- Participated in daily scrum meetings and iterative development.
Environment: JDK 1.5, J2EE 1.4, Struts 1.3, JSP, Spring, Servlets 2.5, WebSphere 6.1, HTML, XML, JavaScript, Hadoop, HDFS, Pig, Hive, MapReduce, HBase, Sqoop, Oozie, Ganglia and Flume.
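To illustrate loading data into HBase as mentioned in this entry, the sketch below writes a record through the HBase Java client (assuming the HBase 1.x Connection/Table API); the table name, column family and row key are hypothetical.

```java
// Hypothetical sketch of writing a record into HBase via the Java client.
// Table, column family and row key are illustrative names only.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseLoader {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("sales"))) {
            Put put = new Put(Bytes.toBytes("order-0001"));            // row key
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("region"), // family:qualifier
                          Bytes.toBytes("WEST"));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("amount"),
                          Bytes.toBytes("125.40"));
            table.put(put);
        }
    }
}
```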
Confidential, Tampa, FL
Hadoop Developer
Responsibilities:
- Installed and configured a multi-node, fully distributed Hadoop cluster.
- Involved in installing Hadoop Ecosystem components.
- Responsible to manage data coming from different sources.
- Involved in Hadoop cluster environment administration, including adding and removing cluster nodes. Analyzed and clustered data using Mahout.
- Development of Interfaces and Conversions to load the data from legacy system to Oracle base tables using PL/SQL Procedures and developed various packages and functions
- Supported MapReduce programs running on the cluster.
- Involved in HDFS maintenance and administered it through the Hadoop Java API.
- Hands-on experience in writing custom UDFs as well as custom input and output formats (a minimal Pig UDF sketch follows this entry).
- Configured Fair Scheduler to provide service-level agreements for multiple users of a cluster.
- Maintaining and monitoring clusters. Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
- Managed Hadoop cluster node connectivity and security.
- Resolved configuration issues with Apache add-on tools.
- Used Pig as an ETL tool for transformations, event joins, traffic filtering and some pre-aggregations before storing the data in HDFS.
- Involved in writing Flume and Hive scripts to extract, transform and load the data into the database.
- Performed cluster capacity planning, performance tuning, monitoring and troubleshooting.
Environment: Cloudera Hadoop, Linux, HDFS, Hive, Sqoop, Flume, Zookeeper, HBase, SQL
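As a minimal illustration of the custom UDF work in this entry, the sketch below shows a Pig UDF that extends EvalFunc; the class name and logic are illustrative only. In Pig Latin it would be registered with REGISTER and then invoked like any built-in function.

```java
// Minimal sketch of a custom Pig UDF (hypothetical example): normalizes a
// string field to upper case, returning null for empty or missing input.
import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class ToUpper extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        return input.get(0).toString().toUpperCase();
    }
}
```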
Confidential, Seattle, WA
Java/J2EE Developer
Responsibilities:
- Involved in various phases of the Software Development Life Cycle (SDLC) such as design, development and unit testing.
- Developed and deployed UI layer logic for sites using JSP, XML, JavaScript, HTML/DHTML, and Ajax.
- Used CSS and JavaScript to build rich internet pages.
- Followed Agile Scrum methodology for the development process.
- Prepared design specifications for application development covering front-end and back-end, using design patterns.
- Developed prototype test screens in HTML and JavaScript.
- Involved in developing JSPs for client data presentation and for client-side data validation within the forms.
- Developed the application using the Spring MVC framework.
- Used the Java Collections framework to transfer objects between the different layers of the application.
- Developed data mapping to create a communication bridge between various application interfaces using XML and XSL.
- Used Spring IoC to inject values for dynamic parameters.
- Developed JUnit tests for unit-level testing.
- Actively involved in code review and bug fixing to improve performance.
- Documented the application's functionality and its enhanced features.
- Created connections through JDBC and used JDBC statements to call stored procedures (a minimal sketch follows this entry).
Environment: Spring MVC, Oracle 11g, J2EE, Java, JDBC, Servlets, JSP, XML, Design Patterns, CSS, HTML, JavaScript 1.2, JUnit, Apache Tomcat, MS SQL Server 2008.
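A minimal sketch of calling a stored procedure through JDBC, as mentioned in this entry; the connection URL, credentials and procedure name are placeholders, not actual project values.

```java
// Hypothetical sketch of invoking a stored procedure over JDBC.
// URL, credentials and procedure signature are placeholders for illustration.
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Types;

public class StoredProcCall {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@//dbhost:1521/ORCL", "app_user", "secret")) {
            // call a procedure with one IN parameter and one OUT parameter
            try (CallableStatement stmt = conn.prepareCall("{call get_order_total(?, ?)}")) {
                stmt.setLong(1, 1001L);                      // IN: order id
                stmt.registerOutParameter(2, Types.NUMERIC); // OUT: total amount
                stmt.execute();
                System.out.println("Total: " + stmt.getBigDecimal(2));
            }
        }
    }
}
```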
Confidential
System Analyst
Responsibilities:
- Involved in software development life cycle (SDLC) of the project (analysis, design, development, testing and implementation).
- Used MS-Visio for analysis & design flow and monitored the initial applications prototype development for the project.
- Led a team of 4 developers and served as the point of contact for onshore/offshore communication.
- Designed and developed web pages using JSP, JDBC, Servlets, ASP and ASP.NET.
- Developed modules using core java, C#, VB.Net and VB6.0.
- Used HTML, CSS, XML and JavaScript to design a page.
- Successfully migrated legacy application written in VB6.0 to VB.Net.
- Developed web services to receive data from external systems as .txt files and load it into the database.
- Developed DTS/SSIS packages to load employee details into row Mony tables of the SQL server for further processing.
- Wrote Stored Procedures, functions and complex SQL queries for database operations.
- Used JavaScript to perform client-side validations.
- Worked on performance tuning of queries.
- Developed reports using Crystal Reports reporting tool.
- Used DataGrid and GridView controls to display data in a customized format in ASP.NET web pages.
- Used LDAP and Active Directory Service Interfaces (ADSI) to authenticate and authorize users.
- Involved in unit testing and production support of the application.
- Defects were managed through Remedy.
Environment: Java, J2EE, JSP, Servlets, .Net Framework 2.0, ASP.NET, C#, VB.NET, ADO.NET, Oracle9i, SQL Server 2005, T-SQL/PL-SQL, HTML, XML, Web Services, JavaScript, Windows 2000, IIS, Tomcat, Visual Source Safe (VSS), Remedy and Crystal Reports.