Sr. Hadoop Developer Resume
Durham, NC
PROFESSIONAL SUMMARY:
- 8+ years of IT experience as a Developer, Designer & Quality Reviewer, with cross-platform integration experience using Hadoop, Java, J2EE, and SOA.
- Strong experience with Hadoop components: Hive, Pig, HBase, Zookeeper, Sqoop and Flume.
- Experience in Hadoop Distributed File System and Ecosystem (MapReduce, Pig, Hive, Sqoop and HBase)
- Hands-on experience in installing, configuring, and using Apache Hadoop ecosystem components such as MapReduce, Hive, Pig, Sqoop, Spark, Flume, and Oozie.
- Experience in installing, configuring, and using Hadoop components like Hadoop MapReduce (MR1), YARN (MR2), HDFS, Hive, Pig, Avro, Deflate, Flume, and Sqoop.
- Experience working on NoSQL databases including HBase and data access using HIVE.
- Extensive experience in MVC (Model View Controller) architecture, design, development of multi-tier enterprise applications for J2EE platform/SOA using Java, JDBC, Servlets, EJB, Struts, Tag Libraries, Hibernate, and XML.
- Experience with a variety of data formats and protocols such as JSON and Avro.
- Hands-on experience in dealing with compression codecs like Snappy and Gzip.
- Good working experience using Sqoop to import data into HDFS from RDBMS.
- Hands-on experience with Apache and Cloudera Hadoop environments.
- Experienced in importing and exporting data to and from HDFS.
- Experienced in handling Hadoop Ecosystem Projects such as Hive, Pig and Sqoop.
- Experienced in developing UDFs for Hive using Java.
- Experienced in using Flume to transfer log data files to Hadoop Distributed File System (HDFS)
- Experience with multiple relational databases such as Oracle 10g and with the NoSQL database HBase.
- Strong understanding of databases like HBase, MongoDB & Cassandra.
- Hands on experience with Hadoop, HDFS, MapReduce and Hadoop Ecosystem (Pig, Hive, Oozie, Flume and HBase).
- Extensive experience in the design, development, and support of Model View Controller applications using the Struts and Spring frameworks.
- Developed reusable solutions to maintain consistent coding standards across different Java projects.
- Worked closely with Hadoop administrators and performed various Hadoop administration roles and responsibilities.
- Involved in design, capacity planning, cluster setup, performance tuning, monitoring, and structure planning.
- Experience in installing, administering, and supporting Windows and Linux operating systems in an enterprise environment.
- Ability to work effectively in cross-functional team environments and experience of providing training to business users.
- Effective leadership qualities with strong skills in strategy, business development, client management, and project management.
TECHNICAL SKILLS:
Languages/Tools: Java, C, C++, XML, HTML/XHTML, DHTML.
Hadoop: HDFS, MapReduce, Cloudera, HIVE, PIG, HBase, SQOOP, Oozie, Zookeeper, Spark, and Kafka
J2EE Standards: JDBC, JNDI, JMS, Java Mail & XML Deployment Descriptors
Web/Distributed Technologies: J2EE, Servlets, JSP, Struts, Hibernate, EJB, XML, MVC, Spring.
Operating System: Windows 95/98/NT/2000/XP, MS-DOS, UNIX, multiple flavors of Linux.
Databases / NoSQL: Oracle 10g, MS SQL Server 2000, DB2, MS Access, MySQL, Teradata, Cassandra, Greenplum, and MongoDB
App/Web Servers: IBM WebSphere 5.1.2/5.0/4.0/3.5, BEA WebLogic 5.1/7.0, JDeveloper, Apache Tomcat, JBoss.
Messaging & Web Services Technology: SOAP, WSDL, UDDI, XML, SOA, JAX-RPC, IBM WebSphere MQ, JMS.
Testing & Case Tools: JUnit, Log4j, Rational ClearCase, CVS, ANT, JBuilder.
Version Control Systems: GitHub, SVN, CVS
PROFESSIONAL EXPERIENCE:
Confidential, Durham NC
Sr. Hadoop Developer
Responsibilities:
- Worked on analyzing data and writing Hadoop MapReduce jobs using the Java API, Pig, and Hive.
- Implemented MapReduce programs to handle semi-structured and unstructured data such as XML, JSON, and Avro data files, and sequence files for log data.
- Customized Avro tools used in MapReduce, Pig, and Hive for deserialization and to work with the Avro ingestion framework.
- Analyzed large and critical datasets using Cloudera, HDFS, HBase, MapReduce, Hive, Hive UDFs, Pig, Sqoop, Zookeeper, and Spark.
- Wrote Sqoop jobs to import data into and export data from Hadoop.
- Loaded data from the edge node to HDFS using shell scripting.
- Created HBase tables to store variable data formats of PII data coming from different portfolios.
- Customized Flume interceptors to encrypt and mask sensitive customer data per requirements (see the sketch after this list).
- Worked with NoSQL database HBase to create tables and store data.
- Developed custom aggregate functions using Spark SQL and performed interactive querying.
- Used Pig to store data in HBase.
- Hands-on experience developing capabilities in Python using the Spark framework.
- Created Hive tables, dynamic partitions, and buckets for sampling, and worked on them using HiveQL.
- Stored the data in tabular formats using Hive tables and Hive SerDes.
- Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
- Worked with NoSQL databases like HBase in creating HBase tables to load large sets of semi structured data coming from various sources.
- Used Kafka as a streaming tool to load data into HDFS and to move the same data into NoSQL databases.
- Implemented a script to transmit sysprint information from Oracle to HBase using Sqoop.
- Implemented test scripts to support test driven development and continuous integration.
- Wrote shell scripts to export log files to the Hadoop cluster through an automated process.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Hands-on experience running Webpack tasks to build projects using Node.js.
- Hands-on Hadoop administration experience, working on configuration management for the nodes and on testing and benchmarking new nodes.
- Involved in cluster management via Ambari/Cloudera Manager and worked on cluster performance drills.
- Experience handling backups of cluster metadata and other ecosystem metadata.
- Involved in standard system administration work such as creating new users in Hadoop, handling permissions, and performing upgrades.
- Solved day-to-day cluster issues, such as identifying which jobs were taking longer than expected and diagnosing the cause when users reported stuck jobs.
- Utilized Agile Scrum Methodology to help manage and organize a team of 4 developers with regular code review sessions.
- Worked in agile environment and participated in daily scrum meetings.
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, manage and review data backups, manage and review Hadoop log files.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Installed Kafka on the Hadoop cluster and wrote producer and consumer code to stream data for popular hashtags from a Twitter source into HDFS.
- Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop.
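The Flume interceptor customization above followed the standard Flume Interceptor/Builder pattern. The sketch below is a minimal, hypothetical illustration of that pattern (the class name, SSN regex, and mask format are assumptions, not the production code):

    import java.nio.charset.StandardCharsets;
    import java.util.List;

    import org.apache.flume.Context;
    import org.apache.flume.Event;
    import org.apache.flume.interceptor.Interceptor;

    // Hypothetical Flume interceptor that masks SSN-like values in event bodies
    // before they reach the HDFS sink.
    public class MaskingInterceptor implements Interceptor {

        private static final String SSN_PATTERN = "\\b\\d{3}-\\d{2}-\\d{4}\\b";

        @Override
        public void initialize() {
            // nothing to set up in this simple example
        }

        @Override
        public Event intercept(Event event) {
            String body = new String(event.getBody(), StandardCharsets.UTF_8);
            // Replace anything that looks like an SSN with a masked value.
            String masked = body.replaceAll(SSN_PATTERN, "***-**-****");
            event.setBody(masked.getBytes(StandardCharsets.UTF_8));
            return event;
        }

        @Override
        public List<Event> intercept(List<Event> events) {
            for (Event event : events) {
                intercept(event);
            }
            return events;
        }

        @Override
        public void close() {
            // no resources to release
        }

        // Flume instantiates interceptors through a nested Builder referenced
        // from the agent configuration (fully qualified class name + $Builder).
        public static class Builder implements Interceptor.Builder {
            @Override
            public Interceptor build() {
                return new MaskingInterceptor();
            }

            @Override
            public void configure(Context context) {
                // patterns and mask formats could be read from the agent config here
            }
        }
    }

The interceptor is then enabled in the Flume agent configuration by listing it on the source and pointing its type at the Builder class.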
Environment: Hadoop, HDFS, Pig, Hive, Sqoop, Flume, Kafka, Spark, MapReduce, Cloudera, Avro, Snappy, Zookeeper, NoSQL, HBase, Shell Scripting, Ubuntu, Linux Red Hat.
Confidential, New York NY
Sr. Hadoop Developer
Responsibilities:
- Defined, designed, and developed Java applications, especially Hadoop MapReduce applications, leveraging frameworks such as Cascading and Hive.
- Developed workflows using Oozie for running MapReduce jobs and Hive queries.
- Worked on loading log data directly into HDFS using Flume.
- Worked on Cloudera to analyze data present on top of HDFS.
- Responsible for managing data from multiple sources.
- Loaded data from various data sources into HDFS using Flume.
- Worked with a storage plugin that allows Hadoop MapReduce programs, HBase, Pig, and Hive to work unmodified and access files directly.
- Designed and implemented a MapReduce-based, large-scale parallel relation-learning system.
- Successfully loaded files into Hive and HDFS from MongoDB.
- Familiarity with NoSQL databases such as MongoDB.
- Extracted data from MySQL using Sqoop, placed it in HDFS, and processed it.
- Developed Pig Latin scripts to extract data from the web server output files to load into HDFS.
- Built reusable Hive UDF libraries for business requirements, which enabled users to call these UDFs directly from Hive queries (see the sketch after this list).
- Worked on debugging and performance tuning of Hive and Pig jobs.
- Created HBase tables to store various data formats of PII data coming from different portfolios.
- Developed Pig scripts, Pig UDFs, Hive scripts, and Hive UDFs to load data files into Hadoop.
- Implemented test scripts to support test driven development and continuous integration.
- Worked on tuning the performance of Pig queries.
- Created and maintained Technical documentation for launching Hadoop Clusters and for executing Hive queries and Pig Scripts.
- Hands-on experience creating use cases based on business and user requirements to develop system functions.
- Prepared developer (unit) test cases and executed developer testing.
- Developed unit test cases for Hadoop MapReduce jobs with JUnit.
- Involved in loading data from the Linux file system to HDFS.
- Imported and exported data into HDFS and Hive using Sqoop.
- Worked on processing unstructured data using Pig and Hive.
- Supported MapReduce programs running on the cluster.
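The reusable Hive UDF libraries above followed the classic Hive UDF pattern: a Java class extending org.apache.hadoop.hive.ql.exec.UDF with an evaluate method. A minimal, hypothetical sketch with an illustrative function rather than the actual business logic:

    import org.apache.hadoop.hive.ql.exec.Description;
    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical UDF: trims and upper-cases a string column.
    @Description(name = "clean_upper",
                 value = "_FUNC_(str) - trims and upper-cases a string")
    public class CleanUpperUDF extends UDF {

        public Text evaluate(Text input) {
            if (input == null) {
                return null;            // preserve NULL semantics
            }
            return new Text(input.toString().trim().toUpperCase());
        }
    }

After packaging the class into a JAR, it is registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION and can then be called from HiveQL queries like a built-in function.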
Environment: Hadoop, HDFS, Pig, Hive, Sqoop, Flume, HBase, Shell Scripting, Maven, Hudson/Jenkins, Ubuntu, Linux Red Hat, Mongo DB.
Confidential
Hadoop Developer
Responsibilities:
- Involved in design and development phases of Software Development Life Cycle (SDLC) using Scrum methodology.
- Worked on analyzing Hadoop cluster using different big data analytic tools including Pig, Hive, and MapReduce.
- Developed data pipeline using Flume, Sqoop to ingest customer behavioral data and purchase histories into HDFS for analysis.
- Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
- Used Pig to perform data validation on the data ingested using Sqoop and Flume, and pushed the cleansed data set into HBase.
- Participated in development/implementation of Cloudera Hadoop environment.
- Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
- Designed and implemented a MapReduce-based, large-scale parallel relation-learning system.
- Worked with Zookeeper, Oozie, AppWorx and Data Pipeline Operational Services for coordinating the cluster and scheduling workflows.
- Designed and built the Reporting Application, which uses the Spark SQL to fetch and generate reports on HBase table data.
- Extracted the needed data from the server into HDFS and bulk-loaded the cleaned data into HBase (see the sketch after this list).
- Responsible for creating Hive tables, loading the structured data resulting from MapReduce jobs into the tables, and writing Hive queries to further analyze the logs to identify issues and behavioral patterns.
- Scheduled the Oozie workflow engine to run multiple Hive and Pig jobs.
- Developed Hive queries and Pig scripts to analyze large datasets.
- Involved in importing and exporting the data from RDBMS to HDFS and vice versa using Sqoop.
- Involved in generating ad hoc reports using Pig and Hive queries.
- Used Hive to analyze data ingested into HBase by using Hive-HBase integration and compute various metrics for reporting on the dashboard.
- Developed job flows in Oozie to automate the workflow for Pig and Hive jobs.
- Loaded the aggregated data onto Oracle from Hadoop environment using Sqoop for reporting on the dashboard.
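The HBase load described above can be illustrated with the HBase Java client API. The sketch below is a simplified, hypothetical example (the table name, column family, and row-key scheme are assumptions, and genuinely large loads would go through the HFile bulk-load path rather than individual Puts):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    // Hypothetical write of one cleaned record into an HBase table (HBase 1.x client API).
    public class HBaseLoadSketch {

        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Table table = connection.getTable(TableName.valueOf("customer_events"))) {

                // Row key combining an entity id and a date is shown only as an example.
                Put put = new Put(Bytes.toBytes("cust123_20160101"));
                put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("purchase_amt"),
                              Bytes.toBytes("42.50"));
                table.put(put);
            }
        }
    }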
Environment: RedHat Linux, HDFS, Map-Reduce, Hive, Java JDK1.6, Pig, Sqoop, Flume, Zookeeper, Oozie, Oracle, HBase.
Confidential
Java Developer
Responsibilities:
- As part of lifecycle development, prepared class models, sequence models, and flow diagrams by analyzing use cases with Rational tools.
- Reviewed and analyzed the data model for developing the presentation layer and value objects.
- Involved in developing Database access components using Spring DAO integrated with Hibernate for accessing the data.
- Responsible for technical and application architecture for the enterprise business management software.
- The technical architecture included integration with Java objects, XML message structures, Java Message Service, and TIBCO RV for pub/sub services.
- Made extensive use of the Struts framework for controller and view components.
- Wrote exception and validation classes using Struts validation rules.
- Wrote validation rule classes for general server-side validation, implementing validation rules as part of the Observer J2EE design pattern.
- Used Hibernate for the project's persistence layer.
- Used Spring AOP and dependency injection across various modules of the project.
- Implemented Service Oriented Architecture (SOA) using JMS for sending and receiving messages while creating web services.
- Used the Spring framework for dependency injection and integrated it with other frameworks such as Struts and Hibernate.
- Developed various Java objects (POJOs) as part of the persistence classes for O/R mapping.
- Developed web services using SOAP and WSDL with Axis.
- Implemented EJB (Message Driven Beans) in the Service Layer.
- Worked with JMS MQ queues (producers/consumers), sending and receiving asynchronous messages via MDBs (see the sketch after this list).
- Developed, implemented, and maintained an asynchronous, AJAX-based rich client for an improved customer experience using XML data and XSLT templates.
- Wrote parsers for parsing and building XML documents using SAX and DOM.
- Designed and developed architecture plans, timelines, and system technical and data architecture.
- Developed SQL stored procedures and prepared statements for updating and accessing data from database.
- Used JBoss to deploy various application components, used Maven as the build tool, and developed the build file for compiling the code and creating WAR files.
- Used CVS for version control.
- Performed Unit testing and rigorous integration testing of the whole application.
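The asynchronous JMS consumption above was handled by message-driven beans. The EJB 3 sketch below is a hypothetical illustration of that pattern (the queue name and class are placeholders, not the project's actual destinations):

    import javax.ejb.ActivationConfigProperty;
    import javax.ejb.MessageDriven;
    import javax.jms.JMSException;
    import javax.jms.Message;
    import javax.jms.MessageListener;
    import javax.jms.TextMessage;

    // Hypothetical MDB that consumes order messages from a JMS queue.
    @MessageDriven(activationConfig = {
        @ActivationConfigProperty(propertyName = "destinationType",
                                  propertyValue = "javax.jms.Queue"),
        @ActivationConfigProperty(propertyName = "destination",
                                  propertyValue = "queue/orderQueue")
    })
    public class OrderMessageBean implements MessageListener {

        @Override
        public void onMessage(Message message) {
            try {
                if (message instanceof TextMessage) {
                    String payload = ((TextMessage) message).getText();
                    // Hand the payload off to the service layer for processing.
                    System.out.println("Received order message: " + payload);
                }
            } catch (JMSException e) {
                // In production this would be logged and the message redelivered or dead-lettered.
                throw new RuntimeException(e);
            }
        }
    }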
Environment: Java, J2EE, EJB, JMS, Struts, JBoss, Hibernate, JSP, JSTL, AJAX, CVS, JavaScript, HTML, XML, Maven, SQL, Oracle, SOA, SAX and DOM parsers, Web Services (SOAP, WSDL), Spring, Windows.
Confidential
Java Developer
Responsibilities:
- Involved in design and development phases of Software Development Life Cycle (SDLC)
- Involved in designing UML Use case diagrams, Class diagrams, and Sequence diagrams using Rational Rose.
- Followed agile methodology and Scrum meetings to track, optimize, and tailor features to customer needs.
- Developed the user interface using JSP, JSP tag libraries, and JavaScript to simplify the complexities of the application.
- Implemented Model View Controller (MVC) architecture using Jakarta Struts frameworks at presentation tier.
- Developed a Dojo-based front end, including forms and controls, and programmed event handling.
- Implemented an SOA architecture with web services using JAX-RS (REST) and JAX-WS (SOAP); see the sketch after this list.
- Developed various Enterprise Java Bean components to fulfill the business functionality.
- Created Action classes that route submissions to the appropriate EJB components and render the retrieved information.
- Validated all forms using Struts validation framework and implemented Tiles framework in the presentation layer.
- Used core Java and object-oriented concepts.
- Extensively used Hibernate in data access layer to access and update information in the database.
- Used Spring Framework for Dependency injection and integrated it with the Struts Framework and Hibernate.
- Used JDBC to connect to backend databases, Oracle and SQL Server 2005.
- Proficient in writing SQL queries and stored procedures for multiple databases, Oracle and SQL Server 2005.
- Wrote stored procedures using PL/SQL and performed query optimization to achieve faster indexing and make the system more scalable.
- Deployed the application on Windows using IBM WebSphere Application Server.
- Used Java Messaging Services (JMS) for reliable and asynchronous exchange of important information such as payment status report.
- Used Web Services - WSDL and REST for getting credit card information from third party and used SAX and DOM XML parsers for data retrieval.
- Implemented an SOA architecture with web services such as JAX-WS.
- Used ANT scripts to build the application and deployed it on WebSphere Application Server.
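A minimal, hypothetical JAX-WS (SOAP) endpoint illustrating the web-service pattern used above (the service and operation names are placeholders, not the real payment services):

    import javax.jws.WebMethod;
    import javax.jws.WebService;
    import javax.xml.ws.Endpoint;

    // Hypothetical SOAP web service exposed through JAX-WS annotations.
    @WebService
    public class PaymentStatusService {

        @WebMethod
        public String getPaymentStatus(String orderId) {
            // A real implementation would query the payment system; this is a stub.
            return "PAID";
        }

        public static void main(String[] args) {
            // Quick local test harness; in the project the service was deployed on WebSphere.
            Endpoint.publish("http://localhost:8080/paymentStatus", new PaymentStatusService());
        }
    }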
Environment: Core Java, J2EE, Oracle, SQL Server, JSP, Struts, Spring, JDK, Hibernate, JavaScript, HTML, CSS, AJAX, JUnit, Log4j, Web Services, Windows.