Big Data - Hadoop Developer Resume
Raleigh, NC
PROFESSIONAL SUMMARY:
- 7 years of experience in the IT industry, including 2+ years in Big Data technologies and 4+ years in Java, database management systems and data warehouse systems.
- Hands-on experience working with Hadoop ecosystem components including Hive, Pig, HBase, Oozie, Impala, Spark and Flume.
- Excellent understanding of Hadoop architecture and its components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, YARN and the MapReduce programming paradigm.
- Highly capable of processing large sets of structured, semi-structured and unstructured data and supporting Big Data applications.
- Good experience in Hadoop cluster capacity planning and in designing the NameNode, Secondary NameNode, DataNode, JobTracker and TaskTracker.
- Provided input to development teams on efficient utilization of resources such as memory and CPU, based on the running statistics of Map and Reduce tasks.
- Strong experience writing custom UDFs for Hive and Pig, with a solid understanding of Pig and Hive analytical functions (see the Hive UDF sketch following this summary).
- Experience importing and exporting data between relational databases and HDFS using Sqoop.
- Extensively worked on Oozie for workflow management, with separate workflows for each layer such as Staging, Transformation and Archive.
- Experienced in installing, configuring Hadoop cluster of major Hadoop distributions.
- Extensively worked on NoSQL databases such as HBase, Cassandra and MongoDB.
- Worked on MapReduce programs for parallel processing of data and for custom input formats.
- Extensively worked on Pig for ETL Transformations and optimized Hive Queries.
- Worked on Flume to ingest log data from external source systems into HDFS.
- Developed Oozie workflows to automate loading data into HDFS and preprocessing it with Pig, and used ZooKeeper to coordinate the cluster.
- Deployed, configured and managed Linux servers in virtual machines.
- Experienced in developing large-scale enterprise software applications in Java environments.
- Strong UNIX Shell Scripting skills.
- Extensive experience working with databases such as SQL Server and MySQL, and writing stored procedures, functions, joins and triggers for different data models.
- Strong coding experience in Core Java and strong hands-on experience with Java and J2EE frameworks.
- Experience working with Java, J2EE, JDBC, ODBC, JSP, Eclipse, JavaBeans, EJB and Servlets.
- Developed web page interfaces using JSP, Java Swing and HTML.
- Excellent understanding of JavaBeans and the Hibernate framework for implementing model logic that interacts with relational databases.
- Always looking for new challenges that broaden my experience and knowledge and further develop skills already acquired.
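A minimal sketch of the kind of custom Hive UDF mentioned above, assuming a hypothetical text-normalization function; the class name, function name and logic are illustrative, not taken from any specific project.

```java
import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF: trims and lower-cases a string column before analysis.
@Description(name = "normalize_text", value = "_FUNC_(str) - trims and lower-cases str")
public final class NormalizeTextUDF extends UDF {
    public Text evaluate(final Text input) {
        if (input == null) {
            return null;   // preserve NULLs so joins and filters behave as expected
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}
```

Once packaged into a JAR, such a UDF is typically registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION and then called like any built-in function.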
TECHNICAL SKILLS:
Big Data Ecosystems: HDFS, Hive, Pig, MapReduce, Sqoop, HBase, Zookeeper, Flume, Spark, Kafka, Storm and Oozie.
Languages: C, C++, Java, J2EE, Spring, Hibernate, Java Servlets, JDBC, JUnit, Scala, Python and Perl
Web Technologies: HTML, DHTML, XHTML, XML, CSS, Ajax and Java Script
Databases: MySQL, Oracle 10g, Microsoft SQL Server, DB2, Sybase; NoSQL: MongoDB
Operating Systems: Linux, Unix, Windows, Mac OS X
Web Servers: Apache Tomcat 5.x, BEA WebLogic 8.x, IBM WebSphere 6.00/5.11
IDEs: Eclipse, NetBeans
PROFESSIONAL EXPERIENCE:
Confidential, Raleigh NC
Big Data - Hadoop Developer
Responsibilities:
- Extracted data from RDBMS to HDFS using Sqoop.
- Created a Hive aggregator to update the Hive table after running the data profiling job.
- Analyzed large data sets by running Hive queries.
- Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
- Implemented Partitioning, Dynamic Partitioning and Bucketing in Hive.
- Developed Hive queries to process the data and generate data cubes for visualization.
- Built reusable Hive UDF libraries for business requirements, enabling users to call these UDFs in Hive queries.
- Wrote a Hive UDF to sort struct fields and return a complex data type.
- Used ORC and Parquet file formats with Hive and Impala.
- Modeled Hive partitions extensively for data separation and faster data processing and followed Pig and Hive best practices for tuning.
- Exported the patterns analyzed back to Teradata using Sqoop.
- Implemented a script to transmit sys print information from Oracle to HBase using Sqoop.
- Setting up Identity, Authentication, and Authorization.
- Preprocessed files using Spark with Scala.
- Used Spark to perform interactive exploration of large datasets.
- Used Spark Core for log transaction aggregation and analytics (see the sketch after this list).
- Integrated a scheduler with Oozie workflows to pull data from multiple data sources in parallel using fork actions.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports.
- Involved in loading data from local file system (Linux) to HDFS.
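A minimal sketch of the log-transaction aggregation described above, written here against the Spark Core Java API for consistency with the rest of the examples in this resume (the project itself used Scala with Spark); the HDFS paths, delimiter and field position are hypothetical.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public class LogTransactionAggregation {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("LogTransactionAggregation");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Hypothetical input: comma-delimited transaction logs already landed on HDFS.
        JavaRDD<String> logs = sc.textFile("hdfs:///data/logs/transactions/*.log");

        // Aggregate: count transactions per type, assuming the type is the second field.
        JavaPairRDD<String, Long> countsByType = logs
                .filter(line -> !line.trim().isEmpty())
                .mapToPair(line -> new Tuple2<>(line.split(",")[1], 1L))
                .reduceByKey(Long::sum);

        countsByType.saveAsTextFile("hdfs:///data/output/txn_counts");
        sc.stop();
    }
}
```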
Environment: Hadoop, Hive, Pig, Sqoop, Elasticsearch, MapReduce, HDFS, Flume, Cloudera, Java, Scala, Oozie, Kafka, Oracle 11g/10g, PL/SQL, Linux, Unix Shell Scripting
Confidential, Kansas City, MO
Hadoop Developer
Responsibilities:
- Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleaning and preprocessing (see the sketch after this list).
- Importing and exporting data into HDFS and Hive using Sqoop.
- Experience in defining job flows.
- Experience in managing and reviewing Hadoop log files.
- Extracted files from RDBMS through Sqoop, placed them in HDFS and processed them.
- Experience in running Hadoop Streaming jobs to process terabytes of XML-format data.
- Gained good experience with NoSQL databases.
- Supported MapReduce programs running on the cluster.
- Involved in loading data from Linux and Unix file systems to HDFS.
- Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
- Replaced the default Derby metastore for Hive with MySQL.
- Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
- Developed Pig UDFs to preprocess the data for analysis.
- Developed Hive queries for the analysts.
- Loaded and transformed large sets of structured, semi-structured and unstructured data.
- Worked with various Hadoop file formats, including TextFiles, SequenceFile, RCFile.
- Supported in setting up QA environment and updating configurations for implementing scripts with Pig.
- Developed a custom file system plug-in for Hadoop so it can access files on the data platform. This plug-in allows Hadoop MapReduce programs, Cassandra, Pig and Hive to work unmodified and access files directly.
- Designed and implemented a MapReduce-based large-scale parallel relation learning system.
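A minimal sketch of a data-cleaning MapReduce job of the kind described above: a map-only pass that keeps well-formed records and drops the rest. The delimiter and expected field count are illustrative assumptions.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Map-only job that drops malformed records (hypothetically, rows that do not
// have exactly 8 pipe-delimited fields) and passes the clean rows through.
public class CleanRecordsJob {

    public static class CleanMapper
            extends Mapper<LongWritable, Text, Text, NullWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\|", -1);
            if (fields.length == 8) {
                context.write(value, NullWritable.get());
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "clean-records");
        job.setJarByClass(CleanRecordsJob.class);
        job.setMapperClass(CleanMapper.class);
        job.setNumReduceTasks(0);               // map-only cleaning pass
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```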
Environment: Hadoop, Hive, HBase, MapReduce, HDFS, Pig, Cassandra, Java (JDK 1.6), MapR, Oracle 11g/10g, PL/SQL, SQL*Plus, Toad 9.6, Unix Shell Scripting
Confidential, Columbus, OH
Junior Hadoop Developer
Responsibilities:
- Responsible for building a system that ingests Terabytes of data per day onto Hadoop from a variety of data sources providing high storage efficiency and optimized layout for analytics.
- Responsible for converting a wide online video and ad impression tracking system, the source of truth for billing, from a legacy stream-based architecture to a MapReduce architecture, reducing support effort.
- Used Cloudera Crunch to develop data pipelines that ingest data from multiple data sources and process them.
- Used Sqoop to move data from relational databases to HDFS, and Flume to move data from web logs onto HDFS.
- Used Pig to apply transformations, cleaning and deduplication of data from raw data sources.
- Used MRUnit for unit testing MapReduce jobs (see the sketch after this list).
- Experienced in managing and reviewing Hadoop log files.
- Created ad hoc analytical job pipelines using Hive and Hadoop Streaming to compute various metrics and stored them in HBase for downstream applications.
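A minimal MRUnit sketch for unit-testing a mapper, reusing the hypothetical CleanMapper from the MapReduce sketch earlier in this resume; the inputs and expected outputs are illustrative.

```java
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Before;
import org.junit.Test;

// Verifies that a well-formed record passes through the (hypothetical)
// CleanRecordsJob.CleanMapper and that a malformed record is dropped.
public class CleanMapperTest {

    private MapDriver<LongWritable, Text, Text, NullWritable> mapDriver;

    @Before
    public void setUp() {
        mapDriver = MapDriver.newMapDriver(new CleanRecordsJob.CleanMapper());
    }

    @Test
    public void keepsWellFormedRecord() throws Exception {
        Text good = new Text("a|b|c|d|e|f|g|h");
        mapDriver.withInput(new LongWritable(0), good)
                 .withOutput(good, NullWritable.get())
                 .runTest();
    }

    @Test
    public void dropsMalformedRecord() throws Exception {
        mapDriver.withInput(new LongWritable(1), new Text("a|b|c"))
                 .runTest();   // no expected output: the record is filtered out
    }
}
```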
Environment: JDK 1.6, Red Hat Linux, HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Zookeeper, Oozie, Python, Maven, HBase, MRUnit, Oracle.
Confidential, NJ
Java/J2EE Developer
Responsibilities:
- Coded the business methods according to the IBM Rational Rose UML model.
- Extensively used Core Java, Servlets, JSP and XML.
- Used Struts 1.2 in presentation tier.
- Generated the Hibernate XML and Java Mappings for the schemas.
- Used DB2 Database to store the system data.
- Used Rational Application Developer (RAD) as Integrated Development Environment (IDE).
- Unit tested all the components using JUnit.
- Used the Apache Log4j logging framework for trace logging and auditing (see the sketch after this list).
- Used Asynchronous JavaScript and XML (AJAX) for a better and faster interactive front end.
- Used IBM WebSphere as the application server.
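A minimal sketch of the trace and audit logging pattern mentioned above, using the Log4j 1.x API; the class, logger-naming convention and messages are illustrative assumptions.

```java
import org.apache.log4j.Logger;

public class PaymentService {

    // Separate loggers for application tracing and audit events (assumed convention).
    private static final Logger LOG = Logger.getLogger(PaymentService.class);
    private static final Logger AUDIT = Logger.getLogger("AUDIT." + PaymentService.class.getName());

    public void processPayment(String accountId, double amount) {
        LOG.debug("Entering processPayment for account " + accountId);
        try {
            // ... business logic ...
            AUDIT.info("Payment of " + amount + " processed for account " + accountId);
        } catch (RuntimeException e) {
            LOG.error("Payment failed for account " + accountId, e);
            throw e;
        }
    }
}
```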
Environment: Java 1.6, Servlets, JSP, Struts 1.2, IBM Rational Application Developer (RAD) 6, WebSphere 6.0, iText, AJAX, Rational Rose, Log4j, JUnit.
Confidential
Java/SQL Developer
Responsibilities:
- Created UML class diagrams that depict the code design and its compliance with the functional requirements.
- Used J2EE design patterns for the middle-tier development.
- Developed EJBs in WebLogic for handling business processes, database access and asynchronous messaging.
- Used the JavaMail notification mechanism to send confirmation emails to customers about scheduled payments.
- Developed message-driven beans in collaboration with the Java Message Service (JMS) to communicate with merchant systems (see the sketch after this list).
- Also involved in writing JSPs, JavaScript and Servlets to generate dynamic web pages and web content.
- Wrote stored procedures and Triggers using PL/SQL.
- Involved in building and parsing XML documents using JAXP.
- Deployed the application on Tomcat Application Server.
- Experience in implementing Web Services and XML/HTTP technologies.
- Created Unix shell and Perl utilities for testing, data parsing and manipulation.
- Used Log4J for log file generation and maintenance.
- Wrote JUnit test cases for testing.
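A minimal sketch of a message-driven bean of the kind described above, shown with EJB 3 annotations for brevity; the queue name and payload handling are illustrative assumptions.

```java
import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.TextMessage;

// Listens on a hypothetical merchant queue and hands the payload to the business layer.
@MessageDriven(mappedName = "jms/MerchantQueue", activationConfig = {
    @ActivationConfigProperty(propertyName = "destinationType",
                              propertyValue = "javax.jms.Queue")
})
public class MerchantMessageBean implements MessageListener {

    @Override
    public void onMessage(Message message) {
        try {
            if (message instanceof TextMessage) {
                String payload = ((TextMessage) message).getText();
                // ... pass the payload to the merchant-processing business logic ...
            }
        } catch (JMSException e) {
            throw new RuntimeException("Failed to read merchant message", e);
        }
    }
}
```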
Environment: Java (JDK 6), J2EE, JDBC, Servlets, JSP, Struts, EJB, Web Services, SOAP, WSDL, Eclipse, Application Server, Oracle 9i/10g/11g, MySQL, JavaScript, Log4j, XML, XPath, XSD, HTML, TFS, JUnit, CSS.
Confidential
Programmer analyst/Java Developer
Responsibilities:
- Attended User group Meeting to gather system requirements.
- Analyzed and designed document specifications to design the J2EE application.
- Involved in the design and development phases of the application.
- Implemented the business logic using EJB session beans.
- Developed the user interface and end-user screens using Java Swing, JSP and Servlets.
- Implemented web services using SOAP.
- Responsible for periodic generation of reports.
- Performed Unit testing of the application using JUnit.
- Carried out necessary validations of each developed screen by writing triggers, procedures and functions along with the associated objects, events and methods.
- Designed and developed menus to navigate from one screen to another.
- Used the Hibernate framework with JDBC drivers to connect to the database and manipulate the data (see the sketch after this list).
- Used joins, triggers, stored procedures and functions to interact with the backend database using SQL.
- Reviewed changes on a weekly basis and ensured quality deliverables.
- Actively documented common problems encountered during the testing, development and production phases.
- Coordinated with other development teams, system managers and the webmaster, and fostered a good working environment.
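A minimal sketch of the Hibernate data-access pattern referenced above, in the classic Hibernate 3 style; the configuration file and mapped entities are assumed to exist and are not shown.

```java
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;
import org.hibernate.cfg.Configuration;

public class GenericDao {

    // Built once from hibernate.cfg.xml, which is assumed to define the JDBC
    // driver, connection URL and entity mappings.
    private static final SessionFactory SESSION_FACTORY =
            new Configuration().configure().buildSessionFactory();

    // Persists any Hibernate-mapped entity inside a transaction.
    public void save(Object entity) {
        Session session = SESSION_FACTORY.openSession();
        Transaction tx = session.beginTransaction();
        try {
            session.save(entity);
            tx.commit();
        } catch (RuntimeException e) {
            tx.rollback();
            throw e;
        } finally {
            session.close();
        }
    }
}
```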
Environment: Java, J2EE, EJB, JSP, SOAP, JavaScript, Servlets, JDBC, SQL, UNIX, JUnit.