Hadoop Developer Resume

Hoffman Estates, IL

SUMMARY

  • 8 years of proactive IT experience in analysis, design, development, implementation, and testing of software applications, including 3+ years of experience in Big Data using Hadoop, Hive, Spark, Pig, Sqoop, and MapReduce programming.
  • Extensively worked with the MapReduce programming model and the Hadoop Distributed File System (HDFS).
  • Work experience with major components of the Hadoop ecosystem such as Flume, HBase, ZooKeeper, Oozie, Hive, Sqoop, Pig, and YARN.
  • Exceptional understanding of Hadoop architecture and the different components of a Hadoop cluster.
  • Leveraged strong skills in developing applications involving Big Data technologies like Hadoop, Spark, Elasticsearch, MapReduce, YARN, Flume, Hive, Pig, Kafka, Storm, Sqoop, HBase, Hortonworks, Cloudera, Mahout, Avro, and Scala.
  • Developed scripts and numerous batch jobs to schedule various Hadoop programs.
  • Experience in analyzing data using HiveQL, PIG Latin, and custom MapReduce programs in Java.
  • Hands on experience in importing and exporting data from different databases like Oracle, Teradata into HDFS and Hive using Sqoop.
  • Extensive experience in collecting and storing stream data like log data in HDFS using Apache Flume.
  • Extensively used MapReduce design patterns to solve complex MapReduce problems.
  • Developed Hive and PIG queries for data analysis to meet the business requirements.
  • Experience in extending Hive and Pig core functionality by writing custom UDFs, UDAFs, and UDTFs (see the UDF sketch after this list).
  • Experienced in implementing security mechanisms for Hive data.
  • Experience with Hive query performance tuning.
  • Strong experience in architecting real-time streaming applications and batch-style, large-scale distributed computing applications using tools like Spark Streaming, Spark SQL, Flume, MapReduce, and Hive.
  • Experienced in improving the data cleansing process using Pig Latin operations, transformations, and joins.
  • Extensive knowledge in NoSQL databases like HBase, Cassandra, MongoDB, CouchDB.
  • Experienced in performing CRUD operations using the HBase Java client API and REST API.
  • Good knowledge on Cassandra, DataStax Enterprise, DataStax OpsCenter and CQL.
  • Experience with Oozie Workflow Engine to automate and parallelize Hadoop Map/Reduce, Hive and PIG jobs.
  • Experienced with processing different file formats like Avro, XML, JSON and Sequence file formats using MapReduce programs.
  • Experience in integrating Apache Kafka with Apache Storm and created Storm data pipelines for real time processing.
  • Excellent Java development skills using J2EE frameworks like Spring, Hibernate, EJBs, and Web Services.
  • Implemented SOAP and RESTful Web Services.
  • Exposed to each of the phases of complete Software Development Life Cycle (SDLC).
  • Extensively worked with Object-Oriented Analysis and Design (OOAD) and development of software using UML methodology; good knowledge of J2EE design patterns and core Java design patterns.
  • Good knowledge of creating PL/SQL stored procedures, packages, functions, and cursors with Oracle (9i, 10g, 11g) and MySQL Server.
  • Worked with JUnit, EasyMock, and MRUnit to implement test cases.
  • Good knowledge of versioning tools like ClearCase, Perforce, Subversion, and CVS.
  • Exposed to methodologies like Scrum, Agile, and Waterfall.
  • Multi-cultured team player with complete flexibility to work independently as well as in a team, with quick grasping capabilities for newly emerging technologies.
  • Motivated high flier with excellent verbal/written communication skills, admirable presentation capabilities, and the ability to gather requirements efficiently and convey them effectively to other members of the team.
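
As an illustration of the custom Hive UDF work mentioned above, the following is a minimal sketch in Java, assuming a simple string-normalization rule; the package, class, and column semantics are hypothetical and not taken from any specific project below.

    package com.example.hive.udf;                        // illustrative package name

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    /** Minimal Hive UDF sketch: trims and upper-cases a text column. */
    public final class NormalizeCode extends UDF {
        public Text evaluate(final Text input) {
            if (input == null) {
                return null;                              // pass NULLs through unchanged
            }
            return new Text(input.toString().trim().toUpperCase());
        }
    }

Once packaged into a JAR, a UDF like this would typically be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in HiveQL.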

TECHNICAL SKILLS

Big Data Technologies: HDFS, Hive, Map Reduce, Pig, Sqoop, Flume, Oozie, Zookeeper, YARN, Spark, Kafka, Storm.

Scripting Languages: Shell, Python, Scala.

Languages: C, C++, Java, SQL, PL/SQL, PIG Latin, HiveQL, Unix Shell Scripting.

Front End Technologies: HTML, XHTML, CSS, XML, JavaScript, AJAX, Servlets, JSP.

Java Frameworks: MVC, Apache Struts2.0, Spring and Hibernate.

Web Services: SOAP (JAX-WS), WSDL, Apache CXF, Apache Axis, SOA, RESTful (JAX-RS), JMS.

Application Servers: Apache Tomcat 5.5/6.0/7, WebLogic Server 8x/9x/10x, WebSphere 5.1/6.0, JBoss.

Databases: Oracle 9i/10g/11g, IBM DB2, MySQL, MS SQL Server.

NoSQL Databases: HBase, MongoDB, Cassandra.

IDE: Eclipse, NetBeans.

Operating Systems: Linux, UNIX, Mac, Windows 7/8/10.

Reporting Tools: Tableau, Talend.

PROFESSIONAL EXPERIENCE

Confidential, Wilmington, DE

Sr Hadoop Developer

Responsibilities:

  • Extracted data from Teradata and MySQL into HDFS using Sqoop import/export.
  • Expertise in using data organization design patterns in MapReduce to convert business data into custom formats.
  • Analyzed large data sets by running Hive queries and Pig scripts.
  • Used the partitioning pattern in MapReduce to move records into categories (see the partitioner sketch after this list).
  • Performed advanced procedures like text analytics and processing, using the in-memory computing capabilities of Spark.
  • Developed Spark scripts using Scala shell commands.
  • Expertise in optimizing MapReduce algorithms using combiners, partitioners, and the distributed cache to deliver the best results.
  • Optimized MapReduce jobs to use HDFS efficiently by applying Gzip and LZO compression techniques.
  • Created SolrCloud collections to load charge code data and serve results to end users via Solr/Lucene queries with low-latency requirements.
  • Implemented Hive generic UDFs to validate business rules specific to each category.
  • Performed performance tuning of Hive queries, such as filtering on partition fields and optimizing join performance.
  • Wrote Pig scripts to transform raw data from several data sources into baseline data.
  • Exported analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
  • Used UDFs to implement business logic in Hadoop and managed data coming from different sources.
  • Automated the process for extraction of data from warehouses and weblogs into HIVE tables by developing workflows and coordinator jobs in Oozie.
  • Wrote shell scripts to monitor the health of Hadoop daemon services and responded to any warning or failure conditions reported by Ganglia.
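
The partitioning-pattern bullet above could look roughly like the sketch below: a custom MapReduce Partitioner that routes each record to a reducer based on a business-category prefix in the key. The key layout and class name are assumptions for illustration.

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Partitioner;

    /** Routes records to reducers by business category so each output file holds one category. */
    public class CategoryPartitioner extends Partitioner<Text, IntWritable> {
        @Override
        public int getPartition(Text key, IntWritable value, int numPartitions) {
            // the key is assumed to be "<category>|<recordId>"; partition on the category prefix
            String category = key.toString().split("\\|", 2)[0];
            return (category.hashCode() & Integer.MAX_VALUE) % numPartitions;
        }
    }

In the job driver this would be wired in with job.setPartitionerClass(CategoryPartitioner.class) alongside an appropriate number of reduce tasks.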

Environment: Apache Hadoop 1.1.2, MapR, MapReduce, HDFS, Hive, PIG, Kafka, Oozie, Sqoop, Flume, Apache Solr, Java, SQL, Eclipse, Unix Script, MySQL, and Ganglia.

Confidential, Hoffman Estates, IL

Hadoop Developer

Responsibilities:

  • Worked on a live Hadoop production CDH5 cluster with 50 nodes.
  • Worked with highly unstructured and semi structured data of 40 TB in size.
  • Analyzed Hadoop clusters and other analytical tools used in Big Data, such as Hive and Pig, and databases like HBase.
  • Used Sqoop extensively to ingest data from various source systems into HDFS.
  • Written Hive queries for data analysis to meet the business requirements.
  • Created Hive tables and worked on them using Hive QL.
  • Installed the cluster and worked on commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration.
  • Assisted in managing and reviewing Hadoop log files.
  • Assisted in loading large sets of data (structured, semi-structured, unstructured).
  • Wrote complex Pig UDF jobs for business transformations.
  • Worked with the Data Science team, the Teradata team, and the business to gather requirements for various data sources like web scrapes and APIs.
  • Involved in creating Hive/Impala tables, and loading and analyzing data using Hive queries.
  • Developed Simple to complex MapReduce Jobs using Hive and Pig.
  • Involved in running Hadoop jobs for processing millions of records and applying compression techniques.
  • Developed multiple MapReduce jobs in Java for data cleaning and pre-processing (see the mapper sketch after this list).
  • Involved in loading data from LINUX file system to HDFS, and wrote shell scripts for productionizing the MAP (Member Analytics Platform) project.
  • Loaded and transformed large sets of structured and semi-structured data.
  • Loaded the golden collection into Apache Solr using morphline code for the business team.
  • Assisted in exporting analyzed data to relational databases using Sqoop.
  • Performed data modeling in HBase for large transactional sales data.
  • Built a proof of concept on Storm for streaming data from one of the sources.
  • Built a proof of concept in Pentaho for Big Data.
  • Implemented one of the data source transformations in Spark.
  • Worked in Agile methodology and used Scrum for Development and tracking the project.
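
A data-cleaning MapReduce job of the kind mentioned above might use a map-only pass like this sketch, which keeps well-formed delimited records and counts the rest; the field delimiter and expected field count are assumptions.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    /** Map-only cleansing step: emits well-formed pipe-delimited records, counts and drops the rest. */
    public class RecordCleansingMapper extends Mapper<LongWritable, Text, NullWritable, Text> {
        private static final int EXPECTED_FIELDS = 12;    // assumed layout of the source feed

        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            String[] fields = line.toString().split("\\|", -1);
            if (fields.length == EXPECTED_FIELDS && !fields[0].isEmpty()) {
                context.write(NullWritable.get(), line);   // keep the record as-is
            } else {
                context.getCounter("cleansing", "malformed").increment(1);
            }
        }
    }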

Environment: HDFS, CDH5.3.2, Apache Spark 4.1, Kafka, Storm 0.9.5, Cassandra 2.2.0, Hive, Pig, Scala, Java, Sqoop, SQL, Shell scripting.

Confidential, Melville, NY

Hadoop Developer

Responsibilities:

  • Analyzed large data sets by running Hive queries and Pig scripts.
  • Worked with the Data Science team to gather requirements for various data mining projects.
  • Involved in creating Hive tables, and loading and analyzing data using Hive queries (see the Hive query sketch after this list).
  • Developed Simple to complex MapReduce Jobs using Hive and Pig.
  • Involved in running Hadoop jobs for processing millions of records of text data.
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • Developed multiple MapReduce jobs in java for data cleaning and preprocessing.
  • Involved in loading data from LINUX file system to HDFS.
  • Responsible for managing data coming from multiple sources.
  • Assisted in exporting analyzed data to relational databases (MySQL) using Sqoop.
  • Created and maintained Technical documentation for launching HADOOP Clusters and for executing Hive queries and Pig Scripts.
  • Generated Tableau reports and built dashboards.
  • Worked closely with business units to define development estimates according to Agile Methodology.
  • Worked on a CDH 4.6 cluster of 48 nodes, each node with 3 TB of storage and 32 GB of RAM.
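
The Hive analysis bullets above do not spell out how queries were submitted; one common access path from Java is HiveServer2 over JDBC, sketched below as an assumed setup. The host, database, table, and column names are placeholders.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    /** Runs a HiveQL aggregation through HiveServer2's JDBC interface and prints the result. */
    public class HiveQueryRunner {
        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:hive2://hiveserver2-host:10000/default", "hive", "");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery(
                     "SELECT region, COUNT(*) FROM member_events GROUP BY region")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1) + "\t" + rs.getLong(2));
                }
            }
        }
    }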

Environment: CDH4.6, HDFS, Pig, Hive, MapReduce, Cassandra, LINUX, Tableau 8.2, shell scripting, and Big Data.

Confidential, Atlanta, GA

Hadoop Developer

Responsibilities:

  • Used Sqoop to import customer information data from MySQL database into HDFS for data processing.
  • Developed a workflow using Oozie to automate the tasks of loading data into HDFS and analyzing the data.
  • Developed MapReduce jobs to calculate the total usage of data by commercial routers in different locations (see the reducer sketch after this list).
  • Developed MapReduce programs for data sorting in HDFS.
  • Optimized Hive queries to extract customer information from HDFS or HBase.
  • Developed Pig Latin scripts to aggregate the log files of the business clients.
  • Loaded and transformed large sets of structured, semi structured data using Pig Scripts.
  • Involved in loading data from UNIX file system to HDFS.
  • Wrote MapReduce jobs to generate reports on the number of activities created on a particular day from data dumped from multiple sources; the output was written back to HDFS.
  • Designed workflows by scheduling Hive processes for Log file data, which is streamed into HDFS using Flume.
  • Exposure to Amazon Web Services - AWS cloud computing (EMR, EC2 and S3 services).
  • Created a load balancer on AWS EC2 for an unstable cluster.
  • Reviewed the HDFS usage and system design for future scalability and fault-tolerance.
  • Involved in writing shell scripts for scheduling and automating tasks.
  • Worked on Hive for further analysis and for transforming files from different analytical formats to text files.
  • Involved in ETL, data integration, and migration; imported data using Sqoop to load data from Oracle to HDFS on a regular basis.
  • Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive for optimized performance.
  • Managed and reviewed Hadoop log files to identify issues when jobs fail.
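
The router-usage MapReduce job above reduces to a per-location sum; a reducer for that step could look like this sketch, assuming the mapper emits (location, bytesUsed) pairs. The class and key names are illustrative.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    /** Sums the per-record byte counts emitted by the mapper, keyed by router location. */
    public class UsageSumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        private final LongWritable total = new LongWritable();

        @Override
        protected void reduce(Text location, Iterable<LongWritable> usages, Context context)
                throws IOException, InterruptedException {
            long sum = 0L;
            for (LongWritable usage : usages) {
                sum += usage.get();
            }
            total.set(sum);
            context.write(location, total);
        }
    }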

Environment: Hadoop, HDFS, Pig, Hive, MapReduce, Sqoop, Oozie, Java, Linux Shell Scripting and Big Data.

Confidential, Fayetteville, NY

Hadoop Developer

Responsibilities:

  • Involved in the full life cycle of the project, from design and analysis through logical and physical architecture modeling, development, implementation, and testing.
  • Responsible to manage data coming from different sources and involved in HDFS maintenance and loading of structured and unstructured data.
  • Developed MapReduce programs to parse the raw data and store the refined data in tables.
  • Designed and modified database tables and used HBase queries to insert and fetch data from tables (see the HBase client sketch after this list).
  • Involved in moving all log files generated from various sources to HDFS for further processing through Flume.
  • Developed algorithms for identifying influencers within specified social network channels.
  • Involved in loading and transforming large sets of structured, semi structured and unstructured data from relational databases into HDFS using Sqoop imports.
  • Analyzed data with Hive, Pig, and Hadoop Streaming.
  • Responsible for analyzing and cleansing raw data by performing Hive queries and running Pig scripts on data.
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Created Hive tables, loaded data, and wrote Hive queries that run internally as MapReduce jobs.
  • Used OOZIE Operational Services for batch processing and scheduling workflows dynamically.
  • Populated HDFS and Cassandra with huge amounts of data using Apache Kafka.
  • Experienced in working with Apache Storm.
  • Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
  • Performed data mining investigations to find new insights related to customers.
  • Involved in forecasting based on present results and insights derived from data analysis.
  • Involved in collecting data and identifying data patterns to build a trained model using machine learning.
  • Developed and generated insights based on brand conversations, which in turn helped effectively drive brand awareness, engagement, and traffic to social media pages.
  • Developed different formulas for calculating engagement on social media posts.
  • Involved in reviewing technical documentation and providing feedback.
  • Involved in fixing issues arising out of duration testing.
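
The HBase insert/fetch bullet above maps onto the Java client API roughly as follows, assuming the HBase 1.x client; the table, row key, and column names are illustrative.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    /** Writes one row to an HBase table and reads it back with the Java client API. */
    public class HBaseRoundTrip {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Table table = connection.getTable(TableName.valueOf("social_posts"))) {
                Put put = new Put(Bytes.toBytes("post#1001"));
                put.addColumn(Bytes.toBytes("m"), Bytes.toBytes("likes"), Bytes.toBytes("42"));
                table.put(put);

                Result result = table.get(new Get(Bytes.toBytes("post#1001")));
                byte[] likes = result.getValue(Bytes.toBytes("m"), Bytes.toBytes("likes"));
                System.out.println("likes = " + Bytes.toString(likes));
            }
        }
    }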

Environment: Java, NLP, HBase, Machine Learning, Hadoop, HDFS, MapReduce, Hortonworks, Hive, Apache Storm, Sqoop, Flume, Oozie, Apache Kafka, ZooKeeper, MySQL, and Eclipse.

Confidential

Java/J2EE Developer

Responsibilities:

  • Involved in Analysis, Designing, Development and Testing phases of the application.
  • Involved in creation and maintenance of the backend services using multithreading, Spring, Hibernate, SQL Server, and Oracle.
  • Developed Web pages using JSPs with Tag libraries, HTML, and JavaScript.
  • Wrote J2EE code using Spring and Hibernate to upload input CSV files for credit risk data.
  • Implemented the Dependency Injection (IoC) feature of the Spring framework to inject dependencies into objects, and used AOP for logging.
  • Designed and developed the persistence layer on an ORM framework using Hibernate (see the DAO sketch after this list).
  • Implemented various Design patterns like Business Delegate, Data Transfer Objects DTO, Service locator, Session Facade and Data Access Objects DAO patterns.
  • Involved in writing SQL, stored procedures, and PL/SQL for the back end; used views and functions at the Oracle database end.
  • Developed various documents within the application using XML, with Eclipse as the IDE tool.
  • Developed SOAP requests to interact with billing schedule system.
  • Used Web Services (SOAP & WSDL) to exchange data between Server part and client.
  • Integrated and deployed the application on the WebLogic application server using ANT.
  • Developed user interfaces for presenting the expense reports, transaction details using JSP, XML, HTML and Java Script.
  • Used Log4J for logging the application exceptions and debugging statements.
  • Performed object-oriented design using UML with Rational Rose.
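
The persistence-layer bullet above suggests a Hibernate-backed DAO wired through Spring dependency injection; a minimal sketch follows. The entity, its fields, and the DAO name are hypothetical, and the SessionFactory and transaction manager are assumed to be configured in the Spring context.

    import javax.persistence.Entity;
    import javax.persistence.Id;
    import org.hibernate.SessionFactory;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Repository;
    import org.springframework.transaction.annotation.Transactional;

    /** Illustrative mapped entity (getters/setters omitted for brevity). */
    @Entity
    class CreditRiskRecord {
        @Id private Long id;
        private String counterparty;
        private double exposure;
    }

    /** Persistence-layer sketch: Hibernate DAO with a Spring-injected SessionFactory. */
    @Repository
    public class CreditRiskRecordDao {
        private final SessionFactory sessionFactory;

        @Autowired
        public CreditRiskRecordDao(SessionFactory sessionFactory) {
            this.sessionFactory = sessionFactory;          // injected by the Spring container
        }

        @Transactional
        public void save(CreditRiskRecord record) {
            sessionFactory.getCurrentSession().save(record);
        }

        @Transactional(readOnly = true)
        public CreditRiskRecord findById(long id) {
            return (CreditRiskRecord) sessionFactory.getCurrentSession().get(CreditRiskRecord.class, id);
        }
    }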

Environment: JDK, JSP, Tiles, HTML, Java Script, WebLogic, Eclipse, Spring JDBC/ORM/DI, JSF, JPA/Hibernate, Spring, PL/SQL, Windows, CVS, Log4J, Ant.

Confidential

Java/J2EE Developer

Responsibilities:

  • Participated in the designing of the Web framework using Struts framework as a MVC design paradigm.
  • Involved in the entire life cycle development of the application; reviewed and analyzed the data model for developing the presentation layer and value objects.
  • Used HTML, CSS, XHTML, and DHTML in view pages for the front end; extensively involved in developing the web interface using JSP and JSP Standard Tag Libraries (JSTL) with the Struts framework.
  • Used Struts and JavaScript for client-side validation and Struts tag libraries to develop the JSP pages.
  • Used JSTL in the presentation tier and Spring for dependency injection; configured Struts Validator forms, message resources, action errors, validation.xml, and validator-rules.xml.
  • Involved in writing client-side scripts using JavaScript and developed the controller using the ActionServlet and action mappings provided by the Struts framework (see the Action sketch after this list).
  • Wrote Hibernate configuration and mapping XML files for database access and developed various Java objects (POJOs) as persistence classes for O/R mapping with databases.
  • Developed SQL stored procedures and prepared statements for updating and accessing data from database.
  • Development was carried out in the Eclipse Integrated Development Environment (IDE), using ClearCase version control for project configuration management.
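
The Struts controller bullet above corresponds to an Action class of roughly this shape; the action name, request parameter, and forward name are illustrative and would be mapped in struts-config.xml.

    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import org.apache.struts.action.Action;
    import org.apache.struts.action.ActionForm;
    import org.apache.struts.action.ActionForward;
    import org.apache.struts.action.ActionMapping;

    /** Struts 1.x controller sketch: reads a request parameter and forwards to the result JSP. */
    public class SearchAccountAction extends Action {
        @Override
        public ActionForward execute(ActionMapping mapping, ActionForm form,
                                     HttpServletRequest request, HttpServletResponse response) {
            String accountId = request.getParameter("accountId");
            request.setAttribute("accountId", accountId);   // consumed by the result JSP
            return mapping.findForward("success");          // forward defined in struts-config.xml
        }
    }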

Environment: J2EE, Hibernate, Struts 1.2, Spring 2.5, EJB, JSP, JSTL, Servlets, Apache Axis 1.2, JavaScript, HTML, XML, JUnit, Eclipse, TOAD, Apache Tomcat, Clearcase, Oracle9i.

Confidential

Java/J2EE Developer

Responsibilities:

  • Developed web components using JSP, Servlets and JDBC.
  • Designed tables and indexes.
  • Designed, Implemented, Tested and Deployed Enterprise Java Beans both Session and Entity using WebLogic as Application Server.
  • Developed stored procedures, packages, and database triggers to enforce data integrity; performed data analysis and created Crystal Reports for user requirements.
  • Provided quick turnaround, resolving issues within the SLA.
  • Implemented the presentation layer with HTML, XHTML and JavaScript.
  • Used EJBs to develop business logic and coded reusable components in Java Beans.
  • Developed database interaction code using the JDBC API, making extensive use of SQL query statements and advanced PreparedStatements (see the DAO sketch after this list).
  • Used connection pooling for optimization through the JDBC interface.
  • Used EJB entity and session beans to implement business logic and session handling and transactions. Developed user-interface using JSP, Servlets, and JavaScript.
  • Wrote complex SQL queries and stored procedures.
  • Actively involved in the system testing.
  • Prepared the installation guide, customer guide, and configuration document, which were delivered to the customer along with the product.
  • Involved in the development and testing phases of the project following Agile methodology.
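
The JDBC and connection-pooling bullets above amount to borrowing pooled connections from a container-managed DataSource and issuing PreparedStatements; a sketch follows. The JNDI name, table, and columns are assumptions for illustration.

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import javax.naming.InitialContext;
    import javax.naming.NamingException;
    import javax.sql.DataSource;

    /** JDBC sketch: borrows a pooled connection from a container DataSource and runs a PreparedStatement. */
    public class AccountDao {
        private final DataSource dataSource;

        public AccountDao() throws NamingException {
            // JNDI name is illustrative; the pool itself is configured in the application server
            this.dataSource = (DataSource) new InitialContext().lookup("java:comp/env/jdbc/BankDS");
        }

        public double findBalance(String accountId) throws SQLException {
            String sql = "SELECT balance FROM accounts WHERE account_id = ?";
            try (Connection conn = dataSource.getConnection();
                 PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setString(1, accountId);
                try (ResultSet rs = ps.executeQuery()) {
                    return rs.next() ? rs.getDouble(1) : 0.0;
                }
            }
        }
    }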

Environment: Windows NT/2000/2003, XP, Windows 7/8, C, Java, UNIX, SQL using TOAD, Finacle core banking, CRM 10209, Microsoft Office Suite, Microsoft Project.
