
Hadoop Developer Resume


Sunnyvale, California

SUMMARY:

  • Software Engineer with 6+ years of overall IT experience, including 4+ years of industrial experience with the Hadoop ecosystem and 2+ years of industrial experience with Java/J2EE and PHP technologies.
  • Expertise in streaming tools such as Kafka and Spark Streaming for real-time analysis of high-volume live data.
  • Expertise in partitioning and bucketing in Hive; designed both managed and external Hive tables to optimize performance.
  • Expertise in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java and Scala.
  • Hands-on experience creating Pig and Hive UDFs in Java to analyze data efficiently.
  • Extensive experience importing and exporting data with Sqoop between HDFS/Hive/HBase and relational database systems (RDBMS).
  • Experienced with scripting technologies such as Pig, Python, and shell scripting (Bash).
  • Experienced with administration and data modification on NoSQL databases such as HBase, MongoDB, and Cassandra.
  • Excellent knowledge of data interchange and representation formats such as CSV, JSON, Avro, Parquet, and XML.
  • Excellent knowledge of designing and developing applications using Agile methodology and the Scrum framework.
  • Experienced in installing and configuring Hadoop tools such as HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Spark, and Kafka.
  • Experienced working with Hadoop storage and analytics frameworks on the AWS cloud using tools such as SSH and PuTTY.
  • Experienced with AWS instances in the Amazon cloud for interim solutions.
  • Experienced with Cassandra clusters in the cloud (AWS) environment, with nodes scaled per business requirements.
  • Experienced in Java development using Hibernate, Servlets, JUnit, JavaScript, JSON, and JDBC.
  • Hands-on experience with Maven for dependency management and project structure, and with build and deployment tools such as Jenkins.
  • Hands-on experience with Hortonworks and Cloudera Hadoop environments.
  • Extensive knowledge of data acquisition from sources such as MySQL Server and Oracle.

TECHNICAL SKILLS:

Big Data Skillset - Frameworks & Environments: Hadoop, Spark, Spark Streaming, Kafka, Hive, Sqoop, Pig, Avro, Parquet, Zookeeper, AWS EC2, AWS S3, AWS EMR, AWS Elasticsearch.

Databases: Cassandra, Oracle 11g/12c, MySQL, HBase.

Web services & Technologies: HTML5, jQuery, CSS3, XML 1.1, PHP.

J2EE Technologies: JDBC, Hibernate framework, Spring framework, Servlet.

WORK EXPERIENCE:

Confidential, Sunnyvale, California

Hadoop Developer

Responsibilities:
  • Performed data cleaning in Spark on raw data before storing it in HDFS as an offline data source.
  • Extracted data from Kafka, converted it into DStreams in Spark Streaming, and performed transformations to meet feature requirements such as finding the currently trending show within a region.
  • Consumed raw data from a Kafka cluster across roughly a hundred topics to generate distinct datasets as training data for machine-learning models.
  • Replaced and complemented existing Spark batch jobs with Spark Streaming jobs to enable real-time data analysis.
  • Extracted historical data from offline sources to enrich the view information of real-time streaming data.
  • Extracted and merged user-interaction data from related Kafka topics and converted it into actionable insights for further analysis.
  • Updated existing batch and streaming jobs to adopt the latest attribute values from multiple pipeline updates.
  • Created and updated dozens of Hive tables for offline metadata storage.
  • Tuned streaming jobs by experimenting with the micro-batch interval to handle peak traffic.
  • Loaded offline video metadata from the Hive database and joined it with transformed RDDs to generate the required datasets.
  • Developed and tested using Zeppelin, Spark Shell, and Eclipse.
  • Committed and deployed using GitHub and Jenkins.
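The trending-show aggregation described above can be sketched in plain Python (no Spark dependency; the event fields `region` and `show` are hypothetical stand-ins for the actual stream schema):

```python
from collections import Counter, defaultdict

def trending_by_region(events):
    """Given one micro-batch of view events (dicts with hypothetical
    'region' and 'show' fields), return the most-watched show per region."""
    counts = defaultdict(Counter)
    for e in events:
        counts[e["region"]][e["show"]] += 1
    # most_common(1) returns [(show, count)]; keep just the show name
    return {region: shows.most_common(1)[0][0] for region, shows in counts.items()}

batch = [
    {"region": "US", "show": "A"},
    {"region": "US", "show": "A"},
    {"region": "US", "show": "B"},
    {"region": "EU", "show": "C"},
]
print(trending_by_region(batch))  # {'US': 'A', 'EU': 'C'}
```

In the actual job this per-batch logic would run inside a DStream transformation; the sketch only shows the aggregation itself.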

Environment: Spark 1.6.x, Kafka 0.8.2.x, Zookeeper, Hive, Pig, Parquet.

Confidential, Sunnyvale, California

Hadoop Developer

Responsibilities:
  • Implemented micro-batch processing using Spark Streaming to directly update price, inventory, and other details in indexes.
  • Merged real-time data with historical signal data updated at different frequencies.
  • Updated the latest attribute values from multiple pipeline updates.
  • Updated the dynamically changing product catalog and other features such as in-store and online availability.
  • Performed transformations and actions on RDDs and Spark Streaming data.
  • Used Spark for interactive queries, streaming data processing, and integration with a Cassandra database on large volumes of data.
  • Captured catalog updates in Kafka, processing up to 8,000 events per second.
  • Developed using Spark Shell and Eclipse.
  • Committed and deployed using GitHub, Maven, and Jenkins.
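A minimal sketch of the "latest value wins" merge used when attribute updates arrive from multiple pipelines, in plain Python (the tuple layout and field names are hypothetical, not the production schema):

```python
def merge_latest(updates):
    """Reduce a stream of (sku, attribute, value, timestamp) updates to the
    most recent value per (sku, attribute), regardless of source pipeline."""
    latest = {}
    for sku, attr, value, ts in updates:
        key = (sku, attr)
        # keep the update with the newest timestamp
        if key not in latest or ts > latest[key][1]:
            latest[key] = (value, ts)
    return {k: v for k, (v, _) in latest.items()}

updates = [
    ("sku1", "price", 9.99, 100),   # price pipeline
    ("sku1", "price", 8.99, 120),   # later correction wins
    ("sku1", "stock", 3, 110),      # inventory pipeline
]
print(merge_latest(updates))  # {('sku1', 'price'): 8.99, ('sku1', 'stock'): 3}
```

In production the same reduction would run per micro-batch before writing to the index; timestamps resolve conflicts between pipelines that update at different frequencies.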

Environment: Spark 1.2.x, Spark 1.3.x, Spark 1.4.x, Kafka 0.8.1.x, Kafka 0.8.2.x, Hive, Pig, Zookeeper, Cassandra

Confidential, San Jose, California

Hadoop Developer

Responsibilities:
  • Created and maintained a Hive warehouse for analysis of user behavior and transaction patterns.
  • Implemented various Hive queries to generate aggregated datasets for further analysis.
  • Configured Sqoop and developed scripts to extract historical datasets from RDBMS to HDFS on a weekly basis.
  • Created Pig scripts and Pig UDFs to pre-process data, enriching the original data structure into nested and multivalued forms.
  • Developed reusable Pig UDFs for customized loading, storing, filtering, grouping, and joining.
  • Created and modified Hive UDFs and UDAFs to generate business reports.
  • Developed UDFs in Java as needed for use in Pig and Hive queries.
  • Implemented Oozie workflows to automate loading data into HDFS and pre-processing, analyzing, and training the classifier using MapReduce, Pig, and Hive jobs.
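The per-record logic that a Pig or Hive UDF encapsulates can be illustrated in plain Python (the actual UDFs were written in Java against the Pig/Hive APIs; this currency-normalization rule is a hypothetical example, not the project's):

```python
def normalize_amount(raw):
    """UDF-style scalar transform: parse a currency string like '$1,234.50'
    into a float, returning None (the equivalent of Hive NULL) for bad input."""
    if raw is None:
        return None
    try:
        return float(raw.strip().lstrip("$").replace(",", ""))
    except ValueError:
        return None

rows = ["$1,234.50", "99", "bad", None]
print([normalize_amount(r) for r in rows])  # [1234.5, 99.0, None, None]
```

A real Hive UDF wraps exactly this kind of pure per-row function in an `evaluate` method so it can be called from HiveQL.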

Environment: Hadoop 2.x, HDFS, Sqoop, Pig, Hive, Core Java, Linux, Maven, Git.

Confidential, San Mateo, California

Java Developer

Responsibilities:
  • Designed and developed Enterprise Eligibility business objects and domain objects with an object-relational mapping framework (Hibernate).
  • Involved in coding and integrating several business-critical modules of the application using Java, Spring, Hibernate, and REST web services on an application server.
  • Developed the web application using Java/J2EE (Spring Framework).
  • Developed Servlets and JSPs based on the MVC pattern using the Spring Framework.
  • Created procedures and functions and wrote complex SQL queries against the database.
  • Used Log4j to create log files for debugging and tracing the application.

Environment: Java EE, Spring, Servlet, Java 1.6, Java 1.7, Oracle, HTML, CSS, JavaScript, jQuery, Eclipse, Hibernate.

Confidential, Foster City, California

Java Developer

Responsibilities:
  • Participated in regular code reviews for projects migrating from legacy systems written heavily in C++ to Java.
  • Performed performance/load profiling on GoFundMe services with open-source Java-based tools.
  • Implemented test plans and test cases based on high-level and detailed designs.
  • Documented and communicated test results.
  • Performed code profiling using an open-source tool.
  • Built the Java backend, JSPs, Servlets, and business classes.
  • Set up stage4, a production-like staging environment.
  • Contributed to regular status meetings, reporting bugs, problems, and risks.

Environment: Java EE, Spring, Servlet, Java 1.6.
