Hadoop Developer Resume
Sunnyvale, California
SUMMARY:
- Software Engineer with 6+ years of overall IT experience, including 4+ years of industry experience with the Hadoop ecosystem and 2+ years with Java/J2EE and PHP technologies.
- Expertise in streaming tools such as Kafka and Spark Streaming, performing real-time analysis on high volumes of live data.
- Expertise in Hive partitioning and bucketing; designed both managed and external Hive tables to optimize query performance.
- Expertise in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs written in Java and Scala.
- Hands-on experience creating Pig and Hive UDFs in Java to analyze data efficiently.
- Extensive experience importing and exporting data with Sqoop between HDFS/Hive/HBase and relational database systems (RDBMS).
- Experienced with scripting technologies including Pig, Python, and shell scripting (Bash).
- Experienced in administering and modifying data in NoSQL databases such as HBase, MongoDB, and Cassandra.
- Excellent knowledge of data interchange and serialization formats such as CSV, JSON, Avro, Parquet, and XML.
- Excellent knowledge of designing and developing applications using Agile methodology and the Scrum framework.
- Experienced in installing and configuring Hadoop ecosystem tools such as HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Spark, and Kafka.
- Experienced in running Hadoop storage and analytics frameworks on AWS, using tools such as SSH and PuTTY.
- Experienced with AWS instances in the Amazon cloud for interim solutions.
- Experienced with Cassandra clusters in a cloud (AWS) environment, scaling nodes to meet business requirements.
- Experienced in Java development using Hibernate, Servlets, JUnit, JavaScript, JSON, and JDBC.
- Hands-on experience with Maven for dependency management and project structure, and with build and deployment tools such as Jenkins.
- Hands-on experience with Hortonworks and Cloudera Hadoop distributions.
- Extensive knowledge of data acquisition from sources such as MySQL Server and Oracle.
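The partitioned managed/external Hive table work above can be sketched in HiveQL. This is a minimal, hypothetical example (table, column, and path names are invented for illustration): an external table keeps its Parquet files in HDFS even if the table is dropped, while partitioning and bucketing prune and parallelize queries.

```sql
-- Hypothetical external table: dropping it removes only the metadata,
-- not the underlying Parquet files in HDFS.
CREATE EXTERNAL TABLE IF NOT EXISTS user_events (
  user_id  BIGINT,
  show_id  STRING,
  watch_ms BIGINT
)
PARTITIONED BY (event_date STRING)       -- prunes scans to matching dates
CLUSTERED BY (user_id) INTO 32 BUCKETS   -- speeds up joins/sampling on user_id
STORED AS PARQUET
LOCATION '/data/warehouse/user_events';
```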
TECHNICAL SKILLS:
Big Data Skillset - Frameworks & Environments: Hadoop, Spark, Spark Streaming, Kafka, Hive, Sqoop, Pig, Avro, Parquet, Zookeeper, AWS EC2, AWS S3, AWS EMR, AWS Elasticsearch.
Databases: Cassandra, Oracle 11g/12c, MySQL, HBase.
Web services & Technologies: HTML5, jQuery, CSS3, XML 1.1, PHP.
J2EE Technologies: JDBC, Hibernate framework, Spring framework, Servlet.
WORK EXPERIENCE:
Confidential, Sunnyvale, California
Hadoop Developer
Responsibilities:
- Performed data cleaning on raw data in Spark before storing it into HDFS as an offline data source.
- Extracted data from Kafka, converted it into DStreams in Spark Streaming, and performed transformations to meet feature requirements such as finding the currently trending show within a region.
- Consumed raw data from a Kafka cluster with around a hundred topics to generate distinct datasets used as training data for machine learning models.
- Replaced and complemented existing Spark batch jobs with Spark Streaming jobs to enable real-time data analysis.
- Extracted historical data from offline sources to enrich the view information of real-time streaming data.
- Extracted and merged user interaction data from related Kafka topics, converting it into actionable insights for further analysis.
- Updated existing batch and streaming jobs to adopt the latest attribute values arriving from multiple pipeline updates.
- Created and updated dozens of Hive tables for offline metadata storage.
- Tuned streaming jobs by experimenting with the micro-batch interval to handle peak traffic.
- Loaded offline video metadata from Hive to join with transformed RDDs and generate the required datasets.
- Developed and tested using Zeppelin, Spark Shell, and Eclipse.
- Committed and deployed using GitHub and Jenkins.
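The per-batch "trending show within a region" computation above can be sketched in plain Python (Spark-free so it runs standalone; event shape and names are hypothetical — in the actual job this logic would run as DStream transformations over each micro-batch):

```python
from collections import Counter, defaultdict

def trending_by_region(events):
    """Given (region, show_id) view events from one micro-batch,
    return the most-viewed show per region."""
    counts = defaultdict(Counter)
    for region, show in events:
        counts[region][show] += 1
    # most_common(1) yields the top (show, count) pair for each region
    return {region: c.most_common(1)[0][0] for region, c in counts.items()}

# Hypothetical micro-batch of view events
batch = [("US", "s1"), ("US", "s2"), ("US", "s1"), ("EU", "s3")]
print(trending_by_region(batch))  # {'US': 's1', 'EU': 's3'}
```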
Environment: Spark 1.6.x, Kafka 0.8.2.x, Zookeeper, Hive, Pig, Parquet.
Confidential, Sunnyvale, California
Hadoop Developer
Responsibilities:
- Implemented micro-batch processing using Spark Streaming to update price, inventory, and other details directly in the indexes.
- Merged real-time data with historical signals data updated at different frequencies.
- Kept the latest value of each attribute across multiple pipeline updates.
- Updated the dynamically changing product catalog and other features such as store and online availability.
- Performed transformations and actions on RDDs and Spark Streaming data.
- Used Spark for interactive queries, processing of streaming data, and integration with a Cassandra database over large volumes of data.
- Captured catalog updates in Kafka, processing up to 8,000 events per second.
- Developed using Spark Shell and Eclipse.
- Committed and deployed using GitHub, Maven, and Jenkins.
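The "latest value of each attribute across multiple pipeline updates" step above amounts to a last-write-wins merge keyed by (item, attribute). A minimal Python sketch, with hypothetical field names (in the actual job this would run inside the streaming pipeline):

```python
def latest_attributes(updates):
    """Collapse a stream of (sku, attribute, value, ts) updates,
    keeping only the most recent value per (sku, attribute)."""
    latest = {}  # (sku, attribute) -> (value, ts)
    for sku, attr, value, ts in updates:
        key = (sku, attr)
        if key not in latest or ts > latest[key][1]:
            latest[key] = (value, ts)
    return {key: value for key, (value, _) in latest.items()}

# Hypothetical updates arriving out of order from two pipelines
updates = [
    ("sku1", "price", 9.99, 1),
    ("sku1", "price", 8.49, 3),  # newer price wins
    ("sku1", "stock", 12, 2),
]
print(latest_attributes(updates))
# {('sku1', 'price'): 8.49, ('sku1', 'stock'): 12}
```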
Environment: Spark 1.2.x, Spark 1.3.x, Spark 1.4.x, Kafka 0.8.1.x, Kafka 0.8.2.x, Hive, Pig, Zookeeper, Cassandra.
Confidential, San Jose, California
Hadoop Developer
Responsibilities:
- Created and maintained a Hive warehouse for analysis of user behavior and transaction patterns.
- Implemented various Hive queries to generate aggregated datasets for further analysis.
- Configured Sqoop and developed scripts to extract historical datasets from an RDBMS into HDFS on a weekly basis.
- Created Pig scripts and Pig UDFs to pre-process the data, enriching the original structure into nested and multivalued data.
- Developed reusable Pig UDFs for customized loading, storing, filtering, grouping, and joining.
- Created and modified Hive UDFs and UDAFs to generate business reports.
- Developed UDFs in Java as needed for use in Pig and Hive queries.
- Implemented workflows in Oozie to automate loading data into HDFS and pre-processing, analyzing, and training the classifier using MapReduce, Pig, and Hive jobs.
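A weekly Sqoop extraction of the kind described above might look like the following command sketch (connection string, credentials, table, and paths are all hypothetical, and the command requires a working Sqoop installation):

```shell
# Hypothetical weekly import: pull last week's orders from MySQL into HDFS
sqoop import \
  --connect jdbc:mysql://db.example.com:3306/sales \
  --username etl_user \
  --password-file /user/etl/.db_pass \
  --table orders \
  --where "order_date >= DATE_SUB(CURDATE(), INTERVAL 7 DAY)" \
  --target-dir /data/raw/orders/$(date +%Y%m%d) \
  --num-mappers 4
```

In practice a command like this would be invoked from the Oozie workflow mentioned above, feeding the downstream Pig and Hive jobs.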
Environment: Hadoop 2.x, HDFS, Sqoop, Pig, Hive, Core Java, Linux, Maven, Git.
Confidential, San Mateo, California
Java Developer
Responsibilities:
- Designed and developed Enterprise Eligibility business and domain objects with an object-relational mapping framework, Hibernate.
- Coded and integrated several business-critical modules of the application using Java, Spring, Hibernate, and REST web services on an application server.
- Developed the web application using Java/J2EE (Spring framework).
- Developed Servlets and JSPs based on the MVC pattern using the Spring Framework.
- Created procedures and functions, and wrote complex SQL queries against the database.
- Used Log4j to create log files for debugging and tracing the application.
Environment: Java EE, Spring, Servlets, Java 1.6, Java 1.7, Oracle, HTML, CSS, JavaScript, jQuery, Eclipse, Hibernate.
Confidential, Foster City, California
Java Developer
Responsibilities:
- Participated in regular code reviews for projects migrating from legacy systems written heavily in C++ to Java.
- Performed performance/load profiling on GoFundMe services with open-source Java-based tools.
- Developed and implemented test plans and test cases based on the high-level and detailed designs.
- Documented and communicated test results.
- Performed code profiling using an open-source tool.
- Built Java backend components: JSPs, Servlets, and business classes.
- Set up stage4, a live production-like environment.
- Contributed to regular status meetings, reporting bugs, problems, and risks.
Environment: Java EE, Spring, Servlets, Java 1.6.