
Senior Hadoop Developer Resume


TX

SUMMARY

  • 5 years of experience across the software development life cycle, including design, development and support of systems and application architecture.
  • 4+ years of Big Data Hadoop ecosystem experience in ingestion, storage, querying, processing and analysis of big data.
  • Experience working with Hadoop clusters using the Cloudera (CDH3) distribution.
  • Good knowledge of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce concepts.
  • In-depth knowledge and hands-on experience in installing, configuring, monitoring and integrating Hadoop ecosystem components (HDFS, MapReduce, Pig, Hive, Sqoop, Flume, HBase, Oozie).
  • Good experience with the Cloudera, Hortonworks and Apache Hadoop distributions.
  • Extensively worked on the MRv1 and MRv2 (YARN) Hadoop architectures.
  • Designed and created Hive external tables using a shared metastore instead of Derby, with partitioning, dynamic partitioning and buckets.
  • Exposure to Spark, Kafka and Scala programming.
  • Expertise in database design, creation and management of schemas, and in writing stored procedures, functions, DDL, DML, SQL queries and data models.
  • Proficient in RDBMS concepts with Oracle, SQL Server and MySQL.
  • Strong experience in database design and in writing complex SQL queries and stored procedures.
  • Familiar with the Java Virtual Machine (JVM) and multi-threaded processing.
  • Extensively used ETL methodology to support data extraction, transformation and loading.
  • Experience in analyzing data using HiveQL, Pig Latin, HBase and custom MapReduce programs in Java and Python.
  • Experience in writing MapReduce programs using Apache Hadoop to work with big data.
  • Experience with NoSQL databases including HBase and Cassandra.
  • Experience in importing and exporting data with Sqoop between HDFS and relational database systems.
  • Extended Hive and Pig core functionality by writing custom UDFs (a minimal sketch follows this list).
  • Experience with Eclipse/RSA.
  • Knowledge of job workflow scheduling and monitoring tools such as Oozie and ZooKeeper; of NoSQL databases such as HBase and Cassandra; and of administrative tasks such as installing Hadoop and its ecosystem components (Flume, Oozie, Hive, Pig) and commissioning and decommissioning nodes.
  • Excellent teamwork and communication skills; research-minded, technically competent and results-oriented, with strong problem-solving abilities.
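
For illustration, a minimal sketch of the kind of custom Hive UDF referenced above. The class name, masking rule and usage are hypothetical; it simply masks all but the last four characters of a string value:

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical Hive UDF: masks all but the last four characters of a value
// (e.g. an account number) before the data is exposed in reporting tables.
public class MaskUdf extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;                         // pass NULLs through unchanged
        }
        String value = input.toString();
        if (value.length() <= 4) {
            return new Text(value);              // nothing to mask
        }
        StringBuilder masked = new StringBuilder();
        for (int i = 0; i < value.length() - 4; i++) {
            masked.append('*');
        }
        masked.append(value.substring(value.length() - 4));
        return new Text(masked.toString());
    }
}
```

After packaging into a JAR, such a function would typically be registered in Hive with `ADD JAR mask-udf.jar;` followed by `CREATE TEMPORARY FUNCTION mask AS 'MaskUdf';` (names hypothetical).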

TECHNICAL SKILLS

Programming Languages: Core Java, J2EE, Scala, XML, DB2, CICS, SQL, PL/SQL, HiveQL, Pig Latin

Hadoop Eco System: HDFS, YARN, MapReduce, Pig, Hive, Sqoop, Flume, ZooKeeper, Oozie, Apache Kafka and Kerberos

Hadoop Distributions: Cloudera, Hortonworks

Operating Systems: Linux, Unix, MVS, Windows

Non-Relational Databases: MongoDB, Cassandra, HBase

Relational Databases: DB2 V 9.0, MySQL, Microsoft SQL Server

Scripting Languages: Python, Shell Scripting

Application/Web Servers: Apache Tomcat, JBoss, WebSphere, MQ Series, DataPower, Web services

Tools: Endeavour, DataPower XI150 Appliance, SoapUI, JMeter, XML Harness, Labs testing tool

QA Tools: Quality Center

IDE: IntelliJ, Eclipse, NetBeans

PROFESSIONAL EXPERIENCE

Senior Hadoop Developer

Confidential, TX

Responsibilities:

  • Understood the existing environment in order to start up the ETL process from high-level documentation.
  • Used the data integration tool Pentaho to design ETL jobs in the process of building data warehouses and data marts.
  • Participated in performance tuning at the database, transformation and job levels.
  • Created a design in which jobs and transformations load data sequentially and in parallel for initial and incremental loads.
  • Used Pentaho Data Integration Designer to create ETL transformations.
  • Developed ETL transformations that sourced data from a variety of heterogeneous sources, including text and JSON files. Worked with operational cybersecurity data, which arrived in massive volumes from multiple sources.
  • Imported and exported data between databases and HDFS using ETL jobs.
  • Wrote ETL jobs to parse logs and insert them into Impala tables to facilitate effective querying of the log data.
  • Wrote MapReduce code to process and parse data from various sources and store the parsed data in tables in Parquet format (a minimal sketch follows this list).
  • Created crontab entries to run multiple jobs, which run independently based on time and data availability.
  • Developed ETL jobs and automated data management for end-to-end integration work.
  • Developed MapReduce programs for parsing data and loading it into HDFS.
  • Worked on QRadar, which collects log data from enterprise network devices, operating systems, applications, and user activities and behaviors.
  • Tested visualizations and reports for data accuracy and functionality.
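
A minimal sketch of the log-parsing MapReduce pattern described above. The log layout, field positions and job name are assumptions; the output is written as plain tab-separated text (the Parquet output format, which needs an additional library, is omitted):

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical log-parsing job: reads raw log lines, keeps well-formed records
// and re-emits them as tab-separated fields for downstream (e.g. Impala) tables.
public class LogParseJob {

    public static class ParseMapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Assumed record format: timestamp|source|severity|message
            String[] fields = value.toString().split("\\|");
            if (fields.length < 4) {
                context.getCounter("logparse", "malformed").increment(1);
                return;                                   // skip malformed lines
            }
            String outKey = fields[1];                    // key by source system
            String outValue = fields[0] + "\t" + fields[2] + "\t" + fields[3];
            context.write(new Text(outKey), new Text(outValue));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "log-parse");
        job.setJarByClass(LogParseJob.class);
        job.setMapperClass(ParseMapper.class);
        job.setNumReduceTasks(0);                         // map-only parse job
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```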

Environment: Hadoop, Spark, YARN, Hive, Pig, HBase, Oozie, Sqoop, Flume, QRadar, Cloudera 5.8, Impala, Spotfire, Eclipse, Pentaho, Oracle

Senior Hadoop Developer

Confidential, IL

Responsibilities:

  • Developed a data pipeline using Kafka, Flume, Sqoop, Pig and Spark to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Developed the Turbocow framework for filtering raw data and enriching it with business rules.
  • The framework contains different actions such as lookups, replacing nulls with zeros, and simple copies for further use.
  • Used Sqoop to bring data from Teradata into HDFS for lookups against dimensional data.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting. Developed Hive UDFs for functionality that is not available out of the box in Apache Hive.
  • Developed a data pipeline using Flume, Sqoop and Pig to extract data from weblogs and store it in HDFS. Designed and implemented various metrics that can statistically signify the success of an experiment.
  • Used IntelliJ to build the application.
  • Hands-on experience with Hadoop clusters on CDH 5.5 and 5.4.
  • Extracted customers' big data from various data sources into Hadoop HDFS; this included data from mainframes, databases and server logs.
  • Responsible for managing the entire Hive warehouse.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive and Sqoop, as well as system-specific jobs.
  • Created Hive tables, loaded data and wrote Hive queries that run internally as MapReduce jobs.
  • Used Oozie operational services for batch processing and dynamic scheduling of workflows.
  • Ingested streaming data from a Kafka cluster in JSON format, and imported Teradata tables periodically as batch jobs.
  • Maintained Flume agents that ingest the ALS event data stream from Kafka into HDFS in compressed form for batch processing with Spark, and that also stream raw data to Spark Streaming (a minimal sketch follows this list).
  • Used Sqoop to periodically import tables and data from Teradata into HDFS.
  • Implemented automatic failover with ZooKeeper and the ZooKeeper failover controller.
  • Worked on Impala performance tuning with different workloads and file formats.
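
A minimal sketch of consuming a JSON event stream from Kafka with Spark Streaming and landing it in HDFS, assuming the Spark 1.x direct-stream API that shipped with that CDH generation; the broker list, topic name, batch interval and output path are placeholders:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

import kafka.serializer.StringDecoder;

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

// Hypothetical Kafka-to-HDFS ingestion sketch using the Spark 1.x direct-stream API.
public class EventStreamIngest {
    public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("event-stream-ingest");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(30));

        // Placeholder broker list and topic name.
        Map<String, String> kafkaParams = new HashMap<String, String>();
        kafkaParams.put("metadata.broker.list", "broker1:9092,broker2:9092");
        Set<String> topics = new HashSet<String>();
        topics.add("als-events");

        // Direct stream: one RDD partition per Kafka partition, no receiver.
        JavaPairInputDStream<String, String> events = KafkaUtils.createDirectStream(
                jssc, String.class, String.class,
                StringDecoder.class, StringDecoder.class,
                kafkaParams, topics);

        // Keep only the JSON payloads and write each micro-batch to HDFS
        // for later batch processing.
        JavaDStream<String> jsonValues = events.map(t -> t._2());
        jsonValues.dstream().saveAsTextFiles("hdfs:///data/raw/als_events/batch", "json");

        jssc.start();
        jssc.awaitTermination();
    }
}
```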

Environment: Hadoop, Spark, YARN, Hive, Pig, HBase, Oozie, Sqoop, Flume, Kafka, Cloudera 5.5, Impala, Tableau, Eclipse, IntelliJ

Hadoop Developer

Confidential, WI

Responsibilities:

  • Developed data pipelines using Flume, Sqoop, Pig and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Involved in writing MapReduce jobs.
  • Ingested data using Sqoop and HDFS put/copyFromLocal.
  • Used Pig to do transformations, event joins, filtering of bot traffic and some pre-aggregations before storing the data in HDFS.
  • Developed Pig UDFs for functionality that is not available out of the box in Apache Pig (a minimal sketch follows this list).
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Developed Hive DDLs to create, alter and drop Hive tables.
  • Developed Hive UDFs for functionality that is not available out of the box in Apache Hive.
  • Used HCatalog to access Hive table metadata from MapReduce and Pig code.
  • Computed various metrics using Java MapReduce that define user experience, revenue, etc.
  • Developed a data pipeline using Flume, Sqoop and Pig to extract data from weblogs and store it in HDFS. Designed and implemented various metrics that can statistically signify the success of an experiment.
  • Used Eclipse and Ant to build the application.
  • Used Sqoop for importing and exporting data into HDFS and Hive.
  • Processed ingested raw data using MapReduce, Apache Pig and Hive.
  • Developed Pig scripts for change data capture and delta record processing between newly arrived data and data already existing in HDFS.
  • Pivoted HDFS data from rows to columns and columns to rows.
  • Exported processed data from Hadoop to relational databases or external file systems using Sqoop and HDFS get/copyToLocal.
  • Developed shell scripts to orchestrate the execution of the other scripts (Pig, Hive, MapReduce) and to move data files within and outside of HDFS.
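
A minimal sketch of a custom Pig EvalFunc of the sort mentioned above; the function name and field semantics are hypothetical:

```java
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical Pig UDF: normalizes a free-text field to upper case with trimmed
// whitespace so that downstream joins and group-bys match consistently.
public class NormalizeText extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;                       // pass NULLs through unchanged
        }
        return input.get(0).toString().trim().toUpperCase();
    }
}
```

In a Pig script it would be used along the lines of `REGISTER udfs.jar;` followed by `clean = FOREACH logs GENERATE NormalizeText(channel);` (identifiers hypothetical).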

Environment: Hadoop, MapReduce, YARN, Hive, Pig, HBase, Oozie, Sqoop, Flume, Oracle 11g, Core Java, Cloudera, HDFS, Eclipse

Java Developer

Confidential

Responsibilities:

  • Responsible for gathering and analyzing requirements and converting them into technical specifications.
  • Used Rational Rose for creating sequence and class diagrams.
  • Developed presentation layer using JSP, Java, HTML and JavaScript.
  • Participated in the design and development of database schema and Entity-Relationship diagrams of the backend Oracle database tables for the application.
  • Designed and developed stored procedures and triggers in Oracle to cater to the needs of the entire application. Developed complex SQL queries for extracting data from the database.
  • Designed and built SOAP web service interfaces implemented in Java (a minimal sketch follows this list).
  • Used Apache Ant for the build process.
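
A minimal sketch of a SOAP web service interface of the kind described above, using JAX-WS annotations; the service name, operation and return values are hypothetical:

```java
import javax.jws.WebMethod;
import javax.jws.WebService;
import javax.xml.ws.Endpoint;

// Hypothetical SOAP endpoint exposing a single lookup operation.
@WebService(serviceName = "AccountLookupService")
public class AccountLookupService {

    @WebMethod
    public String getAccountStatus(String accountId) {
        // In a real application this would delegate to the Oracle-backed service
        // layer; here it returns a fixed value purely for illustration.
        return accountId == null ? "UNKNOWN" : "ACTIVE";
    }

    public static void main(String[] args) {
        // Publish on a local address for quick testing; in production the service
        // would instead be deployed to the application server (e.g. WebLogic).
        Endpoint.publish("http://localhost:8080/accountLookup", new AccountLookupService());
    }
}
```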

Environment: Java, JDK 1.5, Servlets, Ajax, Oracle 10g, Eclipse, Apache Ant, Web Services (SOAP), Apache Axis, WebLogic Server, JavaScript, HTML, CSS, XML.
