
Hadoop / Big Data Developer Resume


San Bruno, California

SUMMARY

  • 8 years of IT experience in architecture, analysis, design, development, implementation, maintenance, and support, with experience in developing strategic methods for deploying big data technologies to efficiently solve Big Data processing requirements.
  • Over 5 years of experience in Hadoop ecosystem implementation and in the development of Big Data applications.
  • Excellent understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
  • Experience in analyzing data using HiveQL, Pig Latin, HBase and custom MapReduce programs in Java and Python.
  • Profound knowledge of Java, Python, and Shell scripting, and of scheduling automated tasks.
  • Hands-on experience working with HBase, the Hadoop database.
  • Experience in developing Storm topologies for real-time computation.
  • Working knowledge of Spark features including Core Spark, Spark SQL, Spark Streaming.
  • Experienced in Core Java and object-oriented design, with a strong understanding of Collections, Multithreading, and Exception handling.
  • Knowledge of ETL methods for data extraction, transformation, and loading in corporate-wide ETL solutions, and of data warehouse tools for reporting and data analysis.
  • Excellent interpersonal and communication skills, creative, research-minded, technically competent and result-oriented with problem solving and leadership skills

TECHNICAL SKILLS

Big Data Skills: MapReduce, Hive, Pig, HBase, Cassandra, Storm, Spark, Kafka, Redis, Flume, Sqoop

Languages: Java, Scala, PL/SQL, Python, Shell scripting

Databases: Oracle 9i/10g/11g, SQL Server, MySQL, Sybase IQ

Operating Systems: Linux, Unix, Windows

PROFESSIONAL EXPERIENCE

Confidential, San Bruno, California

Hadoop / Big Data Developer

Responsibilities:

  • Analyzed data in Hive to extract the targeted customers required for specific campaigns, based on transactions, user events such as clicks/opens, and browse information.
  • Developed Map-Reduce programs in both Python and Java for processing the extracted customer data.
  • Developed MR jobs for bulk insertion of Confidential’s customer and item data from files into Cassandra.
  • Created shell scripts to automate the daily extraction and loading into Hive tables of the targeted customers to whom various email campaigns are sent.
  • Worked with NoSQL databases such as HBase.
  • Worked on various email campaigns like Back-In-Stock, Price-Drop, Post-Browse, Customer Ratings and Reviews, Shopping Cart Abandon etc.
  • Developed and deployed Hive UDFs written in Java for encrypting customer IDs, creating item image URLs, etc. (a sketch of such a UDF follows this list).
  • Worked on StrongView tool for scheduling and monitoring Batch email campaigns.
  • Extracted StrongView logs from servers using Flume, pulled out information such as customer open/click details, and loaded it into Hive tables. Created reports with counts of email sends, opens, and clicks.
  • Created per-campaign reports of customer open/click information.
  • Wrote shell scripts to automate Hive query processing.
  • Scheduled Map-Reduce and Hive workflows using Oozie.
  • Developed HTML templates for various trigger campaigns.
  • Analyzed transaction data and extracted category-wise best-selling item information, which the marketing team used to come up with ideas for new campaigns.
  • Developed complex Hive queries using Joins and automated these jobs using Shell scripts.
  • Monitored and debugged Map-Reduce jobs using the JobTracker administration page.
  • Developed Storm topologies for real-time email campaigns, using Kafka as the source of customers' website activity information and storing the data in HBase.
  • Involved in migrating existing Hive jobs to Spark SQL environment.
  • Used the Spark Streaming API to consume data from a Kafka source, processed it with core Spark functions written in Scala, and stored the resulting data in an HBase table later used for generating reports.
  • Developed a data pipeline using Flume to ingest data from a Kafka source into HDFS as the sink, where it is used for analysis.
  • Used Pig as an ETL tool to perform transformations, event joins, and some pre-aggregations before storing the data in HDFS.
  • Used Sqoop to export data from HDFS to a MySQL database for generating reports in Tableau.
  • Developed RESTful webservices for providing metadata information required for the campaigns.
  • Working experience with Cloudera Manager.
  • Used Maven as a build tool and GitHub as a code repository.
  • Configured Jenkins for continuous integration of the application.
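
Below is a minimal sketch of a Hive UDF of the kind described above. The class name and the use of SHA-256 hashing are illustrative assumptions; the actual encryption scheme used on the project is not shown here.

    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Minimal sketch: hashes a customer ID so it can be stored/exported safely.
    // Class name and algorithm are hypothetical placeholders.
    public class EncryptCustomerId extends UDF {

        public Text evaluate(Text customerId) throws Exception {
            if (customerId == null) {
                return null;
            }
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(customerId.toString().getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return new Text(hex.toString());
        }
    }

A UDF like this would typically be packaged as a JAR, added to the session with ADD JAR, and registered with CREATE TEMPORARY FUNCTION before being used in HiveQL queries.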

Environment: MapReduce, Hive, Pig, Python, HBase, Shell scripting, Storm, Scala, Spark Streaming, Spark SQL, Redis, Oozie, Kafka, Flume, REST web services.

Confidential, Omaha, Nebraska

Hadoop developer

Responsibilities:

  • Worked on analyzing the Hadoop cluster and different Big Data analytics tools, including Pig, Hive, the HBase database, and Sqoop.
  • Coordinated with business customers to gather business requirements, interacted with other technical peers to derive technical requirements, and delivered the BRD and TDD documents.
  • Extensively involved in Design phase and delivered Design documents.
  • Involved in Testing and coordination with business in User testing.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Wrote Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying of the log data.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Experienced in defining job flows.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Worked on the document-oriented database MongoDB.
  • Experienced in managing and reviewing the Hadoop log files.
  • Used Pig as an ETL tool to perform transformations, event joins, and some pre-aggregations before storing the data in HDFS.
  • Loaded and transformed large sets of structured and semi-structured data.
  • Responsible for managing data coming from different sources.
  • Involved in creating Hive Tables, loading data and writing Hive queries.
  • Created Data model for Hive tables.
  • Involved in Unit testing and delivered Unit test plans and results documents.
  • Exported data from HDFS environment into RDBMS using Sqoop for report generation and visualization purpose.
  • Worked on Impala
  • Worked on Oozie workflow engine for job scheduling.
  • Created Hive tables and partitions, and loaded the data for analysis using HiveQL queries.
  • Loaded data into HBase using bulk load and the HBase API (a write sketch follows this list).
  • Developed Map-Reduce programs using Java to perform various transformation, cleaning, and scrubbing tasks.
  • Integrated Tableau with the Hadoop data source to build dashboards providing various insights into the organization's sales.
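
Below is a minimal sketch of a single-row write with the HBase client API, as referenced in the bulk-load item above. The table name, row key, column family, and qualifier are hypothetical placeholders rather than the project's actual schema.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    // Minimal sketch: writes one row into an HBase table using the client API.
    public class HBaseLoadExample {

        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Table table = connection.getTable(TableName.valueOf("customer_events"))) {

                Put put = new Put(Bytes.toBytes("customer#12345"));          // row key (placeholder)
                put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("event"),   // column family:qualifier
                              Bytes.toBytes("click"));
                table.put(put);
            }
        }
    }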

Environment: Hive, HBase, Map-Reduce, Pig, Sqoop, Oozie

Confidential, Houston, TX

Hadoop Developer

Responsibilities:

  • Developed various Hive scripts (HQL) for processing the data from various Hive tables.
  • Transferred data from the traditional Oracle database to the newly adopted Hadoop environment, i.e., used Sqoop to move data from Oracle to HDFS (Hadoop Distributed File System).
  • Developed custom Hive User Defined Functions
  • Involved in writing Map-Reduce programs to process several years of patient data and provide the Data Analysts with insights into the historical data.
  • Used Pig as an ETL tool to process the patient-related data present in HDFS and gather insurance claims data.
  • Used Cloudera Manager.
  • Developed Python scripts to process and move patient data from HDFS to HBase, which is later used for quick lookup of insurance-related information.
  • Wrote various shell scripts to run the Hive queries, scheduled to run on a daily basis.
  • Developed various pipelines to process and move data from Oracle to HDFS and vice versa.
  • Designed and implemented MapReduce jobs to support distributed processing using Java and Python (a mapper sketch follows this list).
  • Developed and maintained several batch jobs that run automatically depending on business requirements.
  • Performed unit testing and internal deployments for all the deliverables to the Business
  • Wrote Oozie workflows and scheduling for job automation.
  • Transformed raw data from several data sources into baseline data by developing Pig scripts and loaded the data into HBase tables.
  • Involved in Configuring Hadoop components like Hive, Pig, HBase, Sqoop, Oozie in the client environment
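
Below is a minimal sketch of the kind of Java mapper used for the distributed-processing jobs mentioned above. The record delimiter and field layout are hypothetical; the actual patient record format is not reproduced here.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Minimal sketch: a map-only cleaning/scrubbing step over pipe-delimited records.
    public class PatientRecordCleanerMapper
            extends Mapper<LongWritable, Text, Text, Text> {

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\|");
            if (fields.length < 3 || fields[0].trim().isEmpty()) {
                return; // drop malformed records
            }
            String patientId = fields[0].trim();
            // normalise whitespace and casing in the remaining fields
            StringBuilder cleaned = new StringBuilder();
            for (int i = 1; i < fields.length; i++) {
                if (i > 1) {
                    cleaned.append('|');
                }
                cleaned.append(fields[i].trim().toLowerCase());
            }
            context.write(new Text(patientId), new Text(cleaned.toString()));
        }
    }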

Environment: Java, Python, Shell scripting, Map-Reduce, Hive, Pig, Sqoop, HBase, Oozie

Confidential, Buffalo, New York

Hadoop Developer

Responsibilities:

  • Configured and managed Hadoop components such as Pig, Hive, and Sqoop.
  • Involved in loading data from Linux file system to HDFS.
  • Used Flume to load unstructured and semi-structured data from various sources, such as websites and streaming data, into the cluster.
  • Experience scheduling Hive, Pig, and Sqoop scripts through cron jobs.
  • Implemented UDFs to provide custom Pig and Hive capabilities.
  • Worked on designing NoSQL schemas on HBase.
  • Performed Filesystem management and monitoring on Hadoop log files.
  • Utilized Oozie workflow to run Pig and Hive jobs
  • Developed customized classes for serialization and deserialization in Hadoop (a Writable sketch follows this list).
  • Performed optimization of MapReduce for effective usage of HDFS by compression techniques.
  • Developed shell scripts to automate and provide control flow to Pig scripts.
  • Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
  • Responsible for managing data coming from different sources.
  • Responsible for generating summary reports using Hive, and used Sqoop to export these results into RDBMS.
  • Worked on data serialization formats for converting complex objects into byte sequences using Avro, Parquet, JSON, and CSV formats.
  • Involved in migration of data from existing RDBMSs (Oracle and SQL Server) to Hadoop using Sqoop for data processing.
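
Below is a minimal sketch of a custom Writable of the kind described in the serialization item above. The class and field names are hypothetical placeholders.

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.hadoop.io.Writable;

    // Minimal sketch: custom serialization/deserialization for a Hadoop record.
    public class EventRecordWritable implements Writable {

        private long timestamp;
        private String source;

        public EventRecordWritable() { }   // no-arg constructor required by the framework

        public EventRecordWritable(long timestamp, String source) {
            this.timestamp = timestamp;
            this.source = source;
        }

        @Override
        public void write(DataOutput out) throws IOException {    // serialization
            out.writeLong(timestamp);
            out.writeUTF(source);
        }

        @Override
        public void readFields(DataInput in) throws IOException { // deserialization
            timestamp = in.readLong();
            source = in.readUTF();
        }
    }

Hadoop instantiates Writable classes reflectively during deserialization, which is why the no-argument constructor is required.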

Environment: Hadoop, Hive, Pig, Map-Reduce, Sqoop, Flume, Oracle

Confidential

Java Programmer

Responsibilities:

  • The application involved tracking invoices, raw materials and finished products.
  • Gathered user requirements and specifications.
  • Developed the entire application on Eclipse IDE.
  • Developed and programmed the required classes in Java to support the User account module.
  • Used HTML, JSP and JavaScript for designing the front end user interface.
  • Implemented error checking/validation on the Java Server Pages using JavaScript.
  • Developed servlets to handle requests, perform server-side validation, and generate results for the user (a sketch follows this list).
  • Used JDBC interface to connect to database.
  • Used SQL to access data from Microsoft SQL Server database.
  • Performed User Acceptance Test.
  • Deployed and tested the web application on WebLogic application server.
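
Below is a minimal sketch of a servlet performing a JDBC lookup, in the style of the work described above. The JDBC URL, credentials, table, and column names are hypothetical placeholders; driver registration and connection pooling are omitted.

    import java.io.IOException;
    import java.io.PrintWriter;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Minimal sketch: looks up an invoice in SQL Server over JDBC and renders the result.
    public class InvoiceLookupServlet extends HttpServlet {

        private static final String JDBC_URL =
                "jdbc:sqlserver://localhost:1433;databaseName=inventory"; // placeholder URL

        @Override
        protected void doGet(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
            String invoiceId = request.getParameter("invoiceId");
            response.setContentType("text/html");
            PrintWriter out = response.getWriter();
            try (Connection conn = DriverManager.getConnection(JDBC_URL, "user", "password");
                 PreparedStatement stmt = conn.prepareStatement(
                         "SELECT status FROM invoices WHERE invoice_id = ?")) {
                stmt.setString(1, invoiceId);
                try (ResultSet rs = stmt.executeQuery()) {
                    if (rs.next()) {
                        out.println("<p>Invoice " + invoiceId + ": " + rs.getString("status") + "</p>");
                    } else {
                        out.println("<p>Invoice not found.</p>");
                    }
                }
            } catch (Exception e) {
                throw new ServletException(e);
            }
        }
    }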

Environment: Java, Servlets, JSP, JavaScript, HTML, JDBC, Microsoft SQL Server
