
Sr. Hadoop Developer Resume


Warsaw, IN

SUMMARY

  • 8+ years of experience in the design and deployment of enterprise applications, web applications, client-server technologies, and web programming using Java and Big Data technologies.
  • 3+ years of Hadoop experience in the design and development of Big Data applications involving Apache Hadoop MapReduce, HDFS, Hive, HBase, Pig, Oozie, Scala, Sqoop, Kafka, Flume, Tez, and Spark.
  • Expertise in developing solutions around NoSQL databases such as HBase and Cassandra.
  • Experience with all major Hadoop distributions, including Cloudera, Hortonworks, and MapR.
  • Excellent understanding of Hadoop architecture, including MapReduce MRv1 and MRv2 (YARN).
  • Developed multiple MapReduce programs to process large volumes of semi-structured and unstructured data files using different MapReduce design patterns.
  • Good knowledge of Amazon AWS services such as EMR and EC2, which provide fast and efficient processing of Big Data, and of machine learning concepts.
  • Strong experience in writing MapReduce jobs in Java and Pig.
  • Experience with performance optimizations such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins when writing MapReduce jobs (see the sketch following this summary).
  • Excellent understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
  • Experience in writing and testing Map-Reduce pipelines using Apache Crunch.
  • Worked extensively with semi-structured data (fixed-length and delimited files) for data sanitization, report generation, and standardization.
  • End-to-end experience in designing and deploying data visualizations using Tableau. Extensive experience in data analysis.
  • Hands-on experience writing MapReduce jobs in Java and queries in Hive, Impala, and Pig Latin.
  • Excellent hands-on experience analyzing data using Pig Latin, HiveQL, HBase, and MapReduce programs in Java.
  • Developed UDFs in Java as needed for use with Pig and Hive queries.
  • Worked with ZooKeeper and Oozie operational services for coordinating clusters and scheduling workflows.
  • Proficient with Big Data ingestion tools such as Flume and Sqoop.
  • Experience in importing and exporting data between HDFS and relational database management systems using Sqoop.
  • Experience in handling continuous streaming data using Flume and memory channels.
  • Good experience in benchmarking Hadoop clusters.
  • Good knowledge of data analysis with R.
  • Good knowledge of executing Spark SQL queries against data in Hive.
  • Experienced in monitoring Hadoop clusters using Cloudera Manager and the web UI.
  • Designed and deployed a big data analytics data services platform (Hadoop, Storm, Kafka, etc.).
  • Developed core modules in large cross-platform applications using Java, J2EE, Hibernate, JAX-WS web services, JMS, and EJB.
  • Extensive experience with web technologies such as HTML, CSS, XML, JSON, and jQuery.
  • Experienced with build tools (Maven, ANT) and continuous integration tools such as Jenkins.
  • Extensive experience in documenting requirements, functional specifications, and technical specifications.
  • Extensive experience with SQL, PL/SQL, and database concepts.
  • Experience with version control tools such as SVN and Git (GitHub), JIRA for issue tracking, and Crucible for code reviews.
  • Strong database background with Oracle, PL/SQL, stored procedures, triggers, SQL Server, MySQL, and DB2.
  • Strong problem-solving and analytical skills, with the ability to make balanced and independent decisions.
  • Good team player with strong interpersonal, organizational, and communication skills, combined with self-motivation, initiative, and project management attributes.
  • Strong ability to handle multiple priorities and workloads, and to understand and adapt to new technologies and environments quickly.
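
The map-side join mentioned above can be illustrated with a minimal Java MapReduce sketch. It assumes a small, hypothetical lookup file (customers.txt, with customerId,customerName lines) shipped to every mapper through the distributed cache; the field layout and class names are invented for the example.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Map-side join: a small dimension file is shipped to every mapper via the
// distributed cache and held in memory, so no shuffle or reduce phase is needed.
public class MapSideJoinMapper extends Mapper<LongWritable, Text, Text, Text> {

    private final Map<String, String> lookup = new HashMap<>();

    @Override
    protected void setup(Context context) throws IOException {
        // Files registered with job.addCacheFile() appear as symlinks in the
        // task's working directory; "customers.txt" is a hypothetical file of
        // customerId,customerName lines.
        try (BufferedReader reader = new BufferedReader(new FileReader("customers.txt"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] parts = line.split(",", 2);
                lookup.put(parts[0], parts[1]);
            }
        }
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Each input record is assumed to be "customerId,orderAmount".
        String[] fields = value.toString().split(",");
        String name = lookup.getOrDefault(fields[0], "UNKNOWN");
        context.write(new Text(name), new Text(fields[1]));
    }
}
```

In the driver, the lookup file would be registered with something like job.addCacheFile(new URI("/lookup/customers.txt#customers.txt")) and the reduce phase disabled with job.setNumReduceTasks(0), since the join completes entirely on the map side.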

TECHNICAL SKILLS

Hadoop Core Services: HDFS, MapReduce, Spark, YARN

Hadoop Distribution: Hortonworks, Cloudera, Apache

NoSQL Databases: HBase, Cassandra

Hadoop Data Services: Hive, Pig, Impala, Sqoop, Flume, Kafka

Hadoop Operational Services: ZooKeeper, Oozie

Monitoring Tools: Ganglia, Cloudera Manager

Cloud Computing Tools: Amazon AWS

Languages: C, Java, Scala, Python, SQL, PL/SQL, Pig Latin, HiveQL, JavaScript, Unix Shell Scripting

Java & J2EE Technologies: Core Java, Servlets, Hibernate, Spring, Struts, JMS, EJB

Application Servers: WebLogic, WebSphere, JBoss, Tomcat

Databases: Oracle, MySQL, PostgreSQL, Teradata

Operating Systems: UNIX, Windows, Linux

Build Tools: Jenkins, Maven, ANT

Development Tools: Microsoft SQL Server Management Studio, Toad, Eclipse, NetBeans

Development Methodologies: Agile/Scrum, Waterfall

PROFESSIONAL EXPERIENCE

Confidential, Warsaw, IN

Sr. Hadoop developer

Responsibilities:

  • Developed simple and complex MapReduce programs in Java for data analysis on different data formats.
  • Developed MapReduce programs to filter out bad and unnecessary records and identify unique records based on different criteria.
  • Implemented secondary sorting to deliver values to the reducer in sorted order and improve MapReduce performance (see the sketch after this list).
  • Implemented custom data types, InputFormats, RecordReaders, OutputFormats, and RecordWriters for MapReduce computations to handle custom business requirements.
  • Implemented MapReduce programs to classify data into different categories based on record type.
  • Worked with Sequence files, RC files, map-side joins, bucketing, and partitioning for Hive performance enhancement and storage improvement.
  • Implemented daily cron jobs that automate parallel tasks of loading data into HDFS and pre-processing it with Pig, using Oozie coordinator jobs.
  • Responsible for performing extensive data validation using Hive.
  • Worked with Sqoop import and export functionality to handle large data set transfers between the Oracle database and HDFS.
  • Worked on tuning Hive and Pig scripts to improve performance.
  • Involved in submitting and tracking MapReduce jobs using the JobTracker.
  • Involved in creating Oozie workflow and coordinator jobs to kick off jobs on time and on data availability.
  • Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and other sources.
  • Involved in loading pre-generated HFiles into HBase for faster access to a large customer base without taking a performance hit.
  • Implemented Hive generic UDFs to implement business logic.
  • Coordinated with end users on the design and implementation of analytics solutions for user-based recommendations using R, as per project proposals.
  • Worked on a research team that developed Scala, a programming language with full Java interoperability and a strong type system.
  • Improved the stability and performance of the Scala plug-in for Eclipse, using product feedback from customers and internal users.
  • Redesigned and implemented the Scala REPL (read-evaluate-print loop) to integrate tightly with other IDE features in Eclipse.
  • Assisted in monitoring the Hadoop cluster using Ganglia.
  • Knowledge of handling Hive queries using Spark SQL integrated with the Spark environment.
  • Implemented test scripts to support test-driven development and continuous integration.
  • Used the JUnit framework to perform unit and integration testing.
  • Configured build scripts for multi-module projects with Maven and Jenkins CI.
  • Involved in story-driven Agile development methodology and actively participated in daily Scrum meetings.
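
As referenced above, secondary sorting hinges on partitioning and grouping by the natural key while the shuffle sorts on a composite key. A minimal sketch, assuming for simplicity a Text composite key laid out as "naturalKey\tsortField" (a custom WritableComparable is more typical in practice):

```java
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;
import org.apache.hadoop.mapreduce.Partitioner;

// Partition on the natural key only, so every record for a key lands on the
// same reducer no matter what the secondary (sort) field is.
public class NaturalKeyPartitioner extends Partitioner<Text, Text> {
    @Override
    public int getPartition(Text compositeKey, Text value, int numPartitions) {
        String naturalKey = compositeKey.toString().split("\t")[0];
        return (naturalKey.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}

// Group on the natural key only, so a single reduce() call sees all values
// for the key while the shuffle has already sorted them by the sort field.
class NaturalKeyGroupingComparator extends WritableComparator {
    protected NaturalKeyGroupingComparator() {
        super(Text.class, true);
    }

    @Override
    public int compare(WritableComparable a, WritableComparable b) {
        String keyA = a.toString().split("\t")[0];
        String keyB = b.toString().split("\t")[0];
        return keyA.compareTo(keyB);
    }
}
```

The driver would wire these in with job.setPartitionerClass(NaturalKeyPartitioner.class) and job.setGroupingComparatorClass(NaturalKeyGroupingComparator.class).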

Environment: Hadoop, CDH4, MapReduce, HDFS, Pig, Hive, Impala, Oozie, Java, Kafka, Linux, Scala, Maven, JavaScript, Oracle 11g/10g, SVN, Ganglia

Confidential, Columbus, OH

Hadoop Developer

Responsibilities:

  • Installed, configured, and maintained Apache Hadoop clusters for application development and major components of the Hadoop ecosystem: Hive, Pig, HBase, Sqoop, Flume, Oozie, and ZooKeeper.
  • Implemented a six-node CDH4 Hadoop cluster on CentOS.
  • Imported and exported data between HDFS/Hive and different RDBMSs using Sqoop.
  • Experienced in defining job flows to run multiple MapReduce and Pig jobs using Oozie.
  • Imported log files into HDFS using Flume and loaded them into Hive tables to query the data.
  • Monitored running MapReduce programs on the cluster.
  • Responsible for loading data from UNIX file systems to HDFS.
  • Used HBase-Hive integration and wrote multiple Hive UDFs for complex queries.
  • Involved in writing APIs to read HBase tables, cleanse data, and write to another HBase table.
  • Created multiple Hive tables and implemented partitioning, dynamic partitioning, and buckets in Hive for efficient data access.
  • Wrote multiple MapReduce programs in Java for data extraction, transformation, and aggregation from multiple file formats, including XML, JSON, CSV, and other compressed file formats.
  • Experienced in running batch processes using Pig scripts and developed Pig UDFs for data manipulation according to business requirements.
  • Experienced in writing programs using the HBase client API.
  • Involved in loading data into HBase using the HBase shell, the HBase client API, Pig, and Sqoop.
  • Experienced in the design, development, tuning, and maintenance of NoSQL databases.
  • Wrote MapReduce programs in Python using the Hadoop Streaming API.
  • Developed unit test cases for Hadoop MapReduce jobs with MRUnit (see the sketch after this list).
  • Excellent experience in ETL analysis, design, development, testing, and implementation of ETL processes, including performance tuning and query optimization of databases.
  • Continuously monitored and managed the Hadoop cluster using Cloudera Manager and the web UI.
  • Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
  • Used Maven as the build tool and SVN for code management.
  • Worked on writing RESTful web services for the application.
  • Implemented testing scripts to support test-driven development and continuous integration.
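
The MRUnit testing mentioned above can be sketched as follows, using a self-contained, hypothetical word-count mapper and reducer as stand-ins for the real extraction and transformation jobs under test.

```java
import java.io.IOException;
import java.util.Arrays;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.apache.hadoop.mrunit.mapreduce.ReduceDriver;
import org.junit.Before;
import org.junit.Test;

// Hypothetical word-count mapper standing in for the real job under test.
class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        for (String token : value.toString().split("\\s+")) {
            context.write(new Text(token), ONE);
        }
    }
}

class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        context.write(key, new IntWritable(sum));
    }
}

// MRUnit drivers run a single mapper or reducer in-process, with no cluster,
// so the record-level logic can be verified like any other unit of code.
public class WordCountMapReduceTest {

    private MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;
    private ReduceDriver<Text, IntWritable, Text, IntWritable> reduceDriver;

    @Before
    public void setUp() {
        mapDriver = MapDriver.newMapDriver(new WordCountMapper());
        reduceDriver = ReduceDriver.newReduceDriver(new WordCountReducer());
    }

    @Test
    public void mapperEmitsOneCountPerToken() throws IOException {
        mapDriver.withInput(new LongWritable(0), new Text("hadoop hadoop"))
                 .withOutput(new Text("hadoop"), new IntWritable(1))
                 .withOutput(new Text("hadoop"), new IntWritable(1))
                 .runTest();
    }

    @Test
    public void reducerSumsCounts() throws IOException {
        reduceDriver.withInput(new Text("hadoop"),
                               Arrays.asList(new IntWritable(1), new IntWritable(1)))
                    .withOutput(new Text("hadoop"), new IntWritable(2))
                    .runTest();
    }
}
```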

Environment: Hadoop, MapReduce, HDFS, HBase, Hive, Impala, Pig, Java, SQL, Ganglia, Sqoop, Flume, Oozie, Unix, JavaScript, Maven, Eclipse

Confidential, Springfield, IL

Java / Hadoop Developer

Responsibilities:

  • Imported data from different relational data sources, such as RDBMS and Teradata, to HDFS using Sqoop.
  • Worked on writing transformer/mapping Map-Reduce pipelines using Apache Crunch and Java (see the sketch after this list).
  • Imported bulk data into Cassandra using the Thrift API.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke and run MapReduce jobs in the backend.
  • Performed analytics on time-series data in Cassandra using the Java API.
  • Designed and implemented incremental imports into Hive tables.
  • Worked on loading and transforming large sets of structured, semi-structured, and unstructured data.
  • Involved in collecting, aggregating, and moving data from servers to HDFS using Apache Flume.
  • Wrote Hive jobs to parse logs and structure them in tabular format to facilitate effective querying of the log data.
  • Experienced in managing and reviewing the Hadoop log files.
  • Migrated ETL jobs to Pig scripts to perform transformations, joins, and some pre-aggregations before storing the data in HDFS.
  • Implemented workflows using the Apache Oozie framework to automate tasks.
  • Worked with the Avro data serialization system to handle JSON data formats.
  • Worked on different file formats, such as Sequence files, XML files, and Map files, using MapReduce programs.
  • Involved in unit testing and delivered unit test plans and results documents using JUnit and MRUnit.
  • Exported data from the HDFS environment into RDBMSs using Sqoop for report generation and visualization purposes.
  • Developed scripts and automated end-to-end data management and sync-up between all the clusters.
  • Created and maintained technical documentation for launching Hadoop clusters and executing Pig scripts.
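
A minimal sketch of a transformer/mapping pipeline in Apache Crunch, as referenced above. The input path, the tab-delimited field layout, and the aggregation are hypothetical; the point is how parallelDo, groupByKey, and combineValues chain into MapReduce jobs.

```java
import org.apache.crunch.DoFn;
import org.apache.crunch.Emitter;
import org.apache.crunch.PCollection;
import org.apache.crunch.PTable;
import org.apache.crunch.Pair;
import org.apache.crunch.Pipeline;
import org.apache.crunch.fn.Aggregators;
import org.apache.crunch.impl.mr.MRPipeline;
import org.apache.crunch.types.writable.Writables;

// A Crunch pipeline compiles a chain of transformations into one or more
// MapReduce jobs when done() is called.
public class LogTransformPipeline {

    public static void main(String[] args) {
        Pipeline pipeline = new MRPipeline(LogTransformPipeline.class);

        // Read raw, tab-delimited log lines (path is hypothetical).
        PCollection<String> lines = pipeline.readTextFile("/raw/logs");

        // Transform each line into a (userId, bytes) pair, dropping malformed records.
        PTable<String, Long> bytesByUser = lines.parallelDo(
                new DoFn<String, Pair<String, Long>>() {
                    @Override
                    public void process(String line, Emitter<Pair<String, Long>> emitter) {
                        String[] fields = line.split("\t");
                        if (fields.length >= 2) {
                            emitter.emit(Pair.of(fields[0], Long.parseLong(fields[1])));
                        }
                    }
                },
                Writables.tableOf(Writables.strings(), Writables.longs()))
            // Group and pre-aggregate with a combiner, like a Pig GROUP ... SUM.
            .groupByKey()
            .combineValues(Aggregators.SUM_LONGS());

        pipeline.writeTextFile(bytesByUser, "/curated/bytes-by-user");
        pipeline.done();
    }
}
```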

Environment: Hadoop, HDFS, Hortonworks (HDP 2.1), MapReduce, Hive, Oozie, Sqoop, Pig, MySQL, Java, REST API, Maven, MRUnit, JUnit.

Confidential, Jacksonville, FL

Sr. Java Developer

Responsibilities:

  • Designed, developed, maintained, tested, and troubleshot Java and PL/SQL programs in support of payroll employees.
  • Developed documentation for new and existing programs and designed specific enhancements to applications.
  • Implemented the web layer using JSF and ICEfaces.
  • Implemented the business layer using Spring MVC.
  • Implemented report retrieval based on start date using HQL.
  • Implemented session management using the SessionFactory in Hibernate (see the DAO sketch after this list).
  • Developed the DOs and DAOs using Hibernate.
  • Implemented a SOAP web service to validate zip codes using Apache Axis.
  • Wrote complex queries, PL/SQL stored procedures, functions, and packages to implement business rules.
  • Wrote a PL/SQL program to send email to a group from the backend.
  • Developed scripts triggered monthly to produce current monthly analysis.
  • Scheduled jobs to be triggered on a specific day and time.
  • Modified SQL statements to increase overall performance as part of basic performance tuning and exception handling.
  • Used cursors, arrays, tables, and BULK COLLECT concepts.
  • Extensively used Log4j for logging.
  • Performed unit testing in all the environments.
  • Used Subversion as the version control system.
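
The Hibernate session management and DAO work mentioned above might look like the following sketch. The Report entity and the HQL query are hypothetical stand-ins for the actual payroll reporting objects, and the classic Configuration-based SessionFactory bootstrap (Hibernate 3.x style) is assumed.

```java
import java.util.Date;
import java.util.List;

import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;
import org.hibernate.cfg.Configuration;

// Hypothetical mapped entity (mapping supplied via Report.hbm.xml or annotations).
class Report {
    private Long id;
    private Date startDate;

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public Date getStartDate() { return startDate; }
    public void setStartDate(Date startDate) { this.startDate = startDate; }
}

// DAO built around a single shared SessionFactory, bootstrapped once from
// hibernate.cfg.xml.
public class ReportDao {

    private static final SessionFactory SESSION_FACTORY =
            new Configuration().configure().buildSessionFactory();

    // Mirrors the "reports by start date" requirement, expressed in HQL.
    @SuppressWarnings("unchecked")
    public List<Report> findReportsByStartDate(Date startDate) {
        Session session = SESSION_FACTORY.openSession();
        try {
            return session.createQuery("from Report r where r.startDate >= :startDate")
                          .setParameter("startDate", startDate)
                          .list();
        } finally {
            session.close();
        }
    }

    public void save(Report report) {
        Session session = SESSION_FACTORY.openSession();
        Transaction tx = session.beginTransaction();
        try {
            session.save(report);
            tx.commit();
        } catch (RuntimeException e) {
            tx.rollback();
            throw e;
        } finally {
            session.close();
        }
    }
}
```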

Confidential

Java/J2EE developer

Responsibilities:

  • Involved in all phases of the project life cycle, from requirements gathering to quality assurance testing.
  • Developed class diagrams and sequence diagrams using Rational Rose.
  • Responsible for developing rich web interface modules with Struts tags, JSP, JSTL, CSS, JavaScript, Ajax, and GWT.
  • Developed the presentation layer using the Struts framework and performed validations using the Struts Validator plugin.
  • Created SQL scripts for the Oracle database.
  • Implemented the business logic using Spring transactions and Spring AOP.
  • Implemented the persistence layer using Spring JDBC to store and update data in the database.
  • Produced a web service using the WSDL/SOAP standard.
  • Implemented J2EE design patterns, such as the Singleton pattern with the Factory pattern.
  • Extensively involved in the creation of session beans and MDBs using EJB 3.0.
  • Used the Hibernate framework for the persistence layer.
  • Extensively involved in writing stored procedures for data retrieval, storage, and updates in the Oracle database using Hibernate.
  • Deployed and built the application using Maven.
  • Performed testing using JUnit.
  • Used JIRA to track bugs.
  • Extensively used Log4j for logging throughout the application.
  • Produced a web service using REST with the Jersey implementation to provide customer information (see the sketch after this list).
  • Used SVN for source code versioning and as the code repository.
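
A minimal sketch of the Jersey (JAX-RS) customer-information service mentioned above; Customer and CustomerService are hypothetical stand-ins for the real application classes.

```java
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

// Hypothetical domain class and lookup service standing in for the real ones.
class Customer {
    public long id;
    public String name;

    Customer(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

class CustomerService {
    Customer findById(long id) {
        // Placeholder lookup; the real service would query the database.
        return id == 1 ? new Customer(1, "Acme Corp") : null;
    }
}

// JAX-RS resource exposing customer information as JSON over REST.
@Path("/customers")
public class CustomerResource {

    private final CustomerService customerService = new CustomerService();

    @GET
    @Path("/{id}")
    @Produces(MediaType.APPLICATION_JSON)
    public Response getCustomer(@PathParam("id") long id) {
        Customer customer = customerService.findById(id);
        if (customer == null) {
            return Response.status(Response.Status.NOT_FOUND).build();
        }
        return Response.ok(customer).build();
    }
}
```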

Environment: Java (JDK 1.5), J2EE, Eclipse, JSP, JavaScript, JSTL, Ajax, GWT, Log4j, CSS, XML, Spring, EJB, MDB, Hibernate, WebLogic, REST, Rational Rose, JUnit, Maven, JIRA, SVN.
