
Hadoop Developer Resume


Long Island, NY

SUMMARY:

  • 8+ years of end-to-end IT experience in the Retail, Insurance and Healthcare industries, performing roles of Hadoop Developer, Data Warehousing and Java Development, including extensive use of Big Data tools like Hadoop, Hive, Pig, Sqoop, Kafka, Flume, Spark and MapReduce programming.
  • Working experience in the MapReduce programming model and Hadoop Distributed File System (HDFS).
  • Performed architecture design, data modelling and implementation of Big Data platforms.
  • Hands-on experience in major components of the Hadoop ecosystem like Flume, HBase, Zookeeper, Oozie, Hive, Sqoop, Pig, Apache Falcon and YARN (MR2).
  • Maintained and optimized AWS infrastructure (EMR, EC2, S3, EBS/Provisioned IOPS, AMI, RDS and IAM roles for users/systems).
  • Developed scripts and numerous batch process jobs to schedule various Hadoop programs.
  • Experience with the Amazon, Cloudera and Hortonworks Hadoop distributions.
  • Worked on importing and exporting data from different databases like Oracle, MySQL, and SQL Server into HDFS using Sqoop.
  • Strong experience in collecting and storing streaming data, such as log data and Twitter data, into HDFS using Apache Flume.
  • Knowledge of Talend Big Data integration to meet business demands for Hadoop and NoSQL.
  • Real-time data ingestion using the Big Data stack of technologies (Spark Streaming).
  • Used Spark Streaming to consume topics from the distributed messaging source Kafka and periodically push batches of data to Spark for real-time processing (see the sketch after this summary).
  • Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
  • Experienced in working with MapReduce design patterns to solve complex MapReduce problems.
  • Built AWS secured solutions by creating VPCs with private and public subnets.
  • Involved in creating tables, partitioning, bucketing and creating UDFs in Hive.
  • Implemented join operations and involved in writing data transformations using Pig Latin.
  • Extensive knowledge of NoSQL databases like HBase and Cassandra.
  • Experienced in performing CRUD operations using the HBase Java client API and REST API.
  • Good knowledge of the Oozie workflow engine for automating and parallelizing Hadoop MR, Hive and Pig jobs.
  • Excellent team player with multi-tasking ability, detail oriented, quick learner, self-motivated and able to perform under pressure in a rapidly changing environment.
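
Illustrative sketch for the Kafka-to-Spark ingestion bullet above. This is a minimal example only: the broker address, topic name, consumer group and output path are hypothetical, it assumes the spark-streaming-kafka-0-10 integration, and it is written in Java for consistency with the other sketches here (the original Spark work is described as Scala).

```java
import java.util.*;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.*;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class KafkaToHdfsStream {
    public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("kafka-to-hdfs"); // master set by spark-submit
        // Micro-batches every 30 seconds; the real interval is an assumption.
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(30));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092");     // hypothetical broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "clickstream-consumer");      // hypothetical group id

        // Subscribe to a hypothetical clickstream topic.
        JavaInputDStream<ConsumerRecord<String, String>> stream = KafkaUtils.createDirectStream(
                jssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(
                        Collections.singletonList("clickstream"), kafkaParams));

        // Push each micro-batch of raw messages to HDFS for downstream processing.
        stream.map(ConsumerRecord::value)
              .foreachRDD((rdd, time) ->
                      rdd.saveAsTextFile("hdfs:///data/clickstream/batch-" + time.milliseconds()));

        jssc.start();
        jssc.awaitTermination();
    }
}
```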

TECHNICAL SKILLS:

  • Big Data Platforms: Cloudera, Hadoop, YARN, MapReduce, Pig, Hive, Storm, Kafka, Oozie, Impala, Ignite, Flume, Kinesis and Spark
  • Languages: Java, C++, Python
  • Databases: Oracle, MySQL, SQL Server
  • NoSQL Databases: HBase, Cassandra, MongoDB, Accumulo
  • Job Scheduling Frameworks: AutoSys, Quartz Scheduler
  • Operating Systems: Linux, UNIX, Windows 7, Windows 8, Windows Vista, Windows XP
  • Hadoop Distributions: Cloudera, Hortonworks, AWS
  • Web Technologies: HTML, XHTML, JavaScript
  • Data Modelling Tools: MS Visio, Rational Rose
  • Work Environments: Eclipse

PROFESSIONAL EXPERIENCE:

Hadoop Developer

Confidential, Long Island, NY

Responsibilities:

  • Involved in requirement analysis, design, coding and implementation.
  • Designed the ETL data pipeline flow to ingest data from RDBMS sources into Hadoop using shell scripts, Sqoop and MySQL.
  • Wrote Hive queries to have a consolidated view of the mortgage and retail data.
  • Integrated Apache Storm with Kafka to perform web analytics and to move clickstream data from Kafka to HDFS.
  • Loaded data back to Teradata for BASEL reporting and for business users to analyze and visualize the data using Datameer.
  • Implemented using Cloudera (CDH 4.5) distribution.
  • Used Cloudera Manager to monitor the Hadoop ecosystem.
  • Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
  • Analyzed current data sources and schemas based on the use-case documentation provided, and developed programs and scripts to complete data ingestion into the Hadoop cluster.
  • Responsible for managing data coming from different sources.
  • Involved in designing the row key in HBase to store text and JSON as values in HBase tables, designing the row key in such a way that rows can be retrieved/scanned in sorted order (see the sketch at the end of this section).
  • Supported MapReduce programs running on the cluster.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
  • These scripts were written to distribute queries for performance-test jobs in the Amazon data lake.
  • Involved in Hadoop cluster tasks such as adding and removing nodes without affecting running jobs or data.
  • Worked on MongoDB using CRUD (Create, Read, Update and Delete), indexing, replication and sharding features.
  • Orchestrated hundreds of Sqoop scripts, Pig scripts, Hive queries using Oozie workflow and sub-workflows.
  • Loaded files from mainframes into Hadoop, converting them to ASCII format.
  • Developed Pig Latin scripts to replace the existing home-loans legacy process on Hadoop, with the data fed back to the retail legacy mainframe systems.

Environment: Hadoop Distributed File System (HDFS), Spark, MapReduce, Hive, Pig, Sqoop, Kafka, SOAP, web services, JUnit, Maven and Oozie.
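
Illustrative sketch of the HBase row-key design mentioned above (text and JSON values stored under a composite key so rows scan back in sorted order), using the HBase Java client API. The table name, column family and key layout are hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class EventStore {
    // Composite row key: entity id + zero-padded timestamp, so a prefix
    // Scan returns an entity's events in time-sorted order.
    private static byte[] rowKey(String entityId, long eventTimeMillis) {
        return Bytes.toBytes(String.format("%s|%013d", entityId, eventTimeMillis));
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("events"))) {  // hypothetical table

            // Put: store the raw JSON payload and a plain-text summary as cell values.
            Put put = new Put(rowKey("loan-42", System.currentTimeMillis()));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("json"),
                          Bytes.toBytes("{\"amount\": 1000}"));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("text"),
                          Bytes.toBytes("payment received"));
            table.put(put);

            // Scan: the composite key makes a prefix scan return rows in sorted order.
            Scan scan = new Scan().setRowPrefixFilter(Bytes.toBytes("loan-42|"));
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result result : scanner) {
                    System.out.println(Bytes.toString(result.getRow()));
                }
            }
        }
    }
}
```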

Hadoop Developer

Confidential, New Brunswick, NJ

Responsibilities:

  • Involved in requirement analysis, design, coding and implementation.
  • Responsible for building scalable distributed data solutions using Cloudera Hadoop.
  • Installed Oozie workflow engine to run multiple Hive and Pig jobs.
  • Experience in supporting data analysis projects by using Elastic MapReduce on the Amazon Web Services (AWS) cloud; performed export and import of data into S3.
  • Processed data into HDFS by developing solutions and analyzed the data using MapReduce, Pig, and Hive to produce summary results from Hadoop for downstream systems.
  • Used Sqoop to import the data from Hadoop Distributed File System (HDFS) to RDBMS.
  • Developed custom MapReduce programs to analyze data and used Pig Latin to clean unwanted data.
  • Streamed AWS log groups into a Lambda function to create ServiceNow incidents.
  • Participated in designing the Solr schema and ingested data into Solr for data indexing.
  • Extensive experience in designing and implementing Data Flow pipeline from RDBMS to Hadoop.
  • Worked on the Hortonworks sandbox.
  • Worked on S3 buckets on AWS to store CloudFormation templates.
  • Worked on AWS to create EC2 instances.
  • Worked on various performance optimizations like using the distributed cache for small datasets, partitioning, bucketing and map-side joins (see the sketch at the end of this section).
  • Involved in creating Hive tables and applying HQL queries on those tables for data validation.
  • Responsible for installation and configuration of Hive, Pig, HBase and Sqoop on the Hadoop cluster.
  • Involved in loading and transforming large sets of structured, semi-structured and unstructured data and analyzed them by running Hive queries and Pig scripts.
  • Used Zookeeper to manage coordination among the clusters.
  • Worked with Impala to pull the data from Hive tables.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs which run independently based on time and data availability.
  • Involved in cluster maintenance, cluster monitoring and troubleshooting, and managing and reviewing data backups and log files.

Environment: HDFS, Hadoop, Pig, Hive, Sqoop, Flume, MapReduce, Oozie, MongoDB, Java 6/7, Oracle 10g, Subversion, Toad, UNIX Shell Scripting, SOAP, REST services, Agile Methodology, JIRA, AutoSys.
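
Illustrative sketch of one of the performance optimizations listed above: a map-side join of a small reference dataset shipped through the distributed cache, so no reduce phase is needed. File paths, field positions and the join logic are hypothetical.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MapSideJoin {

    public static class JoinMapper extends Mapper<LongWritable, Text, Text, Text> {
        private final Map<String, String> lookup = new HashMap<>();

        @Override
        protected void setup(Context context) throws IOException {
            // The small dataset was shipped with the job via the distributed cache;
            // "lookup" is the symlink name given when the file was cached (see "#lookup" below).
            try (BufferedReader reader = new BufferedReader(new FileReader("lookup"))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    String[] parts = line.split(",", 2);
                    lookup.put(parts[0], parts[1]);
                }
            }
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",");
            // Join on the first field against the in-memory lookup table; no reduce phase needed.
            String joined = lookup.getOrDefault(fields[0], "UNKNOWN");
            context.write(new Text(fields[0]), new Text(joined + "," + value));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "map-side-join");
        job.setJarByClass(MapSideJoin.class);
        job.setMapperClass(JoinMapper.class);
        job.setNumReduceTasks(0);                       // map-only job
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        job.addCacheFile(new URI("/reference/small_dataset.csv#lookup")); // hypothetical path
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```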

Hadoop Developer

Confidential, Norwalk, CT

Responsibilities:

  • Worked on importing and exporting data between DB2 and HDFS using Sqoop.
  • Used Flume to collect, aggregate and store the web log data from different sources like web servers, mobile and network devices to be pushed into HDFS.
  • Developed MapReduce programs in Java to convert data from JSON format to CSV and TSV formats to perform analytics (see the sketch at the end of this section).
  • Developed Pig Latin scripts for cleansing and analysis of semi-structured data.
  • Experienced in debugging MapReduce jobs and Pig scripts.
  • Used Pig as an ETL tool to do transformations, event joins and pre-aggregations before storing the data into HDFS.
  • Experience in creating Hive tables, loading them with data and writing Hive queries.
  • Experience in migrating ETL processes from relational databases to Hive to enable easier data manipulation.
  • Used Hive to analyze the partitioned and bucketed data to compute various metrics for reporting.
  • Written Hive and Pig UDFs to perform aggregation to support the business use case.
  • Performed MapReduce integration to import large amounts of data into HBase.
  • Experience with performing CRUD operations using the HBase Java client API.
  • Developed shell scripts to automate MapReduce jobs to process data.

Environment: CDH3, Cloudera Manager, Java, shell, SQL, Hadoop, HDFS, Sqoop, Flume, MapReduce, Pig, Hive, Oracle, MongoDB, HBase, JDK 1.7, TDD and Agile SCRUM.
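
Illustrative sketch of the JSON-to-CSV conversion mapper described above, assuming one JSON record per input line, Jackson on the job classpath, and hypothetical field names; the driver would run it as a map-only job (setNumReduceTasks(0)), and a TSV variant would only change the delimiter.

```java
import java.io.IOException;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

/** Map-only job: each input line is one JSON record, emitted as one CSV line. */
public class JsonToCsvMapper extends Mapper<LongWritable, Text, NullWritable, Text> {

    private final ObjectMapper mapper = new ObjectMapper();
    private final Text out = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        JsonNode record = mapper.readTree(value.toString());
        // Field names are hypothetical; missing fields become empty strings.
        String csv = String.join(",",
                record.path("id").asText(""),
                record.path("event_type").asText(""),
                record.path("timestamp").asText(""));
        out.set(csv);
        context.write(NullWritable.get(), out);
    }
}
```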

Java Developer

Confidential

Responsibilities:

  • Created UML class diagrams that depict the code's design and its compliance with the functional requirements.
  • Used J2EE design patterns for the middle tier development.
  • Developed EJBs in WebLogic for handling business processes, database access and asynchronous messaging.
  • Used the JavaMail notification mechanism to send confirmation emails to customers about scheduled payments.
  • Developed Message-Driven Beans in collaboration with the Java Message Service (JMS) to communicate with merchant systems (see the sketch at the end of this section).
  • Also involved in writing JSPs/JavaScript and Servlets to generate dynamic web pages and web content.
  • Wrote stored procedures and Triggers using PL/SQL.
  • Involved in building and parsing XML documents using JAX parser.
  • Deployed the application on Tomcat Application Server.
  • Experience in implementing Web Services and XML/HTTP technologies.
  • Created UNIX shell and Perl utilities for testing, data parsing and manipulation.
  • Used Log4J for log file generation and maintenance.
  • Wrote JUnit test cases for testing.

Environment: Java, JDK 6, JDBC, Servlets, JSP, Struts, Eclipse, Oracle 9i/10g/11g, CVS, JavaScript, Log4J, J2EE, EJB, Web Services, Spring, SOAP, WSDL, Application Server, SQL, XML, XPATH, XSD, HTML, TFS, JUnit, CSS.
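
Illustrative sketch of the JMS Message-Driven Bean integration described above, shown in the annotation-based EJB 3 style (the original WebLogic work may have used deployment descriptors instead); the queue name and payload handling are hypothetical.

```java
import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.TextMessage;

// Listens on a hypothetical queue carrying merchant payment notifications.
@MessageDriven(activationConfig = {
        @ActivationConfigProperty(propertyName = "destinationType",
                                  propertyValue = "javax.jms.Queue"),
        @ActivationConfigProperty(propertyName = "destination",
                                  propertyValue = "jms/MerchantPaymentsQueue")
})
public class MerchantPaymentListener implements MessageListener {

    @Override
    public void onMessage(Message message) {
        try {
            if (message instanceof TextMessage) {
                String payload = ((TextMessage) message).getText();
                // Hand the payload off to the business layer (details omitted).
                process(payload);
            }
        } catch (JMSException e) {
            throw new RuntimeException("Failed to read merchant message", e);
        }
    }

    private void process(String payload) {
        // Placeholder for the actual merchant-system integration logic.
    }
}
```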

Java/J2EE Developer

Confidential

Responsibilities:

  • Involved in Analysis, Design, Development, Integration and testing of the application modules.
  • Development of front end using HTML and JSP.
  • Involved in integrating Hibernate with the backend database.
  • Used the JDBC API for connections to the Oracle 9i database (see the sketch at the end of this section).
  • Worked on Eclipse 3.1 IDE in developing and debugging the application.
  • Designed and developed JMS messaging services and Message-Driven Beans to listen for messages on the queue for interactions with the client ordering data.
  • Prepared documentation and provided time estimates.
  • Built administrative pages using JavaScript.
  • Involved in developing the helper classes for the better data exchange between the MVC layers.
  • Worked on fixing defects in Internet Explorer and Firefox, using the Firefox debugger.

Environment: HTML, JSP, Hibernate, JDBC API, Oracle 9i, Spring, WebLogic, Red Hat Linux 5.0, JMS, JavaScript.
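
Illustrative sketch of the JDBC access to Oracle mentioned above; the connection details, table and query are placeholders, and try-with-resources is used for brevity even though the original, pre-Java 7 code would have closed resources in finally blocks.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class OrderDao {

    // Placeholder connection details; real values would come from configuration.
    private static final String URL = "jdbc:oracle:thin:@dbhost:1521:ORCL";
    private static final String USER = "app_user";
    private static final String PASSWORD = "secret";

    /** Looks up an order status by id using a parameterized query. */
    public String findOrderStatus(long orderId) throws SQLException {
        String sql = "SELECT status FROM orders WHERE order_id = ?";
        try (Connection conn = DriverManager.getConnection(URL, USER, PASSWORD);
             PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setLong(1, orderId);
            try (ResultSet rs = stmt.executeQuery()) {
                return rs.next() ? rs.getString("status") : null;
            }
        }
    }
}
```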
