Hadoop Developer Resume
Long Island, NY
SUMMARY:
- 8+ years of cradle-to-grave IT experience in the Retail, Insurance, and Healthcare industries, performing the roles of Hadoop Developer, Data Warehousing Developer, and Java Developer, including extensive use of Big Data tools such as Hadoop, Hive, Pig, Sqoop, Kafka, Flume, Spark, and MapReduce programming.
- Working experience with the MapReduce programming model and the Hadoop Distributed File System (HDFS).
- Performed architecture design, data modeling, and implementation of Big Data platforms.
- Hands-on experience with major components of the Hadoop ecosystem such as Flume, HBase, ZooKeeper, Oozie, Hive, Sqoop, Pig, Apache Falcon, and YARN (MR2).
- Maintained and optimized AWS infrastructure (EMR, EC2, S3, EBS/Provisioned IOPS, AMI, RDS, and IAM roles for users/systems).
- Developed scripts and numerous batch process jobs to schedule various Hadoop programs.
- Experience with the Amazon, Cloudera, and Hortonworks Hadoop distributions.
- Worked on importing and exporting data from different databases such as Oracle, MySQL, and SQL Server into HDFS using Sqoop.
- Strong experience in collecting and storing streaming data, such as log data and Twitter data, into HDFS using Apache Flume.
- Knowledge of Talend Big Data integration for meeting business demands on Hadoop and NoSQL platforms.
- Real-time data ingestion using the Big Data technology stack (Spark Streaming).
- Used Spark Streaming to consume topics from the distributed messaging source Kafka and periodically push batches of data to Spark for real-time processing (see the sketch following this summary).
- Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
- Experienced in applying MapReduce design patterns to solve complex data processing problems.
- Built secured AWS solutions by creating VPCs with private and public subnets.
- Involved in creating tables, partitioning, bucketing, and creating UDFs in Hive.
- Implemented join operations and involved in writing data transformation using PIG Latin.
- Extensive knowledge of NoSQL databases like HBase and Cassandra.
- Experienced in performing CRUD operations using the HBase Java client API and REST API.
- Good knowledge of the Oozie workflow engine to automate and parallelize Hadoop MapReduce, Hive, and Pig jobs.
- Excellent team player with multi-tasking ability, detail oriented, quick learner, self-motivated and performing under pressure in a rapidly changing environment.
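A minimal Scala sketch of the Spark Streaming + Kafka ingestion pattern described above. The broker address, topic name, consumer group, batch interval, and HDFS output path are hypothetical placeholders, and the exact dependencies depend on the Spark and Kafka versions in use (this sketch assumes the spark-streaming-kafka-0-10 integration).

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

object ClickstreamIngest {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ClickstreamIngest")
    // Micro-batches every 30 seconds; the interval is illustrative
    val ssc = new StreamingContext(conf, Seconds(30))

    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "broker1:9092",               // hypothetical broker
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "clickstream-consumers",      // hypothetical consumer group
      "auto.offset.reset"  -> "latest"
    )

    // Subscribe to a hypothetical "clickstream" topic
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](Seq("clickstream"), kafkaParams)
    )

    // Persist each non-empty micro-batch of message values to HDFS for downstream processing
    stream.map(_.value).foreachRDD { (rdd, time) =>
      if (!rdd.isEmpty())
        rdd.saveAsTextFile(s"hdfs:///data/clickstream/batch-${time.milliseconds}")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```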
TECHNICAL SKILLS:
- Big Data Platforms: Cloudera, Hadoop, YARN, MapReduce, Pig, Hive, Storm, Kafka, Oozie, Impala, Ignite, Flume, Kinesis, and Spark
- Languages: Java, C++, Python
- Databases: Oracle, MySQL, SQL Server
- NoSQL Databases: HBase, Cassandra, MongoDB, Accumulo
- Job Scheduling Frameworks: AutoSys, Quartz Scheduler
- Operating Systems: Linux, Unix, Windows 7, Windows 8, Windows XP, Windows Vista
- Hadoop Distributions: Cloudera, Hortonworks, AWS
- Web Technologies: HTML, XHTML, JavaScript
- Data Modeling Tools: MS Visio, Rational Rose
- Work Environments: Eclipse
PROFESSIONAL EXPERIENCE:
Hadoop Developer
Confidential, Long Island, NY
Responsibilities:
- Involved in requirement analysis, design, coding and implementation.
- Designed an ETL data pipeline to ingest data from an RDBMS source into Hadoop using shell scripts, Sqoop, and MySQL.
- Wrote Hive queries to build a consolidated view of the mortgage and retail data (see the sketch following this role).
- Integrated Apache Storm with Kafka to perform web analytics and to move clickstream data from Kafka to HDFS.
- Loaded data back into Teradata for Basel reporting and for business users to analyze and visualize the data using Datameer.
- Implemented using Cloudera (CDH 4.5) distribution.
- Used Cloudera Manager to monitor the Hadoop ecosystem.
- Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
- Analyzed current data sources and schemas based on the use case documentation provided, and developed programs and scripts to complete data ingestion into the Hadoop cluster.
- Responsible for managing data coming from different sources.
- Involved in designing the HBase row key to store text and JSON as key values in HBase tables, structuring the row key so that rows can be retrieved and scanned in sorted order.
- Supported MapReduce programs running on the cluster.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
- The above scripts were written to distribute queries for performance test jobs in the Amazon data lake.
- Involved in Hadoop cluster tasks such as adding and removing nodes without any effect on running jobs and data.
- Worked on MongoDB using CRUD (Create, Read, Update and Delete), Indexing, Replication, and Sharding features.
- Orchestrated hundreds of Sqoop scripts, Pig scripts, Hive queries using Oozie workflow and sub-workflows.
- Loaded files from mainframes into Hadoop; the files were converted to ASCII format.
- Developed Pig Latin scripts to replace the existing home loans legacy process on Hadoop, with the data fed to the retail legacy mainframe systems.
Environment: Hadoop Distributed File System (HDFS), Spark, MapReduce, Hive, Pig, Sqoop, Kafka, SOAP, Web services, JUnit, Maven, and Oozie.
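A minimal Scala/Spark-SQL sketch of the consolidated mortgage/retail view mentioned in this role. It assumes a Spark 2.x-style SparkSession with Hive support; the table and column names (mortgage_txn, retail_txn, account_id, amount, reporting.consolidated_mortgage_retail) are hypothetical stand-ins for the actual schemas.

```scala
import org.apache.spark.sql.SparkSession

object ConsolidatedView {
  def main(args: Array[String]): Unit = {
    // enableHiveSupport lets Spark SQL read tables registered in the Hive metastore
    val spark = SparkSession.builder()
      .appName("MortgageRetailConsolidatedView")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical Hive tables; real names and columns come from the source systems
    val consolidated = spark.sql("""
      SELECT m.account_id,
             SUM(m.amount) AS mortgage_total,
             SUM(r.amount) AS retail_total
      FROM   mortgage_txn m
      JOIN   retail_txn   r ON m.account_id = r.account_id
      GROUP  BY m.account_id
    """)

    // Materialize the consolidated view as a Hive table for reporting / export to Teradata
    consolidated.write.mode("overwrite").saveAsTable("reporting.consolidated_mortgage_retail")

    spark.stop()
  }
}
```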
Hadoop Developer
Confidential, New Brunswick, NJ
Responsibilities:
- Involved in requirement analysis, design, coding and implementation.
- Responsible for building scalable distributed data solutions using Hadoop Cloudera.
- Installed Oozie workflow engine to run multiple Hive and Pig jobs.
- Experience supporting data analysis projects using Elastic MapReduce (EMR) on the Amazon Web Services (AWS) cloud; performed export and import of data into S3.
- Processed data into HDFS by developing solutions and analyzed the data using MapReduce, Pig, and Hive to produce summary results from Hadoop for downstream systems.
- Used Sqoop to import the data from Hadoop Distributed File System (HDFS) to RDBMS.
- Built custom MapReduce programs to analyze data and used Pig Latin to clean unwanted data.
- Streamed AWS log groups into a Lambda function to create ServiceNow incidents.
- Participated in SOLR schema design and ingested data into SOLR for data indexing.
- Extensive experience in designing and implementing Data Flow pipeline from RDBMS to Hadoop.
- Worked on the Hortonworks sandbox.
- Worked on S3 buckets on AWS to store CloudFormation templates.
- Worked on AWS to create EC2 instances.
- Worked on various performance optimizations such as using the distributed cache for small datasets, partitioning, bucketing, and map-side joins (see the sketch following this role).
- Involved in creating Hive tables and running HiveQL queries on those tables for data validation.
- Responsible for installation and configuration of Hive, Pig, HBase, and Sqoop on the Hadoop cluster.
- Involved in loading and transforming large sets of structured, semi-structured, and unstructured data and analyzed them by running Hive queries and Pig scripts.
- Used ZooKeeper to manage coordination among the clusters.
- Worked with Impala to pull the data from Hive tables.
- Installed Oozie workflow engine to run multiple Hive and Pig jobs which run independently based on time and data availability.
- Involved in cluster maintenance, cluster monitoring and troubleshooting, and managing and reviewing data backups and log files.
Environment: HDFS, Hadoop, Pig, Hive, Sqoop, Flume, MapReduce, Oozie, MongoDB, Java 6/7, Oracle 10g, Subversion, Toad, UNIX Shell Scripting, SOAP, REST services, Agile Methodology, JIRA, AutoSys.
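A minimal Scala sketch illustrating the small-table/map-side-join optimization mentioned in this role, expressed in Spark terms for brevity rather than the Hive/Pig jobs actually used: broadcasting a small dimension table to every executor plays the same role as a Hive map-side join backed by the distributed cache. The table names (claims_fact, provider_dim), partition column, and S3 output path are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

object ClaimsSummary {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ClaimsSummary")
      .enableHiveSupport()
      .getOrCreate()

    // Large, partitioned Hive fact table: prune to a single partition before joining
    val claims = spark.table("claims_fact").where("claim_date = '2016-06-01'")

    // Small dimension table: broadcast it so the join happens map-side, with no shuffle of the fact table
    val providers = spark.table("provider_dim")

    val summary = claims
      .join(broadcast(providers), "provider_id")
      .groupBy("provider_name")
      .count()

    // Hypothetical S3 location for summary output on the EMR cluster
    summary.write.mode("overwrite").parquet("s3a://analytics-bucket/summaries/claims_by_provider/")

    spark.stop()
  }
}
```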
Hadoop Developer
Confidential, Norwalk, CT
Responsibilities:
- Worked on importing and exporting data between DB2 and HDFS using Sqoop.
- Used Flume to collect, aggregate, and store web log data from different sources such as web servers, mobile and network devices, and push it into HDFS.
- Developed MapReduce programs in Java to convert data from JSON format to CSV and TSV formats to perform analytics.
- Developed Pig Latin scripts for cleansing and analysis of semi-structured data.
- Experienced in debugging MapReduce jobs and Pig scripts.
- Used Pig as an ETL tool to do transformations, event joins, and pre-aggregations before storing the data into HDFS.
- Experience in creating Hive tables, loading them with data, and writing Hive queries.
- Experience in migrating ETL processes from relational databases to Hive to enable easier data manipulation.
- Used Hive to analyze the partitioned and bucketed data to compute various metrics for reporting.
- Wrote Hive and Pig UDFs to perform aggregations supporting the business use case.
- Performed MapReduce integration to import large amounts of data into HBase.
- Experience performing CRUD operations using the HBase Java client API (see the sketch following this role).
- Developed shell scripts to automate MapReduce jobs to process data.
Environment: CDH3, Cloudera Manager, Java, shell, SQL, Hadoop, HDFS, Sqoop, Flume, MapReduce, Pig, Hive, Oracle, MongoDB, HBase, JDK 1.7, TDD and Agile SCRUM.
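A minimal Scala sketch of the HBase CRUD operations referenced in this role, written against the newer Connection/Table client API for clarity (the older HTable-based API available on CDH3 follows the same put/get/delete pattern). The table name, column family, qualifier, and composite row key are hypothetical.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Delete, Get, Put}
import org.apache.hadoop.hbase.util.Bytes

object HBaseCrudExample {
  def main(args: Array[String]): Unit = {
    // Connection settings are picked up from hbase-site.xml on the classpath
    val conf = HBaseConfiguration.create()
    val connection = ConnectionFactory.createConnection(conf)
    // Hypothetical table "web_events" with a single column family "d"
    val table = connection.getTable(TableName.valueOf("web_events"))

    // Illustrative composite row key: entity id plus date so rows scan in sorted order per entity
    val rowKey = Bytes.toBytes("user-1001#20160601")

    // Create / Update
    val put = new Put(rowKey)
    put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("page"), Bytes.toBytes("/checkout"))
    table.put(put)

    // Read
    val result = table.get(new Get(rowKey))
    val page = Bytes.toString(result.getValue(Bytes.toBytes("d"), Bytes.toBytes("page")))
    println(s"page = $page")

    // Delete
    table.delete(new Delete(rowKey))

    table.close()
    connection.close()
  }
}
```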
Java Developer
Confidential
Responsibilities:
- Created UML class diagrams that depict the code's design and its compliance with the functional requirements.
- Used J2EE design patterns for the middle tier development.
- Developed EJBs in WebLogic for handling business processes, database access, and asynchronous messaging.
- Used the JavaMail notification mechanism to send confirmation emails to customers about scheduled payments.
- Developed Message-Driven Beans in collaboration with the Java Message Service (JMS) to communicate with merchant systems.
- Also involved in writing JSPs/JavaScript and Servlets to generate dynamic web pages and web content.
- Wrote stored procedures and Triggers using PL/SQL.
- Involved in building and parsing XML documents using JAX parser.
- Deployed the application on Tomcat Application Server.
- Experience in implementing Web Services and XML/HTTP technologies.
- Created UNIX shell and Perl utilities for testing, data parsing and manipulation.
- Used Log4J for log file generation and maintenance.
- Wrote JUnit test cases for testing.
Environment: Java, JDK 6, J2EE, JDBC, Servlets, JSP, Struts, Spring, EJB, Web Services, SOAP, WSDL, Eclipse, CVS, TFS, Oracle 9i/10g/11g, SQL, JavaScript, Log4J, Application Server, XML, XPATH, XSD, HTML, JUnit, CSS.
Java/J2EE Developer
Confidential
Responsibilities:
- Involved in Analysis, Design, Development, Integration and testing of the application modules.
- Developed the front end using HTML and JSP.
- Involved in integrating Hibernate with the backend database.
- Used the JDBC API for connecting to the Oracle 9i database.
- Worked on Eclipse 3.1 IDE in developing and debugging the application.
- Designed and developed JMS messaging services and Message-Driven Beans to listen to the messages in the queue for interactions with the client ordering data.
- Prepared documentation and provided time estimates.
- Building administrative pages using JavaScript.
- Involved in developing helper classes for better data exchange between the MVC layers.
- Worked on fixing defects in Internet Explorer and Firefox; used the Firefox debugger for the same.
Environment: HTML, JSP, Hibernate, JDBC API, Oracle 9i, Spring, WebLogic, Red Hat Linux 5.0, JMS, JavaScript.