Sr Hadoop Developer Resume
Germantown, MD
SUMMARY:
- 8+ years of experience with multinational clients, including 4 years of Hadoop architecture experience developing Big Data / Hadoop applications.
- Hands-on experience with the Hadoop stack (MapReduce, HDFS, Sqoop, Pig, Hive, YARN, HBase, Flume, Oozie, Zookeeper, Spark and Kafka).
- Well experienced in designing and developing both server-side and client-side applications.
- Experience in developing custom MapReduce programs in Java using Apache Hadoop to analyze Big Data as per requirements.
- Experience in extending Hive and Pig core functionality with custom UDFs (see the sketch after this list).
- Worked extensively with Hive DDL and HiveQL.
- Experience in importing and exporting data with Sqoop between HDFS and relational database systems/mainframes.
- Used Flume to channel data from different sources to HDFS.
- Experience in developing data pipelines using Kafka to store data in HDFS.
- Used Spark Streaming to divide streaming data into batches as input to the Spark engine for batch processing.
- Worked on NoSQL databases including HBase, Cassandra and MongoDB.
- Experience in managing and reviewing Hadoop log files.
- Assisted in cluster maintenance, cluster monitoring, and managing and reviewing data backups and log files.
- Experience in HBase Cluster Setup and Implementation.
- Experience with Hortonworks/Cloudera/MapR/Amazon Web Services distributions.
- Detailed knowledge and experience of designing, developing and testing software solutions using Java and J2EE technologies.
- Good experience in Tableau Desktop, Tableau Server and Tableau Reader across various versions of Tableau (6, 7, 8.x and 9.x).
- Good knowledge of building authorized data sources on Netezza and Hadoop.
- Familiar with the Java Virtual Machine (JVM) and multi-threaded processing.
- Experience in database design using PL/SQL to write stored procedures, functions and triggers, and strong experience in writing complex queries for Oracle.
- Good knowledge of creating ETL jobs through Talend to load huge volumes of data into Cassandra, the Hadoop ecosystem and relational databases.
- Knowledge in installing, configuring and deploying Hadoop distributions in cloud environments (Amazon Web Services).
- Good knowledge of data modelling and data mining to model data as per business requirements.
- Experienced in all facets of Software Development Life Cycle (Analysis, Design, Development, Testing and maintenance) using Waterfall and Agile methodologies.
- Strong problem-solving and analytical skills and the ability to make balanced, independent decisions.
- Good team player with strong interpersonal, organizational and communication skills, combined with self-motivation, initiative and project-management attributes.
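As a minimal sketch of the custom Hive UDF work noted above: the class name, function name and normalization rule here are illustrative assumptions, not taken from any specific project.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF that normalizes free-form text columns before analysis.
// Registered in Hive with, e.g.:
//   CREATE TEMPORARY FUNCTION normalize_text AS 'NormalizeTextUDF';
public final class NormalizeTextUDF extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null; // Hive convention: NULL in, NULL out.
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}
```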
TECHNICAL SKILLS:
Big Data/Hadoop Ecosystem: HDFS, MapReduce, Hive, Pig, Storm, Sqoop, Flume, Spark, Impala, YARN, Kafka, Oozie and Zookeeper
Hadoop Distributions: Cloudera, Hortonworks
Java / J2EE Technologies: Core Java, Servlets, JSP, JDBC, XML
Programming Languages: C, C++, Java, Python, Scala, SQL, PL/SQL, Linux shell scripts.
NoSQL Databases: HBase, MongoDB, Cassandra
Database: Oracle 11g/10g, DB2, MS-SQL Server, MySQL, Teradata.
Web Technologies: HTML, XML, JDBC, JSP, JavaScript, AJAX.
Frameworks: MVC, Struts 2/1, Hibernate 3, Spring 3/2.5/2, JUnit.
Tools: Eclipse, NetBeans
Operating Systems: Ubuntu (Linux), Windows, Mac OS, CentOS
ETL/Reporting Tools: Talend, Tableau
Testing: Hadoop Testing, Hive Testing, Quality Center (QC)
PROFESSIONAL EXPERIENCE:
Confidential, Germantown, MD
Sr Hadoop Developer
Responsibilities:
- Worked on importing data from various sources and performed transformations using MapReduce and Hive to load data into HDFS.
- Responsible for building scalable distributed data solutions using Hadoop.
- Wrote various Hive and Pig scripts.
- Worked on tuning the performance of Hive queries.
- Created HBase tables to store variable data formats coming from different portfolios.
- Performed real-time analytics on HBase using the Java API and REST API (see the sketch after this list).
- Implemented HBase coprocessors to notify the support team when data is inserted into HBase tables.
- Configured Sqoop jobs to import data from RDBMS into HDFS using Oozie workflows.
- Worked on setting up Pig, Hive and HBase on multiple nodes and developed applications using Pig, Hive, HBase and MapReduce.
- Implemented Partitioning, Dynamic partitions and Bucketing in Hive.
- Worked on compression mechanisms to optimize MapReduce Jobs.
- Analyzed customer behavior by performing clickstream analysis, and used Flume to ingest the data.
- Developed Spark scripts by using Scala shell commands as per the requirement.
- Migrated existing MapReduce programs to Spark using Scala and Python.
- Created RDDs and pair RDDs for Spark programming.
- Solved the small-file problem by processing Sequence files in MapReduce.
- Implemented business logic by writing UDFs in Java and used various UDFs from Piggybank and other sources.
- Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
- Worked on Oozie workflows to run multiple jobs.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Analyzed the Hadoop log files using Pig scripts to identify errors.
- Presented data and data flows using Talend for reusability.
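A minimal sketch of the HBase Java API work above, using the HBase 1.x client API; the table name, "d" column family and row-key layout are hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class PortfolioEventClient {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             // "portfolio_events" and the "d" column family are illustrative.
             Table table = connection.getTable(TableName.valueOf("portfolio_events"))) {

            // Write one event, keyed by portfolio id plus timestamp.
            Put put = new Put(Bytes.toBytes("portfolio42#20160101"));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("amount"), Bytes.toBytes("1250.00"));
            table.put(put);

            // Point lookup of the same row, as used in real-time analytics.
            Get get = new Get(Bytes.toBytes("portfolio42#20160101"));
            Result result = table.get(get);
            System.out.println(Bytes.toString(
                    result.getValue(Bytes.toBytes("d"), Bytes.toBytes("amount"))));
        }
    }
}
```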
Environment: Hortonworks, MapReduce, HBase, HDFS, Hive, Pig, SQL, Cloudera Manager, Sqoop, Flume, Spark, Oozie, Java (JDK 1.6), Eclipse
Confidential, Milpitas, CA
Hadoop Developer
Responsibilities:
- Developed Java MapReduce programs for the analysis of sample log files stored in the cluster.
- Developed MapReduce programs for data analysis and data cleaning.
- Developed Pig Latin scripts for the analysis of semi-structured data.
- Worked on analyzing Hadoop clusters using Big Data analytic tools including MapReduce, Pig and Hive.
- Responsible for managing data coming from different sources.
- Installed, configured, upgraded and administered Linux operating systems.
- Managed patching, monitored system performance and network communication, and handled backups, risk mitigation, troubleshooting, application enhancements, software upgrades and modifications of the Linux servers.
- Used Hive, created Hive tables, and was involved in data loading and writing Hive UDFs.
- Optimized Hive tables using techniques like partitioning and bucketing to improve the performance of HiveQL queries.
- Involved in developing Pig scripts to store unstructured data in HDFS.
- Used Sqoop to import data into HDFS and Hive from other data systems.
- Migrated ETL processes from MySQL to Hive to simplify data manipulation.
- Developed Hive queries to process the data for visualizing.
- Developed Spark code and Spark-SQL/Streaming for faster testing and processing of data.
- Used Impala to read, write and query the Hadoop data in HDFS from HBase or Cassandra.
- Utilized Tableau capabilities such as data extracts, data blending, forecasting, dashboard actions and table calculations.
- Implemented Spark SQL to connect to Hive, read the data and distribute processing in a highly scalable way (see the sketch after this list).
- Implemented test scripts to support test driven development and continuous integration.
- Responsible for building scalable distributed data solutions using Hadoop.
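A minimal sketch of Spark SQL reading a Hive table as described above, written against the Spark 2.x Java API; the sales table and its columns are illustrative assumptions.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class HiveQueryJob {
    public static void main(String[] args) {
        // enableHiveSupport() lets Spark SQL resolve tables in the Hive
        // metastore, so the query below runs distributed across the cluster.
        SparkSession spark = SparkSession.builder()
                .appName("HiveQueryJob")
                .enableHiveSupport()
                .getOrCreate();

        // Hypothetical partitioned sales table.
        Dataset<Row> totals = spark.sql(
                "SELECT customer_id, SUM(amount) AS total "
                + "FROM sales WHERE dt = '2016-01-01' "
                + "GROUP BY customer_id");
        totals.show();

        spark.stop();
    }
}
```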
Environment: HDFS, MapReduce, Hive, Sqoop, Pig, Impala, HBase, Oozie, CDH distribution, MySQL, Tableau, Java, Eclipse, Shell Scripts, Spark, Windows, Linux.
Confidential, Washington, DC
Hadoop Developer
Responsibilities:
- Developed Java web services as part of functional requirements.
- Configured the Hadoop environment in the cloud through Amazon Web Services (AWS) to provide a scalable distributed data solution.
- Installed and configured Hadoop and responsible for maintaining cluster and managing and reviewing Hadoop log files.
- Developed MapReduce programs in Java for Data Analysis.
- Integrated Kafka with Storm for real time data processing.
- Loaded data from various data sources into HDFS.
- Worked on Cloudera to analyze data present on top of HDFS.
- Worked extensively on Hive and Pig.
- Worked on large sets of structured, semi-structured and unstructured data.
- Used Sqoop to import and export data between HDFS and the Oracle RDBMS.
- Developed Pig Latin scripts to explore and transform the data.
- Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
- Developed MapReduce (YARN) programs to cleanse data in HDFS obtained from heterogeneous data sources and make it suitable for ingestion into the Hive schema for analysis (see the sketch after this list).
- Installed and configured Linux on new hardware.
- Wrote build scripts using Ant and participated in the deployment of one or more production systems.
- Involved in testing and coordinated with the business during user testing.
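Below is a minimal sketch of such a cleansing job as a map-only MapReduce program; the tab-delimited input format and the five-field validity rule are hypothetical.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CleanseJob {

    // Map-only job: emit only well-formed records so the Hive load succeeds.
    public static class CleanseMapper extends Mapper<Object, Text, Text, NullWritable> {
        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t", -1);
            // Hypothetical rule: keep rows with exactly five tab-separated fields.
            if (fields.length == 5) {
                context.write(value, NullWritable.get());
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "cleanse");
        job.setJarByClass(CleanseJob.class);
        job.setMapperClass(CleanseMapper.class);
        job.setNumReduceTasks(0); // no reduce phase needed for filtering
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```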
Environment: Apache Hadoop 0.20.203, Cloudera Manager (CDH3), HDFS, Java MapReduce, Eclipse, Hive, Pig, Sqoop, SQL, Oracle 11g, AWS, YARN, Kafka.
Confidential, Herndon, VA
J2EE Developer
Responsibilities:
- Involved in the design of core implementation logic.
- Extensively worked on application development using Spring MVC and Hibernate frameworks.
- Extensively used Spring JdbcTemplate to implement DAO methods (see the sketch after this list).
- Used WebSphere as the application server and Apache Maven to build and deploy the application to WebSphere.
- Performed unit testing using JUnit.
- Developed JAX-WS client and JAX-WS web services to coordinate with outer systems.
- Involved in design of data migration strategy to migrate the data from legacy system to Kenan FX 2.0 billing system.
- Involved in the design of staging database as part of migration strategy.
- Developed efficient PL/SQL packages for data migration and involved in bulk loads, testing and reports generation.
- Involved in testing the Business Logic layer and Data Access layer using JUnit.
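A minimal sketch of a JdbcTemplate-based DAO method in the style described above; the Account entity, table and columns are hypothetical.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.List;

import javax.sql.DataSource;

import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.jdbc.core.RowMapper;

public class AccountDao {

    // Hypothetical domain object for illustration.
    public static class Account {
        public final long id;
        public final String name;
        public final String status;

        public Account(long id, String name, String status) {
            this.id = id;
            this.name = name;
            this.status = status;
        }
    }

    private final JdbcTemplate jdbcTemplate;

    public AccountDao(DataSource dataSource) {
        this.jdbcTemplate = new JdbcTemplate(dataSource);
    }

    // Parameterized query; the RowMapper converts each row into an Account.
    public List<Account> findByStatus(String status) {
        return jdbcTemplate.query(
                "SELECT id, name, status FROM accounts WHERE status = ?",
                new Object[] { status },
                new RowMapper<Account>() {
                    public Account mapRow(ResultSet rs, int rowNum) throws SQLException {
                        return new Account(rs.getLong("id"),
                                rs.getString("name"), rs.getString("status"));
                    }
                });
    }
}
```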
Environment: Java, J2EE, Spring JDBC, Hibernate, WebSphere, TOAD, Oracle, Kenan FX 2.0, Chordiant CRM, PL/SQL
Confidential
Jr Java Developer
Responsibilities:
- Member of application development team at Vsoft.
- Implemented the presentation layer with HTML, CSS and JavaScript
- Developed web components using JSP, Servlets and JDBC
- Implemented secured cookies using Servlets.
- Wrote complex SQL queries and stored procedures.
- Implemented the persistence layer using the Hibernate API
- Implemented transaction and session handling using Hibernate utility classes
- Implemented search queries using the Hibernate Criteria interface (see the sketch after this list)
- Provided support for loans reports for CB&T
- Designed and developed Loans reports for Evans bank using Jasper and iReport.
- Involved in fixing bugs and unit testing with test cases using JUnit
- Resolved issues on outages for Loans reports.
- Maintained Jasper server on client server and resolved issues.
- Actively involved in system testing.
- Fine-tuned SQL queries for maximum efficiency to improve performance
- Designed tables and indexes following normalization
- Involved in Unit testing, Integration testing and User Acceptance testing
- Used Java and SQL day to day to debug and fix issues with client processes
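A minimal sketch of a Hibernate 3 Criteria query in the style referenced above; the Loan entity and its properties are hypothetical and would need a corresponding mapping.

```java
import java.util.Date;
import java.util.List;

import org.hibernate.Criteria;
import org.hibernate.Session;
import org.hibernate.criterion.Order;
import org.hibernate.criterion.Restrictions;

public class LoanReportSearch {

    // Hypothetical entity; in a real project it would be mapped via hbm.xml
    // or annotations so Hibernate can query it.
    public static class Loan {
        private Long id;
        private String branch;
        private String status;
        private Date dueDate;
        // getters/setters omitted for brevity
    }

    // Criteria query: overdue loans for one branch, newest due date first.
    @SuppressWarnings("unchecked")
    public List<Loan> findOverdueLoans(Session session, String branch) {
        Criteria criteria = session.createCriteria(Loan.class);
        criteria.add(Restrictions.eq("branch", branch));     // exact match on branch
        criteria.add(Restrictions.eq("status", "OVERDUE"));  // only overdue loans
        criteria.addOrder(Order.desc("dueDate"));            // sort by due date
        return criteria.list();
    }
}
```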
Environment: Java, Servlets, JSP, Hibernate, JUnit, Oracle DB, SQL, Jasper Reports, iReport, Maven, Jenkins.