
Hadoop Developer Resume


GA

PROFESSIONAL SUMMARY:

  • Overall 8 years of IT experience in analysis, design, and development using Hadoop, Java, and J2EE.
  • 3+ years' experience in Big Data technologies and Hadoop ecosystem projects such as MapReduce, YARN, HDFS, Apache Cassandra, Spark, NoSQL, HBase, Oozie, Hive, Tableau, Sqoop, Pig, Storm, Kafka, HCatalog, ZooKeeper, and Flume.
  • Worked with the Big Data distributions Cloudera CDH5, CDH4, and CDH3, and Hortonworks 2.5.
  • Used Ambari to configure the initial development environment on the Hortonworks standalone sandbox and to monitor the Hadoop ecosystem.
  • Excellent understanding and knowledge of Hadoop architecture and components such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node, and the MapReduce programming paradigm.
  • Knowledge of Data Analytics and Business Analytics processes.
  • Hands-on experience with Spark Streaming to receive real-time data using Kafka.
  • Created Spark SQL queries for faster requests.
  • Experience in ingesting streaming data into Hadoop using Spark, the Storm framework, and Scala.
  • Experience in analyzing data using HiveQL, Pig Latin, HBase and custom Map Reduce programs in Java.
  • Experienced in performance tuning of ETL processes.
  • Implemented various ETL solutions per business requirements using Informatica.
  • Experienced with test frameworks for Hadoop using MRUnit.
  • Performed data analytics using Pig, Hive, and R for the data scientists within the team.
  • Worked extensively on the data visualization tool Tableau and graph databases like Neo4j.
  • Worked on a 32+ node Apache/Cloudera 5.9.2 Hadoop cluster for the PROD environment; used tools like Sqoop and Flume for data ingestion from different sources into the Hadoop system, and Hive/Spark SQL to generate reports for analysis.
  • Experience in managing and reviewing Hadoop log files.
  • Responsible for smooth, error-free configuration of the DWH-ETL solution and its integration with Hadoop.
  • Extended Hive and Pig core functionality with custom User Defined Functions (UDFs), User Defined Table-Generating Functions (UDTFs), and User Defined Aggregating Functions (UDAFs); a minimal UDF sketch follows this list.
  • Expertise in developing enterprise applications based on J2EE Technologies like JDBC, Servlets, JSP, Struts, Stripes, EJB, Spring, Hibernate.
  • Good understanding of RDBMS gained through database design and writing queries against databases like Oracle, SQL Server, DB2, and MySQL.
  • Worked extensively with Dimensional modeling, Data migration, Data cleansing, Data profiling, and ETL Processes features for data warehouses.
  • A team player and self-motivated professional with excellent analytical, communication, problem-solving, decision-making, and organizational skills.
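
As a minimal sketch of the Hive UDF work referenced above (the class name and normalization use case are illustrative assumptions, not taken from an actual project):

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Simple Hive UDF that trims and lower-cases a string column.
    // Registered in Hive with, e.g.:
    //   ADD JAR udfs.jar;
    //   CREATE TEMPORARY FUNCTION normalize AS 'NormalizeUDF';
    public class NormalizeUDF extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null; // preserve SQL NULL semantics
            }
            return new Text(input.toString().trim().toLowerCase());
        }
    }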

TECHNICAL SKILLS:

Hadoop/Big Data: HDFS, MapReduce, HBase, Pig, Hive, Sqoop, Flume, Cassandra, Impala, Oozie, ZooKeeper, MapR, Amazon Web Services, EMR, MRUnit, Spark, Storm, R, RStudio.

Java & J2EE Technologies: Core Java, JDBC, Servlets, JSP, JNDI, Struts, Spring, Hibernate and Web Services (SOAP and Restful)

IDEs: Eclipse, MyEclipse, IntelliJ

Frameworks: MVC, Struts, Hibernate, Spring

Programming Languages: C, C++, Java, Python, Linux shell scripts, R

Databases: Oracle 11g/10g/9i, MySQL, DB2, MS-SQL Server, MongoDB, Graph DB

Web Servers: WebLogic, WebSphere, Apache Tomcat

Web Technologies: HTML, XML, JavaScript, AJAX, Restful WS

Network Protocols: TCP/IP, UDP, HTTP, DNS, DHCP

ETL Tools: Informatica, QlikView and Cognos

PROFESSIONAL EXPERIENCE:

Confidential, GA

Hadoop Developer

Responsibilities:

  • Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke and run MapReduce jobs in the backend.
  • Wrote MapReduce jobs to parse the web logs stored in HDFS.
  • Importing and exporting data into HDFS and HIVE using Sqoop.
  • Developed Hive queries for analysis, to categorize different items. Worked on Big Data integration and analytics based on Hadoop, SOLR, Spark, Kafka, Storm, and webMethods technologies.
  • Created Hive queries to compare the raw data with EDW tables and performing aggregates.
  • Good experience with all flavors of Hadoop (Cloudera, Hortonworks, MapR).
  • Involved in working with Impala for data retrieval process.
  • Experience in partitioning Big Data according to business requirements using Hive indexing, partitioning, and bucketing.
  • Responsible for the design and development of Spark SQL scripts based on functional specifications.
  • Responsible for Spark Streaming configuration based on the type of input source.
  • Developed services to run MapReduce jobs on an as-needed basis.
  • Responsible for loading data from UNIX file systems to HDFS; installed and configured Hive and wrote Pig/Hive UDFs.
  • Responsible for managing data coming from different sources.
  • Developed business logic using Scala.
  • Wrote MapReduce programs to convert text files into Avro and load them into Hive tables.
  • Implemented the workflows using Apache Oozie framework to automate tasks.
  • Worked with NoSQL databases like HBase in creating HBase tables to load large sets of semi structured data coming from various sources.
  • Developed design documents considering all possible approaches and identifying the best of them.
  • Used Syncsort to perform sorting, merging, and copying functions, and processed files using different Syncsort control statements.
  • Managed and guided the deposit team during their move to Hadoop using the Syncsort tool; the deposit application was re-written from mainframe ETL to Hadoop to load into W (Teradata warehouse), using DMX-h Syncsort for ETL.
  • Involved in EDW mappings, sessions, and workflows.
  • Loaded data into HBase using bulk load and non-bulk load.
  • Developed scripts and automated data management end to end, including sync-up between all the clusters.
  • Imported data from different sources like HDFS/HBase into Spark RDDs.
  • Worked with real-time streaming applications using tools like Spark Streaming, Storm, and Kafka.
  • Worked on monitoring and troubleshooting the Kafka-Storm-HDFS data pipeline for real-time data ingestion into the data lake in HDFS.
  • Explored Spark for improving the performance and optimization of existing algorithms in Hadoop.
  • Experienced with SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Scala, and Python; a minimal sketch appears after this list.
  • Involved in gathering the requirements, designing, development and testing.
  • Developed traits, case classes, etc. in Scala.
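
As a minimal sketch of converting a Hive-style aggregation into a Spark RDD transformation, written against the Spark Java API (the input path and field layout are illustrative assumptions):

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import scala.Tuple2;

    public class CategoryCounts {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("CategoryCounts");
            JavaSparkContext sc = new JavaSparkContext(conf);

            // RDD equivalent of: SELECT category, COUNT(*) FROM items GROUP BY category
            JavaRDD<String> lines = sc.textFile("hdfs:///data/items"); // hypothetical path
            JavaPairRDD<String, Long> counts = lines
                .mapToPair(line -> new Tuple2<>(line.split("\t")[1], 1L)) // assumes category in field 2
                .reduceByKey(Long::sum);

            counts.saveAsTextFile("hdfs:///data/item_counts");
            sc.stop();
        }
    }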

Environment: HDFS, MapReduce, Hive, Spark, Flume, Hortonworks, Ambari, Pig, HBase, Oozie, Sqoop, Java, Maven, Scala, Impala, Python, AngularJS, Splunk, Oracle, Syncsort, YARN, GitHub, JUnit, Tableau, Unix, Cloudera, Tomcat.

Confidential, CA

Hadoop Developer

Responsibilities:

  • Involved in all phases of the Software Development Life Cycle (SDLC) and worked on all activities related to the development, implementation, administration, and support of Hadoop.
  • Installed and configured Apache Hadoop clusters for application development, along with Hadoop tools like Hive, Pig, HBase, Zookeeper, and Sqoop.
  • Implemented multiple MapReduce jobs in Java for data cleansing and pre-processing; a minimal sketch appears after this list.
  • Worked with the team to grow the cluster from 28 nodes to 42 nodes; the configuration of the additional data nodes was done through the Hadoop commissioning process.
  • Involved in creating a Spark cluster in HDInsight by creating Azure compute resources with Spark installed and configured.
  • Involved in implementing an HDInsight 3.3 cluster, which is based on Spark 1.5.1.
  • Good knowledge of the components used in the cluster, such as Spark Core, Spark SQL, and the Spark Streaming APIs.
  • Managed and scheduled jobs on a Hadoop cluster.
  • Responsible for Cluster maintenance, adding and removing cluster nodes, Cluster Monitoring and Troubleshooting, manage and review data backups and log files.
  • Worked with systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
  • Experience in converting MapReduce applications to Spark.
  • Involved in defining job flows, managing and reviewing log files.
  • Installed the Oozie workflow engine to run multiple MapReduce, HiveQL, and Pig jobs.
  • Collected the log data from web servers and integrated into HDFS using Flume.
  • Cassandra development: set up, configured, and optimized the Cassandra cluster; developed a real-time Java-based application to work along with the Cassandra database.
  • Involved in HDFS maintenance and administering it through the Hadoop Java API.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
  • Constructed system components and developed the server-side part using Java, EJB, and the Spring Framework; involved in designing the data model for the system.
  • Used J2EE design patterns like DAO, Model, Service Locator, MVC, and Business Delegate.
  • Defined Interface Mapping between JDBC Layer and Oracle Stored Procedures.
  • Experience in managing and reviewing Hadoop log files.
  • Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
  • Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop; worked on tuning the performance of Pig queries.
  • Implemented a script to transmit sysprin information from Oracle to HBase using Sqoop.
  • Implemented best income logic using Pig scripts and UDFs.
  • Performed component unit testing using the Azure emulator.
  • Analyzed escalated incidents within the Azure SQL database.
  • Implemented test scripts to support test driven development and continuous integration.
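
As a minimal sketch of the kind of Java MapReduce cleansing job described above (a map-only pass; the delimiter and record width are illustrative assumptions):

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Map-only cleansing step: drops malformed records, trims each field,
    // and counts the records it rejects.
    public class CleanseMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {
        private static final int EXPECTED_FIELDS = 5; // assumed record width

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",", -1);
            if (fields.length != EXPECTED_FIELDS) {
                context.getCounter("cleanse", "malformed").increment(1);
                return; // skip malformed rows
            }
            for (int i = 0; i < fields.length; i++) {
                fields[i] = fields[i].trim();
            }
            context.write(NullWritable.get(), new Text(String.join(",", fields)));
        }
    }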

Environment: Hadoop, MapReduce, Spark, Shark, Kafka, HDFS, ZooKeeper, Hive, Pig, Oozie, Core Java, Eclipse, HBase, Sqoop, Flume, Oracle 11g, Cassandra, SQL, SharePoint, Azure 2015, UNIX Shell Scripting.

Confidential, CA

Hadoop Developer

Responsibilities:

  • Evaluated the suitability of Hadoop and its ecosystem for the above project and implemented various proof-of-concept (POC) applications to eventually adopt them and benefit from the Big Data Hadoop initiative.
  • Estimated software and hardware requirements for the Name Node and Data Nodes and planned the cluster.
  • Extracted the needed data from the server into HDFS and Bulk Loaded the cleaned data into HBase.
  • Wrote MapReduce programs and Hive UDFs in Java where the functionality was too complex.
  • Involved in loading data from the Linux file system to HDFS.
  • Developed Hive queries for analysis, to categorize different items.
  • Designed and created Hive external tables using a shared metastore instead of Derby, with partitioning, dynamic partitioning, and buckets.
  • Gave a POC of Flume to handle the real-time log processing for attribution reports.
  • Performed sentiment analysis on reviews of the products on the client's website.
  • Exported the resulting sentiment analysis data to Tableau for creating dashboards.
  • Used MRUnit (JUnit-based) for unit testing of MapReduce jobs; a minimal test sketch appears after this list.
  • Maintained System integrity of all sub-components (primarily HDFS, MR, HBase, and Hive).
  • Reviewed peer table creation in Hive, data loading, and queries.
  • Monitored system health and logs and responded accordingly to any warning or failure conditions.
  • Responsible for managing the test data coming from different sources.
  • Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Held weekly meetings with technical collaborators and actively participated in code review sessions with senior and junior developers.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
  • Involved in unit testing, interface testing, system testing, and user acceptance testing of the workflow tool.
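
As a minimal sketch of the MRUnit testing referenced above (the mapper under test and its record layout are illustrative assumptions):

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mrunit.mapreduce.MapDriver;
    import org.junit.Before;
    import org.junit.Test;

    // MRUnit test for a small mapper that emits (product, score) from a CSV review line.
    public class ScoreMapperTest {

        // Hypothetical mapper under test, inlined to keep the sketch self-contained.
        public static class ScoreMapper extends Mapper<LongWritable, Text, Text, Text> {
            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split(",");
                context.write(new Text(fields[0]), new Text(fields[2]));
            }
        }

        private MapDriver<LongWritable, Text, Text, Text> mapDriver;

        @Before
        public void setUp() {
            mapDriver = MapDriver.newMapDriver(new ScoreMapper());
        }

        @Test
        public void emitsProductAndScore() throws IOException {
            mapDriver
                .withInput(new LongWritable(0), new Text("prod42,great phone,5"))
                .withOutput(new Text("prod42"), new Text("5"))
                .runTest();
        }
    }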

Environment: Apache Hadoop, HDFS, Hive, MapReduce, Java, Flume, Cloudera, Oozie, MySQL, Linux, Core Java.

Confidential

Java developer

Responsibilities:

  • Involved in design, development, testing, and implementation of the process systems; worked on iterative life-cycle business requirements and created the Detail Design Document.
  • Developed various helper classes needed using Multithreading.
  • Used Agile methodologies to plan work for every iteration and used a continuous integration tool to make sure the build passed before deploying the code to other environments.
  • Worked on JavaScript libraries like jQuery and JSON.
  • Designed and developed web-based software using the Spring MVC Framework and Spring Core; a minimal controller sketch appears after this list.
  • Worked with the Java Collections API for handling data objects between the business layers and the front end.
  • Performed UNIX administration (L1) activities and worked closely with application support teams in deploying jobs in production server.
  • Implemented Controller Classes and Server side validations for account activity, payment history and transactions.
  • Implemented session beans to handle business logic for fund transfers.
  • Used Spring ORM module to integrate with Hibernate.
  • Designed and developed Web Services to provide services to the various clients using Restful.
  • Designed the user interface of the application using EXT JS, HTML5, CSS3, JSF 2.1, JavaScript and AJAX.
  • Extensive experience with modern frontend templating frameworks for JavaScript, including AngularJS and jQuery.
  • Implemented the Hibernate framework to connect to the database and map Java objects to database tables.
  • Used the WebLogic server for deploying the application.
  • Involved in writing the Maven build file to build and deploy the application.
  • Used Log4J to capture the logging information and JUnit to test the application classes.
  • Used ClearCase for source code maintenance.
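
As a minimal sketch of the Spring MVC controller work referenced above (the URL, view name, and class are illustrative assumptions tied to the account-activity feature):

    import org.springframework.stereotype.Controller;
    import org.springframework.ui.Model;
    import org.springframework.web.bind.annotation.PathVariable;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestMethod;

    // Controller for the account-activity page; server-side validation and the
    // service-layer call are stubbed out for brevity.
    @Controller
    public class AccountActivityController {

        @RequestMapping(value = "/accounts/{id}/activity", method = RequestMethod.GET)
        public String showActivity(@PathVariable("id") long accountId, Model model) {
            model.addAttribute("accountId", accountId); // a real app would fetch activity here
            return "accountActivity"; // logical view name resolved by a ViewResolver
        }
    }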

Environment: Java, JSP, JavaScript, JSTL, AJAX, XML, EXT JS, jQuery, AngularJS, Spring MVC Framework, Spring Tool Suite, Oracle 11g, Rational Rose, Log4j, JUnit, Maven, WebLogic, Web Services, SOAP, WSDL.

Confidential

Jr. Java Developer

Responsibilities:

  • Developed HTML and JSP pages for user interaction, presenting dynamically generated data on the client side by extensively using JSP tag libraries.
  • Involved in client requirements gathering and responsible for understanding and execution of requirements.
  • Responsible for implementing various J2EE design patterns like Service Locator, Business Delegate, Session Façade, and Factory.
  • Used Struts framework in UI designing and validations.
  • Responsible for designing Java applets using Swing and embedding them into web pages.
  • Involved in developing Action classes, which act as the controllers in the Struts framework; a minimal sketch appears after this list.
  • Used PL/SQL to design complex queries and retrieve data from database.
  • Involved in unit testing using JUnit test cases and test suites.
  • Involved in designing and deploying EJBs.
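
As a minimal sketch of a Struts Action class acting as a controller, as described above (the action name, forward, and stubbed lookup are illustrative assumptions):

    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import org.apache.struts.action.Action;
    import org.apache.struts.action.ActionForm;
    import org.apache.struts.action.ActionForward;
    import org.apache.struts.action.ActionMapping;

    // Struts controller for a hypothetical "view balance" request; a real
    // implementation would retrieve the balance through PL/SQL via a DAO.
    public class ViewBalanceAction extends Action {
        @Override
        public ActionForward execute(ActionMapping mapping, ActionForm form,
                                     HttpServletRequest request,
                                     HttpServletResponse response) throws Exception {
            request.setAttribute("balance", "0.00"); // stubbed value
            return mapping.findForward("success");   // forward defined in struts-config.xml
        }
    }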

Environment: Java, J2EE, Struts, Spring, Oracle, EJB, Eclipse, JSP, JSTL, JavaScript, PL/SQL, HTML, Swing, UML, XML, JAX-RS, JUnit and SVN.
