Hadoop Developer Resume
Richardson, TX
SUMMARY:
- 8 years of IT experience in software development, including 4+ years of experience in Big Data Hadoop and NoSQL technologies across domains such as Automobile, Finance, Insurance, Healthcare and Telecom.
- 4 years of experience in Hadoop working environments, covering MapReduce, HDFS, HBase, Zookeeper, Oozie, Hive, Sqoop, Pig, YARN, Cassandra, Kafka, Spark and Flume.
- Solid understanding of Hadoop Distributed File System.
- Good experience with MapReduce (MR), Hive, Pig, HBase, Sqoop, Oozie, Flume, Spark and Zookeeper for data extraction, processing, storage and analysis.
- In-depth understanding of how MapReduce works and of the Hadoop infrastructure.
- In-depth understanding of Hadoop architecture and its components such as JobTracker, TaskTracker, NameNode, DataNode and ResourceManager, and of MapReduce concepts.
- Experience in developing custom MapReduce programs in Java using Apache Hadoop for analyzing Big Data (a brief sketch follows this summary).
- Extensively worked on Hive for ETL transformations and optimized Hive queries.
- Experience in importing and exporting data using Sqoop from relational database systems to HDFS and vice versa.
- Successfully loaded files to Hive and HDFS from MongoDB, Cassandra and HBase.
- Extended Hive and Pig core functionality using custom User Defined Functions (UDFs), User Defined Table-Generating Functions (UDTFs) and User Defined Aggregating Functions (UDAFs).
- Experience in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
- Developed Pig Latin scripts for data cleansing and Transformation.
- Used Flume to channel data from different sources to HDFS.
- Job workflow scheduling and monitoring using tools like Oozie.
- Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX and NoSQL sources.
- Good experience in Cloudera, Hortonworks & Apache Hadoop distributions.
- Worked with relational database systems (RDBMS) such as MySQL, MSSQL, Oracle and NoSQL database systems like HBase and Cassandra.
- Installed and configured MapReduce, HIVE and the HDFS; implemented CDH3 Hadoop cluster on CentOS. Assisted with performance tuning and monitoring.
- Good knowledge of Hadoop cluster architecture and cluster monitoring.
- Used shell scripting to move log files into HDFS.
- Good understanding of real-time data processing using Spark.
- Imported data from different sources like HDFS/HBase into Spark RDDs.
- Experienced with different file formats like CSV, text files, sequence files, XML, JSON and Avro files.
- Good knowledge of data modeling and data mining to model the data as per business requirements.
- Involved in unit testing of MapReduce programs using Apache MRUnit.
- Good knowledge of Python and Bash scripting languages.
- Developed simple to complex MapReduce streaming jobs using Python, integrated with Hive and Pig.
- Designed and implemented Hive and Pig UDFs using Python for evaluating, filtering, loading and storing data.
- Streamed real-time data using Spark with Kafka and stored the streamed data to HDFS using Scala.
- Involved in creating database objects like tables, views, procedures, triggers and functions using T-SQL to provide definition and structure and to maintain data efficiently.
- Expert in data visualization development using Tableau to create complex and innovative dashboards.
- Generated ETL reports using Tableau and created statistics dashboards for Analytics.
- Reported bugs by classifying them and played a major role in carrying out different types of tests, viz. Smoke, Functional, Integration, System, Data Comparison and Regression testing.
- Experience in creating Master Test Plans, Test Cases, Test Result Reports and Requirements Traceability Matrices, and in creating status reports and submitting them to project management.
- Strong hands-on experience with MVC frameworks and Spring MVC.
- Skilled in designing and developing data access layer modules for new functionality using the Hibernate framework.
- Extensive experience working with IDEs like Eclipse, NetBeans and EditPlus.
- Working knowledge of Agile and waterfall development models.
- Working experience in all SDLC Phases.
- Extensively used Java and J2EE technologies like Core Java, Java Beans, Servlets, JSP, Spring, Hibernate, JDBC, JSON objects and design patterns.
- Experienced in Application Development using Java, J2EE, JSP, Servlets, RDBMS, Tag Libraries, JDBC, Hibernate, XML and Linux shell scripting.
- Worked with software version control, bug tracking and code review systems such as CVS, ClearCase and Jira.
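For context on the custom Java MapReduce development noted in this summary, the following is a minimal, illustrative sketch (not taken from any specific engagement): a mapper/reducer pair that counts occurrences of the first column of a tab-delimited log file. The class names, delimiter and field layout are assumptions made for the example.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/** Counts occurrences of the first (key) column in a tab-delimited log file. */
public class EventCountJob {

    public static class EventMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        private static final LongWritable ONE = new LongWritable(1);
        private final Text eventKey = new Text();

        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            String[] fields = line.toString().split("\t");
            if (fields.length > 0 && !fields[0].isEmpty()) {
                eventKey.set(fields[0]);          // e.g. an event type or customer id
                context.write(eventKey, ONE);
            }
        }
    }

    public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text key, Iterable<LongWritable> counts, Context context)
                throws IOException, InterruptedException {
            long total = 0;
            for (LongWritable count : counts) {
                total += count.get();
            }
            context.write(key, new LongWritable(total));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "event-count");
        job.setJarByClass(EventCountJob.class);
        job.setMapperClass(EventMapper.class);
        job.setCombinerClass(SumReducer.class);   // a combiner is safe here because the reduce is a pure sum
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```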
TECHNICAL SKILLS:
Big data/Hadoop Ecosystem: HDFS, Map Reduce, HIVE, PIG, HBase, Sqoop, Flume, Oozie, Storm and Avro
Java / J2EE Technologies: Core Java, Servlets, JSP, JDBC, XML, REST, SOAP, WSDL
Programming Languages: C, C++, Java, Scala, SQL, PL/SQL, Linux shell scripts.
NoSQL Databases: MongoDB, Cassandra, HBase
Database: Oracle 11g/10g, DB2, MS-SQL Server, MySQL, Teradata.
Web Technologies: HTML, XML, JDBC, JSP, JavaScript, AJAX, SOAP
Frameworks: MVC, Struts 2/1, Hibernate 3, Spring 3/2.5/2.
Tools Used: Eclipse, IntelliJ, Git, PuTTY, WinSCP
Operating System: Ubuntu (Linux), Win 95/98/2000/XP, Mac OS, RedHat
ETL Tools: Informatica, Pentaho.
Testing: Hadoop Testing, Hive Testing, Quality Center (QC)
Monitoring and Reporting tools: Ganglia, Nagios, Custom Shell scripts.
PROFESSIONAL EXPERIENCE:
Confidential, Richardson, TX
Hadoop Developer
Responsibilities:
- Imported data from different relational data sources like RDBMS and Teradata to HDFS using Sqoop.
- Imported bulk data into HBase using MapReduce programs.
- Performed analytics on time series data stored in HBase using the HBase API.
- Designed and implemented incremental imports into Hive tables.
- Used the REST API to access HBase data and perform analytics.
- Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
- Involved in converting MapReduce programs into Spark transformations using Spark RDDs in Scala.
- Experienced with batch processing of data sources using Apache Spark and Elasticsearch.
- Worked on loading and transforming large sets of structured, semi-structured and unstructured data.
- Worked with cloud services like Amazon Web Services (AWS).
- Involved in collecting, aggregating and moving data from servers to HDFS using Apache Flume.
- Wrote Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying of the log data.
- Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
- Developed Java RESTful web services to upload data from local storage to Amazon S3, list S3 objects and perform file manipulation operations.
- Configured a 20-30 node Hadoop cluster (Amazon EC2 spot instances) to transfer data from Amazon S3 to HDFS and from HDFS to Amazon S3, and also to direct input and output to the Hadoop MapReduce framework.
- Imported data from different sources like HDFS/HBase into Spark RDDs; developed a data pipeline using Kafka and Storm to store data into HDFS and performed real-time analysis on the incoming data (a Spark Streaming sketch follows this list).
- Prepared scripts to ensure proper data access, manipulation and reporting functions with the R programming language.
- Formulated procedures for integration of R programming plans with data sources and delivery systems.
- Developed a scalable queuing system to accommodate the ever-growing message flows across the systems using Amazon Simple Queue Service (SQS) and Akka actor models.
- Wrote internal and external API services using Node.js modules.
- Experience with Primavera integration tools such as Primavera Gateway, as well as custom integration of Primavera with ERP applications.
- Developed simple to complex MapReduce streaming jobs using Python, integrated with Hive and Pig.
- Experienced in managing and reviewing the Hadoop log files.
- Successfully ran all Hadoop MapReduce programs on Amazon Elastic MapReduce framework by using Amazon S3 for input and output.
- Involved in a data asset inventory to gather, analyze and document business requirements, functional requirements and data specifications for Member Retention from SQL/Hadoop sources.
- Involved in automating clickstream data collection and storage into HDFS using Flume.
- Worked on resolving performance issues and query limits in workbooks connected to live databases by using the data extract option in Tableau.
- Designed and developed Dashboards for Analytical purposes using Tableau.
- Migrated ETL jobs to Pig scripts to perform transformations, joins and some pre-aggregations before storing the data onto HDFS.
- Worked with the Avro data serialization system to handle JSON data formats.
- Worked on different file formats like sequence files, XML files and map files using MapReduce programs.
- Involved in unit testing and delivered unit test plans and results documents using JUnit and MRUnit.
- Exported data from the HDFS environment into RDBMS using Sqoop for report generation and visualization purposes.
- Worked on Oozie workflow engine for job scheduling.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Pig scripts.
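The Kafka-to-HDFS pipeline work above can be illustrated with a minimal Java sketch in the style of the Spark 1.4-era spark-streaming-kafka (0.8) direct API listed in this environment; the broker address, topic name, batch interval and output path are placeholder assumptions, not project specifics.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

import kafka.serializer.StringDecoder;

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

import scala.Tuple2;

/** Reads click events from a Kafka topic and writes each micro-batch to HDFS as text files. */
public class ClickstreamToHdfs {

    public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("clickstream-to-hdfs");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(30));

        Map<String, String> kafkaParams = new HashMap<String, String>();
        kafkaParams.put("metadata.broker.list", "broker1:9092");              // placeholder broker
        Set<String> topics = new HashSet<String>(Arrays.asList("clickstream")); // placeholder topic

        // Direct (receiver-less) stream of (key, value) pairs from Kafka
        JavaPairInputDStream<String, String> kafkaStream = KafkaUtils.createDirectStream(
                jssc, String.class, String.class,
                StringDecoder.class, StringDecoder.class,
                kafkaParams, topics);

        // Keep only the message payloads and persist each batch under an HDFS prefix
        JavaDStream<String> events = kafkaStream.map(
                (Tuple2<String, String> record) -> record._2());
        events.dstream().saveAsTextFiles("hdfs://nameservice1/data/clickstream/batch", "txt");

        jssc.start();
        jssc.awaitTermination();
    }
}
```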
Environment: CDH 5.3, MapReduce, Hive 0.14, Spark 1.4.1, Oozie, Sqoop, Pig 0.11, Java, REST API, Maven, MRUnit, JUnit, Tableau, Cloudera, Python.
Confidential, San Francisco, CA
Hadoop Developer
Responsibilities:
- Involved in automating clickstream data collection and storage into HDFS using Flume.
- Involved in creating a data lake by extracting customer data from various data sources into HDFS.
- Used Sqoop to load data from an Oracle database into HDFS.
- Developed MapReduce programs to cleanse the data in HDFS obtained from multiple data sources.
- Employed an Oracle database to create and maintain a data mart.
- Contributed to creating ETL designs and provided end users easy access to data marts.
- Involved in creating Hive tables as per requirements, defined with appropriate static and dynamic partitions.
- Used Hive to analyze the data in HDFS and identify issues and behavioral patterns.
- Involved in production Hadoop cluster setup, administration, maintenance, monitoring and support.
- Streamed real-time data using Spark with Kafka and stored the streamed data to HDFS using Scala.
- Developed simple to complex MapReduce streaming jobs using Python, integrated with Hive and Pig.
- Involved in automating clickstream data collection and storage into HDFS using Flume.
- Experience with Primavera integration tools such as Primavera Gateway, as well as custom integration of Primavera with ERP applications.
- Developed a scalable queuing system to accommodate the ever-growing message flows across the systems using Amazon Simple Queue Service (SQS) and Akka actor models.
- Worked with cloud services like Amazon Web Services (AWS).
- Wrote internal and external API services using Node.js modules.
- Logical implementation and interaction with HBase.
- Cluster coordination services through Zookeeper.
- Efficiently put and fetched data to/from HBase by writing MapReduce jobs (a brief HBase client sketch follows this list).
- Developed MapReduce jobs to automate the transfer of data to and from HBase.
- Created data queries and reports using QlikView and Excel; created custom queries/reports designed for qualifying verification and information sharing.
- Assisted with the addition of Hadoop processing to the IT infrastructure.
- Used Flume to collect the entire web log from the online ad servers and push it into HDFS.
- Implemented and executed MapReduce jobs to process the log data from the ad servers.
- Extensively used Core Java, Servlets, JSP and XML
- Load and transform large sets of structured, semi structured and unstructured data.
- Back-end Java developer for the Data Management Platform (DMP), building RESTful APIs to create dashboards and to let other groups build dashboards.
- Responsible for building scalable distributed data solutions using Hortonworks.
- Integrated HiveServer2 with Tableau using the Hortonworks Hive ODBC driver for auto-generation of Hive queries for non-technical business users.
- Worked closely with architects and clients to define and prioritize use cases and develop APIs.
- Involved in monitoring job performance, capacity planning and workload using Cloudera Manager.
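As a minimal illustration of the HBase put/fetch work above, the sketch below uses the HBase 1.x-style Java client API; the table name, column family, row-key layout and values are placeholder assumptions.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

/** Writes one click event to HBase and reads it back. */
public class HBaseClickStore {

    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();   // picks up hbase-site.xml from the classpath
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("clicks"))) {   // placeholder table

            // Put: a row key of userId plus a timestamp suffix is a common time-series pattern
            byte[] rowKey = Bytes.toBytes("user123#20150601120000");
            Put put = new Put(rowKey);
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("url"), Bytes.toBytes("/home"));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("status"), Bytes.toBytes("200"));
            table.put(put);

            // Get the row back and read a single column
            Result result = table.get(new Get(rowKey));
            String url = Bytes.toString(result.getValue(Bytes.toBytes("d"), Bytes.toBytes("url")));
            System.out.println("url = " + url);
        }
    }
}
```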
Environment: Hadoop, Pig 0.10, Sqoop, Oozie, MapReduce, HDFS, HBase, Hive 0.10, Core Java, Eclipse, QlikView, Flume, Cloudera, Hortonworks, Oracle 10g, UNIX Shell Scripting, Cassandra.
Confidential, Penfield, NY
Hadoop Developer.
Responsibilities:
- Worked on Hadoop cluster (CDH 5) with 30 nodes.
- Worked with highly semi-structured and structured data of 90 TB with a replication factor of 3.
- Extracted data from Oracle, MySQL and SQL Server databases into HDFS using Sqoop.
- Extracted data from weblogs and social media using Flume and loaded it into HDFS.
- Created jobs in Sqoop with incremental load and populated Hive tables.
- Developed software to process, cleanse and report on vehicle data utilizing analytics and REST APIs with Java, Scala and Akka (an asynchronous programming framework).
- Involved in importing real-time data into Hadoop using Kafka and implemented Oozie jobs for daily imports (a brief Kafka producer sketch follows this list).
- Involved in automating clickstream data collection and storage into HDFS using Flume.
- Wrote internal and external API services using Node.js modules.
- Experience with Primavera integration tools such as Primavera Gateway, as well as custom integration of Primavera with ERP applications.
- Worked with cloud services like Amazon Web Services (AWS).
- Involved in developing the Asset Tracking project, collecting real-time vehicle location data from a JMS queue using IBM Streams and processing that data for vehicle tracking using ESRI GIS mapping software, Scala and the Akka actor model.
- Involved in developing web services using REST, the HBase native API and the BigSQL client to query data from HBase.
- Experienced in Developing Hive queries in BigSQL Client for various use cases.
- Involved in developing shell scripts and automating them using the cron job scheduler.
- Implemented test scripts to support test-driven development and continuous integration.
- Responsible for managing data coming from different sources.
- Experienced in loading and transforming large sets of structured, semi-structured and unstructured data.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
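A minimal sketch of the kind of Kafka-based real-time ingestion referred to above, using the standard Kafka Java producer client; the broker address, topic name and JSON payload are placeholder assumptions.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

/** Publishes vehicle location events to a Kafka topic for downstream Hadoop ingestion. */
public class VehicleEventProducer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");   // placeholder broker
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("acks", "all");                          // wait for full acknowledgement

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            String payload = "{\"vehicleId\":\"V-42\",\"lat\":35.15,\"lon\":-90.05}";   // illustrative event
            producer.send(new ProducerRecord<>("vehicle-locations", "V-42", payload));
            producer.flush();
        }
    }
}
```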
Environment: Hadoop 1x, Hive 0.10, Pig 0.11, Sqoop, HBase, UNIX Shell Scripting, Scala, Akka, IBM InfoSphere BigInsights, IBM InfoSphere Streams, IBM BigSQL, Java
Confidential, Memphis, TN
Hadoop Developer
Responsibilities:
- Worked with structured and semi-structured data of approximately 100 TB with a replication factor of 3.
- Involved in the complete implementation lifecycle; specialized in writing custom MapReduce, Pig and Hive programs.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Extensively used Hive/HQL queries to query or search for particular strings in Hive tables in HDFS.
- Performed various optimizations such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins.
- Experience in developing customized UDFs in Java to extend Hive and Pig Latin functionality (a brief Hive UDF sketch follows this list).
- Involved in automating clickstream data collection and storage into HDFS using Flume.
- Experience with Primavera integration tools such as Primavera Gateway, as well as custom integration of Primavera with ERP applications.
- Created HBase tables to store data in various formats coming from different portfolios.
- Managed and scheduled jobs to remove duplicate log data files in HDFS using Oozie.
- Used Flume extensively in gathering and moving log data files from application servers to a central location in the Hadoop Distributed File System (HDFS).
- Experienced with Solr for indexing and search.
- Experienced in analyzing the Cassandra database and comparing it with other open-source NoSQL databases to find which one better suited the current requirements.
- Used the file system check (fsck) to check the health of files in HDFS.
- Developed the UNIX shell scripts for creating the reports from Hive data.
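The custom Hive UDF work above can be illustrated with a minimal Java sketch based on Hive's simple UDF API; the function name and masking behavior are illustrative assumptions rather than a specific production function.

```java
import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

/** Masks all but the last four characters of a value, e.g. an account number. */
@Description(name = "mask_value",
        value = "_FUNC_(str) - replaces all but the last 4 characters of str with '*'")
public class MaskValueUDF extends UDF {

    public Text evaluate(Text input) {
        if (input == null) {
            return null;                       // Hive passes NULLs through
        }
        String value = input.toString();
        if (value.length() <= 4) {
            return new Text(value);
        }
        StringBuilder masked = new StringBuilder();
        for (int i = 0; i < value.length() - 4; i++) {
            masked.append('*');
        }
        masked.append(value.substring(value.length() - 4));
        return new Text(masked.toString());
    }
}
```

Once packaged in a JAR, such a function would typically be registered in HiveQL with ADD JAR and CREATE TEMPORARY FUNCTION before use in queries.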
Environment: Java, UNIX, HDFS, Pig, Hive, Spark, Scala, MapReduce, Flume, Sqoop, Kafka, HBase, Cassandra, Cloudera Distribution, Oozie, Ambari, Ganglia, Yarn, Shell scripting
Confidential - Plano, TX
Java/J2EE/Hadoop Developer
Responsibilities:
- Participated in requirement gathering and converting the requirements into technical specifications.
- Created UML diagrams like use cases, class diagrams, interaction diagrams, and activity diagrams.
- Developed the application using Struts Framework that leverages classical Model View Controller ( MVC ) architecture.
- Extensively worked on User Interface for few modules using JSPs, JavaScript and Ajax.
- Created business logic using Servlets and POJOs and deployed them on the WebLogic server.
- Wrote complex SQL queries and stored procedures.
- Developed the XML Schema and Web services for the data maintenance and structures.
- Implemented the Web Service client for the login authentication, credit reports and applicant information using Apache Axis 2 Web Service.
- Responsible for managing data coming from different sources.
- Developed MapReduce algorithms.
- Gained good experience with NoSQL databases.
- Involved in loading data from the UNIX file system to HDFS (a brief sketch follows this list).
- Installed and configured Hive and wrote Hive UDFs.
- Integrated Hadoop with Solr and implemented search algorithms.
- Designed the logical and physical data model, generated DDL scripts, and wrote DML scripts for Oracle 10g database.
- Used Hibernate ORM framework with Spring framework for data persistence and transaction management.
- Used the Struts validation framework for form-level validation.
- Wrote test cases in JUnit for unit testing of classes.
- Involved in creating templates and screens in HTML and JavaScript.
- Involved in integrating Web Services using SOAP.
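Loading data from the UNIX file system into HDFS, as mentioned above, is commonly done with the hadoop fs -put command or programmatically; below is a minimal Java sketch using the Hadoop FileSystem API, with placeholder local and HDFS paths.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/** Copies a local log file into HDFS using the Hadoop FileSystem API. */
public class LocalToHdfsLoader {

    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();   // reads core-site.xml / hdfs-site.xml from the classpath
        FileSystem hdfs = FileSystem.get(conf);

        Path localFile = new Path("/var/log/app/app.log");   // placeholder local path
        Path hdfsDir = new Path("/data/raw/applogs/");        // placeholder HDFS target directory

        if (!hdfs.exists(hdfsDir)) {
            hdfs.mkdirs(hdfsDir);
        }
        // copyFromLocalFile(delSrc = false, overwrite = true, src, dst)
        hdfs.copyFromLocalFile(false, true, localFile, hdfsDir);
        hdfs.close();
    }
}
```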
Environment: Hive 0.7.1, Apache Solr 3.x, HBase 0.90.x/0.20.x, JDK 1.5, Struts 1.3, WebSphere 6.1, HTML, XML, JavaScript, JUnit 3.8, Oracle 10g, Amazon Web Services.
Confidential, McLean, VA
Java/J2EE Developer
Responsibilities:
- Responsible for gathering business and functional requirements for the development and support of in-house and vendor developed applications
- Gathered and analyzed information for developing, supporting, and modifying existing web applications based on prioritized business needs
- Played key role in design and development of new application using J2EE, Servlets, and Spring technologies/frameworks using Service Oriented Architecture (SOA)
- Wrote Action classes, Request Processor, Business Delegate, Business Objects, Service classes and JSP pages
- Played a key role in designing the presentation tier components by customizing the Spring framework components, which includes configuring web modules, request processors, error handling components, etc.
- Implemented the Web Services functionality in the application to allow external applications to access data
- Used Apache Axis as the Web Service framework for creating and deploying Web Service Clients using SOAP and WSDL
- Worked on Spring to develop different modules to assist the product in handling different requirements
- Developed validation using Spring's Validation interface and used Spring Core and MVC to develop the applications and access data
- Implemented Spring beans using IoC and transaction management features to handle the transactions and business logic
- Designed and developed different PL/SQL blocks and stored procedures in the DB2 database
- Involved in writing the DAO layer using Hibernate to access the database (a brief sketch follows this list)
- Involved in deploying and testing the application using WebSphere Application Server
- Developed and implemented several test cases using the JUnit framework
- Involved in troubleshooting technical issues, conducting code reviews and enforcing best practices
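A minimal sketch of a Hibernate-backed DAO layer of the kind described above, using a Spring-managed SessionFactory and declarative transactions; the Applicant entity, its mapping and the bean wiring are assumptions made for the example.

```java
import java.util.List;

import org.hibernate.SessionFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Repository;
import org.springframework.transaction.annotation.Transactional;

/** DAO for Applicant records; Applicant is assumed to be a Hibernate-mapped entity. */
@Repository
public class ApplicantDao {

    @Autowired
    private SessionFactory sessionFactory;   // configured in the Spring application context

    @Transactional
    public void save(Applicant applicant) {
        sessionFactory.getCurrentSession().saveOrUpdate(applicant);
    }

    @Transactional(readOnly = true)
    public Applicant findById(Long id) {
        return (Applicant) sessionFactory.getCurrentSession().get(Applicant.class, id);
    }

    @Transactional(readOnly = true)
    @SuppressWarnings("unchecked")
    public List<Applicant> findAll() {
        return sessionFactory.getCurrentSession()
                .createQuery("from Applicant")
                .list();
    }
}
```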
Environment: Java SE 6, J2EE 6, JSP 2.1, Servlets 2.5, JavaScript, IBM WebSphere 7, DB2, HTML, XML, Spring 3, Hibernate 3, JUnit, Windows 7, Eclipse 3.5
Confidential, Seattle, WA
Java/J2EE Developer
Responsibilities:
- Involved in various phases of the Software Development Life Cycle (SDLC) such as design, development and unit testing.
- Developed and deployed UI layer logic of sites using JSP, XML, JavaScript, HTML/DHTML and Ajax.
- CSS and JavaScript were used to build rich internet pages.
- The Agile Scrum methodology was followed for the development process.
- Designed different design specifications for application development that includes front-end, back-end using design patterns.
- Developed prototype test screens in HTML and JavaScript.
- Involved in developing JSPs for client data presentation and client-side data validation within the forms.
- Developed the application by using the Spring MVC framework.
- Used the Collections framework to transfer objects between the different layers of the application.
- Developed data mappings to create a communication bridge between various application interfaces using XML and XSL.
- Used Spring IoC to inject values for the dynamic parameters.
- Developed JUnit tests for unit-level testing.
- Actively involved in code review and bug fixing for improving the performance.
- Documented application for its functionality and its enhanced features.
- Created connections through JDBC and used JDBC statements to call stored procedures (a brief sketch follows this list).
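The JDBC stored-procedure call mentioned above can be sketched as follows; the JNDI data source name, procedure name and parameter types are placeholder assumptions.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Types;

import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

/** Calls a stored procedure through a container-managed JDBC DataSource. */
public class OrderStatusDao {

    public String fetchOrderStatus(long orderId) throws NamingException, SQLException {
        DataSource dataSource = (DataSource) new InitialContext().lookup("jdbc/AppDS"); // placeholder JNDI name

        try (Connection connection = dataSource.getConnection();
             CallableStatement call = connection.prepareCall("{call GET_ORDER_STATUS(?, ?)}")) {
            call.setLong(1, orderId);                     // IN parameter
            call.registerOutParameter(2, Types.VARCHAR);  // OUT parameter
            call.execute();
            return call.getString(2);
        }
    }
}
```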
Environment: Spring MVC, Oracle 11g, J2EE, Java, JDBC, Servlets, JSP, XML, Design Patterns, CSS, HTML, JavaScript 1.2, JUnit, Apache Tomcat, MS SQL Server 2008.
Confidential
Application Developer
Responsibilities:
- Developed the application under the JEE architecture; designed and developed dynamic, browser-compatible user interfaces using JSP, custom tags, HTML, CSS and JavaScript.
- Deployed and maintained the JSP and Servlet components on WebLogic 8.0.
- Developed the application server persistence layer using JDBC and SQL.
- Used JDBC to connect the web applications to databases.
- Implemented a test-first unit testing approach using JUnit.
- Developed and utilized J2EE services and JMS components for messaging communication in WebLogic (a brief sketch follows this list).
- Configured the development environment using the WebLogic application server for developers' integration testing.
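A minimal sketch of JMS-based messaging of the kind described above, using the classic javax.jms point-to-point API with JNDI lookups; the JNDI names and payload are placeholder assumptions.

```java
import javax.jms.Queue;
import javax.jms.QueueConnection;
import javax.jms.QueueConnectionFactory;
import javax.jms.QueueSender;
import javax.jms.QueueSession;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.naming.InitialContext;

/** Sends a text message to a JMS queue looked up from the application server's JNDI tree. */
public class OrderMessageSender {

    public void send(String payload) throws Exception {
        InitialContext ctx = new InitialContext();
        QueueConnectionFactory factory =
                (QueueConnectionFactory) ctx.lookup("jms/OrderConnectionFactory"); // placeholder JNDI name
        Queue queue = (Queue) ctx.lookup("jms/OrderQueue");                        // placeholder JNDI name

        QueueConnection connection = factory.createQueueConnection();
        try {
            QueueSession session = connection.createQueueSession(false, Session.AUTO_ACKNOWLEDGE);
            QueueSender sender = session.createSender(queue);
            TextMessage message = session.createTextMessage(payload);
            sender.send(message);
        } finally {
            connection.close();   // also closes the session and sender
        }
    }
}
```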
Environment: Java/J2EE, SQL, Oracle 10g, JSP 2.0, EJB, AJAX, JavaScript, WebLogic 8.0, HTML, JDBC 3.0, XML, JMS, log4j, JUnit, Servlets, MVC