Hadoop Developer Resume
LA, CA
SUMMARY:
- Hadoop Developer with over 6 years of IT experience in Big Data ecosystem, AWS, ETL, and RDBMS technologies, with domain experience in entertainment, banking, automobile, healthcare, retail, and non-profit organizations.
- Experience in the development, analysis, and design of ETL methodologies across all phases of the data warehousing life cycle.
- Experience with major Big Data components such as Spark, HDFS, Pig, Hive, Sqoop, Oozie, ZooKeeper, and HBase.
- Good knowledge of distributed systems, HDFS architecture, and the internal workings of the MapReduce and Spark processing frameworks.
- Experience in developing MapReduce jobs in Java for data cleaning, transformation, pre-processing, and analysis, implementing multiple mappers to handle data from multiple sources.
- Knowledge of Hadoop daemon functionality, resource utilization, and dynamic tuning to keep the cluster available and efficient.
- Wrote several Sqoop scripts to load data from different sources directly into HDFS and Hive tables.
- Experience in Hive partitioning and bucketing, performing different types of joins on Hive tables, and implementing Hive SerDes such as Regex, JSON, Parquet, and Avro.
- More than one year of hands-on experience using the Spark framework with Scala, with good exposure to performance-tuning Hive queries and MapReduce jobs in the Spark framework.
- Experience in data processing tasks such as collecting, aggregating, and moving data from various sources using Apache Flume and Kafka.
- Experienced in transferring data from different data sources into HDFS using Kafka producers, consumers, and brokers (a minimal producer sketch follows this summary).
- Automated Azure resource creation, querying, and deployment for an application POC using AAD (Azure Active Directory) authentication and the ARM (Azure Resource Manager) API.
- Designed and developed various analytical reports from multiple data sources by blending data on a single worksheet in Tableau Desktop.
- Utilized advanced Tableau features to link data from different connections on one dashboard and to filter data across multiple views at once.
- Worked extensively on the Java persistence layer during application migration to Cassandra, using Spark to load data to and from the Cassandra cluster.
- Experience integrating various data sources such as Oracle, SQL Server, and MS Access, as well as non-relational sources such as flat files, into a staging area.
- Worked with AWS EC2 and CloudWatch services, managed CI/CD pipelines through Jenkins, and automated manual tasks using shell scripting.
- Used stateless session beans to encapsulate business logic and developed web services for the modules to integrate the client's API.
- Developed core modules in large cross-platform applications using Java and J2EE, with experience in core Java concepts such as OOP, multithreading, collections, and I/O, and expertise in developing Spring MVC applications.
- Performed verification and validation testing to ensure that the developed functionality meets the specifications and that the specifications meet business needs.
- Imported data into Cassandra and processed it using PySpark and Scala.
- Experience in designing both time-driven and data-driven automated workflows using Oozie.
- Performed unit testing using the MRUnit and JUnit testing frameworks, with Log4j to monitor error logs.
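A minimal sketch of the kind of Kafka producer referenced above, in Java; the broker address, topic name, and sample records are illustrative placeholders rather than details of any specific project.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class CustomerEventProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");            // placeholder broker
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            // Each source record becomes one message on the ingestion topic.
            for (String line : new String[] {"rec1", "rec2"}) {     // stand-in for real source reads
                producer.send(new ProducerRecord<>("customer-events", line));
            }
            producer.flush();
        }
    }
}
```

A matching consumer (or a sink such as Flume) would then read the topic and land the messages in HDFS.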
TECHNICAL SKILLS:
Big Data Technologies: HDFS, YARN, MapReduce, Pig, Hive, HBase, ZooKeeper, Oozie, Sqoop, Talend, Impala, Flume, Kafka, Storm, and Spark
Cloud Services: Microsoft Azure, AWS (EC2, S3, CloudWatch)
NoSQL: HBase, Cassandra, MongoDB
Programming Languages: Java, Python and Scala
Frameworks: Hibernate, Struts, and Spring
Web Services & Servers: REST, SOAP, Tomcat, WebSphere
Client Technologies: jQuery, JavaScript, AJAX, HTML5
Operating Systems: UNIX, Windows, Linux (Ubuntu)
Web Technologies: JSP, Servlets, JavaScript, JavaBeans
Databases: Oracle 10g/11g, PostgreSQL 4.x/5.x
Development Tools: SQL Developer, Ant, Maven, Jenkins
PROFESSIONAL EXPERIENCE:
Confidential, LA, CA
Hadoop Developer
Responsibilities:
- Designed and deployed Hadoop clusters and various Big Data analytic tools, including Pig, Hive, HBase, Oozie, ZooKeeper, Sqoop, Flume, Kafka, Spark, Impala, and Cassandra, on Cloudera.
- Identified customer churn at an early stage by performing exploratory data analyses, generating and testing working hypotheses, preparing and analyzing historical data, and identifying patterns.
- Analyzed customer behaviour on set-top boxes to derive a real-time analytical view of each customer's set-top box usage.
- Implemented Snappy compression and the Parquet file format for staging and computational optimization.
- Developed Sqoop scripts to import and export data between RDBMS and HDFS/Hive, and handled incremental loading of customer and transaction data dynamically.
- Imported data from different sources such as HDFS and HBase into Spark RDDs.
- Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
- Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data; developed Kafka producers and consumers, HBase clients, and Spark and Hadoop MapReduce jobs, along with components on HDFS and Hive.
- Performed job functions using Spark APIs in Scala for real-time analysis and fast querying.
- Worked with various data sources such as Teradata and Oracle; successfully loaded files from Teradata to HDFS and from HDFS into Hive and Impala.
- Converted Hive/SQL queries into Spark transformations using Spark RDDs and Python.
- Imported real-time data into Hadoop using Kafka and implemented Oozie jobs for daily runs.
- Created a pipeline for processing structured and unstructured streaming data using Spark Streaming and stored the filtered data in S3 as Parquet files (see the Spark sketch after this list).
- Used the Scala collections framework to store and process complex device metadata and related information.
- Wrote Apache Spark Streaming applications on the Big Data distribution in the active cluster environment.
- Developed parser and loader MapReduce applications to retrieve data from HDFS and store it in HBase and Hive.
- Designed and implemented a microservices container CI/CD solution within AWS, leveraging Jenkins, GitLab, Docker, Ansible, and Kubernetes.
- Implemented the build framework for new projects using Jenkins and Maven as build tools.
- Coordinated with various teams to identify the data-stitching process.
- Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs.
- Managed application development using Agile life-cycle methodologies.
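A minimal sketch of the HDFS-to-S3 Parquet flow described in the pipeline bullet above, in Java; the paths, column names, and bucket are hypothetical, and it is written against a Spark 2.x-style SparkSession rather than the Spark 1.6 SQLContext API listed in the environment.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SetTopBoxUsageJob {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("set-top-box-usage")
                .getOrCreate();

        // Raw events landed on HDFS (path is a placeholder).
        Dataset<Row> events = spark.read().json("hdfs:///data/raw/stb_events");

        // Filter out incomplete records and aggregate usage per customer with Spark SQL.
        events.createOrReplaceTempView("stb_events");
        Dataset<Row> usage = spark.sql(
                "SELECT customer_id, COUNT(*) AS event_count " +
                "FROM stb_events WHERE customer_id IS NOT NULL " +
                "GROUP BY customer_id");

        // Store the filtered, aggregated output in S3 as Snappy-compressed Parquet.
        usage.write()
             .mode("overwrite")
             .option("compression", "snappy")
             .parquet("s3a://example-bucket/curated/stb_usage");

        spark.stop();
    }
}
```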
Environment: Hadoop 2.7, MapReduce, Cloudera 5.4, Hive 1.2, Spark 1.6, Spark SQL, Teradata, Flume, Kafka, Sqoop 1.4, Oozie 3.0.3, Python, Java (JDK 1.6), MongoDB, Tableau, AWS, Eclipse
Confidential, DALLAS, TX
Hadoop Developer
Responsibilities:
- Extensively involved in the design phase and delivered design documents. Experienced with the Hadoop ecosystem, including HDFS, Hive, Pig, Sqoop, and Spark with Scala.
- Hands-on experience extracting data from different databases and copying it into HDFS and Hive using Sqoop and Flume, with expertise in using compression techniques to optimize data storage.
- Used different SerDes to convert JSON data into pipe-separated data.
- Wrote MapReduce jobs that used access tokens to retrieve customer data, and developed simple to complex MapReduce jobs using Hive and Pig to analyze the data.
- Wrote Hive jobs to parse logs and structure them in a tabular format to facilitate effective querying of the log data.
- Hands-on experience creating HBase tables to load large sets of semi-structured data coming from various sources.
- Used Pig Latin scripts, UDFs, and UDAFs while analyzing unstructured and semi-structured data.
- Created Hive UDFs for business requirements to make functionality reusable (a minimal UDF sketch follows this list).
- Migrated 20+ TB of data from different databases (e.g., Oracle, PostgreSQL) to Hadoop.
- Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
- Utilized the in-memory processing capability of Apache Spark to process data with Spark SQL and Spark Streaming via PySpark scripts.
- Created PySpark scripts to load data from source files into RDDs, create DataFrames from the RDDs, perform transformations and aggregations, and collect the output of the process.
- Involved in data acquisition, pre-processing, and exploration for a telecommunications project in Scala.
- Imported data from different sources such as HDFS and HBase into Spark RDDs.
- Migrated MapReduce programs into Spark transformations using Spark and Scala.
- Used the Spark-Cassandra connector to load data to and from Cassandra.
- Developed scripts and UDFs using both DataFrames/SQL and RDDs/MapReduce in Spark 1.3 for data aggregation and queries, writing data back into the OLTP system directly or through Sqoop.
- Worked with the Oozie workflow engine for job scheduling; involved in unit testing and delivered unit test plans and results documents.
- Hands-on experience with Tableau for data visualization and analysis of large data sets, drawing various conclusions.
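A minimal sketch of a reusable Hive UDF of the kind referenced above, in Java; the class name and normalization rule are illustrative only.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class NormalizePhoneUDF extends UDF {
    // Hive calls evaluate() once per row; null in, null out.
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        // Strip everything except digits, e.g. "(212) 555-0100" -> "2125550100".
        String digitsOnly = input.toString().replaceAll("[^0-9]", "");
        return new Text(digitsOnly);
    }
}
```

Packaged into a JAR, such a UDF would typically be registered with ADD JAR and CREATE TEMPORARY FUNCTION, then called like a built-in function in Hive queries.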
Environment: Hadoop 2.5, MapReduce, Cloudera 4.6, HDFS, HBase, Hive 0.12, Pig, Spark 1.2, Storm, Flume, Kafka, Sqoop, Oozie, Oracle, Scala, Java (JDK 1.6), Tableau.
Confidential, CHICAGO, ILLINOIS
Java/Hadoop Developer
Responsibilities:
- Worked with the product manager and team leader to gather requirements.
- Worked in a distributed/cloud computing environment (MapReduce/Hadoop, HBase, Hive, Pig, Spark, Sqoop, Oozie, etc.).
- Loaded customer logs and data into HDFS using Flume and Sqoop.
- Loaded RDBMS data from Oracle, PostgreSQL, and DB2 into HDFS using Sqoop.
- Created custom RecordReaders, partitioning techniques, and custom sorting and shuffling techniques in complex MapReduce jobs as the use cases required (a minimal custom-partitioner sketch follows this list).
- Implemented compression techniques while loading data from RDBMS into HDFS and Hive to optimize data storage.
- Used partitioning, dynamic partitioning, and bucketing techniques while creating Hive tables for easy analysis of dynamic data coming from different sources.
- Worked on reading different formats and compressed files from HDFS and Hive.
- Developed Hive queries to process the data and generate data cubes for visualization.
- Designed dynamic client-side JavaScript code to build web forms and simulate processes for the web application, including page navigation and form validation.
- Implemented the associated business modules using Spring MVC and Hibernate data mapping.
- Developed RESTful services to allow communication between applications using JAX-RS and the Jersey framework, and developed OAuth workflows for device user management.
- Used core Java concepts such as multithreading, exception handling, and the Collections API to implement various features.
- Created continuous integration jobs for build, test, and deployment using Jenkins and Maven.
- Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs.
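A minimal sketch of a custom MapReduce partitioner of the kind mentioned above, in Java; the source-prefix routing rule is a hypothetical example rather than the original job's logic.

```java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

public class SourceAwarePartitioner extends Partitioner<Text, IntWritable> {
    @Override
    public int getPartition(Text key, IntWritable value, int numPartitions) {
        if (numPartitions == 1) {
            return 0;
        }
        // Route records whose key carries the "ORA_" source prefix to reducer 0,
        // and spread the remaining keys across the other reducers by hash.
        if (key.toString().startsWith("ORA_")) {
            return 0;
        }
        return (key.hashCode() & Integer.MAX_VALUE) % (numPartitions - 1) + 1;
    }
}
```

Such a class would be wired into a job with job.setPartitionerClass(SourceAwarePartitioner.class).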
Environment: Hadoop, MapReduce, HDFS, HBase, Hive, Pig, Flume, Sqoop, Oozie, PostgreSQL, Java (JDK 1.6), JavaScript, Maven, Spring, Hibernate, RESTful services.
Confidential
Java Developer
Responsibilities:
- Prepared high-level and low-level design documents applying relevant design patterns, with UML diagrams to depict components and class-level details.
- Interacted with system analysts and business users for design and requirement clarification.
- Developed the UI (user interface) using HTML5, CSS3, JSP, jQuery, AJAX, and JavaScript.
- Developed tabbed pages using AJAX with jQuery and JSON for quick viewing of related content, providing both functionality and ease of access to the user.
- Designed dynamic client-side JavaScript code to build web forms and simulate processes for the web application, including page navigation and form validation.
- Developed APIs using Spring, Spring MVC, Hibernate, and web services technologies.
- Used core Java concepts such as multithreading, exception handling, and the Collections API to implement various features and enhancements.
- Implemented the associated business modules using Hibernate data mapping.
- Used Collections Framework features such as Map and List to retrieve data from web services, manipulate the data to incorporate business logic, and save the data to an Oracle database.
- Created continuous integration jobs for build and testing using Maven.
- Implemented test cases for the application using JUnit libraries (a minimal test sketch follows this list).
- Used Log4j to log debug, error, and informational messages at various levels.
- Designed, developed, and maintained the data layer using Hibernate as the ORM framework.
- Involved in the analysis, design, development, and production of the application, and developed UML diagrams.
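A minimal sketch of a JUnit 4 test case with Log4j logging, as referenced above; the class under test and the values asserted are stand-ins rather than real application logic.

```java
import static org.junit.Assert.assertEquals;

import org.apache.log4j.Logger;
import org.junit.Test;

public class OrderTotalCalculatorTest {
    private static final Logger LOG = Logger.getLogger(OrderTotalCalculatorTest.class);

    @Test
    public void addsLineItemAmounts() {
        LOG.debug("Starting addsLineItemAmounts test");
        double total = 10.0 + 2.5;   // stand-in for a call into the class under test
        assertEquals(12.5, total, 0.0001);
        LOG.info("Computed total: " + total);
    }
}
```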
Environment: Java, JDBC, JSP, JBoss, Servlets, Maven, HTML, AngularJS, MongoDB, Hibernate, JavaScript, Eclipse, Struts, SQL Server 2000.