Hadoop Developer Resume
Tampa, FL
PROFESSIONAL SUMMARY:
- 8 years of experience in designing and developing Java applications, including system analysis, technical design, implementation, performance tuning and testing.
- Around 5 years of experience with the Cloudera Hadoop ecosystem, including HDFS, MapReduce (MRv1, with an understanding of YARN) and Hadoop tools (Pig, Hive, HBase, Spark, Sqoop, Zookeeper, Oozie, Scala, HUE, Impala).
- Excellent hands-on experience developing Hadoop architectures on Windows and Linux platforms.
- Good knowledge of Spark components such as Spark SQL, MLlib, Spark Streaming and GraphX; involved in implementing Spark SQL.
- Analyzed clients' Big Data requirements and translated them into the Hadoop ecosystem, accounting for performance bottlenecks and tuning on the existing Hadoop infrastructure.
- Proficient in migrating data from existing RDBMS to the Hadoop file system using Sqoop.
- Expertise in data load management, importing and exporting data using Sqoop, Apache Kafka, Spark Streaming and Flume.
- Experience handling a variety of customer data sets, both structured and unstructured, using HDFS and HBase.
- Developed MapReduce programs to perform data transformation and analysis using Java, Hive and Pig.
- Excellent scripting skills in Pig and Hive.
- Developed predictive analytics using Apache Spark's Scala APIs.
- Good experience with table partitioning in Hive and parameter passing in Pig.
- Built libraries, user-defined functions and frameworks around the Hadoop ecosystem.
- Experience writing Java MapReduce jobs and HiveQL for data architects and data scientists.
- Good experience loading data from Oracle and MySQL databases into HDFS using Sqoop (structured data) and Flume (log files and XML).
- Hands-on experience with sequence files, combiners, counters, dynamic partitions and bucketing for best practice and performance improvement.
- Worked with Talend to run ETL jobs.
- Defined job flows in the Hadoop environment using tools like Oozie, working with the Capacity and Fair schedulers.
- Experience in Cluster coordination services through Zookeeper.
- Hands on experience in loading unstructured data into HDFS using Flume/Kafka.
- Understanding of loading streaming data directly into HDFS using Flume.
- Prepared proofs of concept for Hadoop and microservices.
- Working knowledge of databases such as Oracle 11g.
- Strong experience in database design, writing complex MySQL queries and stored procedures.
- Excellent understanding and knowledge of NoSQL databases such as MongoDB, HBase and Cassandra.
- Strong problem-solving, communication and interpersonal skills; a good team player.
- Ability to work in a fast-changing environment and learn new technologies effortlessly.
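Several of the points above (MapReduce development, combiners) come down to the map/combine/reduce pattern. A minimal, cluster-free Python sketch of that pattern, using made-up input lines purely for illustration:

```python
from collections import defaultdict

def mapper(line):
    # Emit (word, 1) pairs, as a Hadoop mapper would.
    for word in line.lower().split():
        yield word, 1

def combiner(pairs):
    # Local pre-aggregation on the mapper side; this is the role a
    # Hadoop combiner plays in cutting shuffle volume.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return counts.items()

def reducer(all_pairs):
    # Final aggregation across all mapper outputs after the shuffle.
    totals = defaultdict(int)
    for word, n in all_pairs:
        totals[word] += n
    return dict(totals)

# Hypothetical input split into two "records".
lines = ["big data big cluster", "data pipeline"]
shuffled = [pair for line in lines for pair in combiner(mapper(line))]
result = reducer(shuffled)
```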
TECHNICAL SKILLS:
Hadoop Ecosystem: Hive, Sqoop, Spark, Oozie, Flume, Pig, MapReduce, Zookeeper, HUE, Impala, Kafka, Scala, Apache Solr, NoSQL (HBase, MongoDB, Cassandra)
Big Data Platform: Cloudera Hadoop CDH 4/5, MapReduce (MRv1, MRv2/YARN), Hortonworks Sandbox
Databases: MySQL, Oracle 11g
Programming skills: C, C++, Python, Java, Pig Latin
Java IDE: Eclipse
Academic experience: OOCP, Data structures, Algorithm development
Parallel Programming: MPI
Operating system: Windows, Linux (Ubuntu)
PROFESSIONAL EXPERIENCE:
Confidential, Tampa, FL
Hadoop Developer
Responsibilities:
- Coordinated with business customers to gather requirements and migrated existing data from an RDBMS (MySQL) to Hadoop using Sqoop for processing.
- Used the Cloudera distribution of Hadoop; migration from Cloudera to the Hortonworks Sandbox is currently under development.
- Ingested log data into HDFS using Flume.
- Analyzed clickstream data by querying with the Hadoop components Hive and Pig.
- Implemented Performance tuning in Hive queries for transformations like joins.
- Involved in creating internal and external Hive Tables, loading data, generating partitions and buckets and User Defined Functions for optimizing the categorical distribution over ingested data.
- Designed and implemented MapReduce jobs to support distributed processing of large data sets on the Hadoop cluster.
- Explored Spark to improve the performance and optimization of existing algorithms in Hadoop, using Spark Context, Spark SQL, DataFrames, pair RDDs and Spark on YARN.
- Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
- Imported data from sources such as HDFS and HBase into Spark RDDs.
- Implemented Spark Core in Scala to process data in memory.
- Worked on Talend to run ETL jobs on the data in HDFS.
- Developed MapReduce programs to cleanse and parse data in HDFS obtained from various data sources and to perform joins on the map side.
- Exported the business required information to RDBMS from HDFS using Sqoop to make the data available for BI team to generate reports.
- Implemented daily workflow for extraction, processing and analysis of data with Oozie.
Environment: Java 1.6, Hadoop 2.0.0, MapReduce, HDFS, Sqoop 1.4.3, Hive 0.10.0, Pig 0.11.0, Linux, XML, Eclipse Juno, Cloudera CDH3/4 distribution, Talend, Oracle 11g, MySQL, HBase 0.94.6
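The map-side joins mentioned above work by shipping a table small enough to fit in memory to every mapper, so no reduce-side shuffle of the large table is needed. A rough Python sketch of that idea, with hypothetical customer/order records standing in for the real data sets:

```python
# Hypothetical small lookup table, broadcast to every mapper in a
# real Hadoop map-side (replicated) join.
customers = {101: "acme", 102: "globex"}

def map_side_join(order):
    # Each large-table record is joined in the map phase via the
    # in-memory lookup; unmatched keys fall back to a sentinel.
    order_id, customer_id, amount = order
    return order_id, customers.get(customer_id, "unknown"), amount

# Hypothetical "large" table of orders: (order_id, customer_id, amount).
orders = [(1, 101, 250.0), (2, 102, 99.5), (3, 999, 10.0)]
joined = [map_side_join(o) for o in orders]
```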
Confidential, Foster City, CA
Hadoop Developer
Responsibilities:
- Installed, configured and maintained Apache Hadoop clusters for application development, along with Hadoop tools like Oozie, Hive, Pig, HBase, Zookeeper and Sqoop.
- Involved in analyzing system failures, identifying root causes and recommending courses of action.
- Wrote shell scripts to monitor the health of Hadoop daemon services and respond to any warning or failure conditions.
- Managed and scheduled jobs on the Hadoop cluster.
- Designed a data warehouse using Hive.
- Deployed Hadoop Cluster in all modes.
- Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
- Developed the Pig UDF'S to pre-process the data for analysis.
- Developed Hive queries for the analysts.
- Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
- Provided cluster coordination services through Zookeeper.
- Collected log data from web servers and integrated it into HDFS using Flume.
- Implemented the Fair Scheduler on the JobTracker to share cluster resources among users' MapReduce jobs.
- Knowledgeable of Spark and Scala, mainly in framework exploration for the transition from Hadoop/MapReduce to Spark.
- Designed and developed automated deployment and scaling processes based on Chef for a wide range of server types and application tiers, including Elasticsearch.
- Managed and reviewed Hadoop log files.
- Exported the analyzed data to relational databases using Sqoop for visualization and report generation.
- Worked with highly engaged Informatics, Scientific Information Management and enterprise IT teams.
Environment: Hadoop, HBase, HDFS, Hive, Java (JDK 1.6), Cloudera, Pig, Zookeeper, Oozie, Flume.
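The Pig scripts and UDFs above extract fields from web server output files before loading them into HDFS. A hedged Python illustration of the per-line parsing such preprocessing performs, assuming a common Apache-style access-log layout (the actual log format in the project may have differed):

```python
import re

# Assumed combined-log-style line; real web server logs may vary.
LOG_RE = re.compile(r'(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3})')

def parse_log_line(line):
    # Extract structured fields; malformed lines are dropped, the way
    # a Pig FILTER would discard records a UDF fails to parse.
    m = LOG_RE.match(line)
    if m is None:
        return None
    ip, ts, method, path, status = m.groups()
    return {"ip": ip, "timestamp": ts, "method": method,
            "path": path, "status": int(status)}

rec = parse_log_line(
    '10.0.0.1 - - [01/Jan/2015:00:00:01 +0000] "GET /index.html HTTP/1.1" 200')
```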
Confidential, New York, NY
Hadoop Developer
Responsibilities:
- Worked in a Hadoop environment with MapReduce, Sqoop, Oozie, Flume, HBase, Pig, Hive and Impala on a multi-node cloud environment.
- Used Flume to load unstructured and semi-structured data from various sources, such as websites and streaming feeds, into the cluster.
- Configured the Hadoop environment in the cloud through Amazon Web Services (AWS) to provide a scalable distributed data solution.
- Automated and provided control flow to Pig scripts using shell scripts.
- Created partitioned tables in Hive for best performance and faster querying.
- Utilized Sqoop scripts to import data from various database sources into HBase, performing incremental loads of customer transaction data by date.
- Utilized Flume to move log files generated from various sources into HDFS for processing.
- Created workflows in Oozie to automate the tasks of loading data into HDFS and preprocessing with Pig; utilized Oozie for data scrubbing and processing.
- Performed extensive data analysis using Hive and Pig.
- Implemented UDFs to provide custom Pig and Hive capabilities.
- Worked on designing NoSQL schemas in HBase.
Environment: Hadoop framework, Hive, MapReduce, Pig, Impala, HDFS, Oozie, Flume, Shell scripting, NoSQL and AWS.
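The incremental Sqoop loads by date described above rest on a simple checkpoint idea: pull only rows newer than the last stored value, then advance the checkpoint (in Sqoop itself this is `--incremental` with `--last-value`, evaluated against the RDBMS). A small Python sketch of that logic, with hypothetical transaction records:

```python
from datetime import date

# Hypothetical source rows; in practice this filter runs as a WHERE
# clause against the RDBMS, not in Python.
transactions = [
    {"id": 1, "txn_date": date(2015, 3, 1)},
    {"id": 2, "txn_date": date(2015, 3, 5)},
    {"id": 3, "txn_date": date(2015, 3, 9)},
]

def incremental_load(records, last_value):
    # Select only rows newer than the stored checkpoint, then advance
    # the checkpoint to the newest row seen.
    new_rows = [r for r in records if r["txn_date"] > last_value]
    new_checkpoint = max((r["txn_date"] for r in new_rows), default=last_value)
    return new_rows, new_checkpoint

new_rows, checkpoint = incremental_load(transactions, date(2015, 3, 2))
```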
Confidential, Phoenix, Arizona
Hadoop Developer
Responsibilities:
- Developed and supported MapReduce programs running on the cluster.
- Created Hive tables and worked on them using HiveQL.
- Handled 2 TB of data volume and implemented the same in Production.
- Held weekly meetings with technical collaborators and actively participated in code review sessions with senior and junior developers.
- Responsible for managing heterogeneous data coming from different sources.
- Supported HBase Architecture Design with the Hadoop Architect team to develop a Database Design in HDFS.
- Involved in HDFS maintenance and loading of structured and unstructured data.
- Designed workflows by scheduling Hive processes for log file data streamed into HDFS using Flume.
- Wrote Hive queries for data analysis to meet the business requirements.
- Installed and configured Pig and wrote Pig Latin scripts.
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Upgraded the Hadoop cluster from CDH3 to CDH4, set up a high-availability cluster and integrated Hive with existing applications.
- Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
- Installed Oozie workflow engine to run multiple Hive and Pig jobs.
- Developed Hive queries to process the data and generate data cubes for visualization.
Environment: Cloudera Hadoop CDH 3/4, Hive, Pig, MapReduce, Oozie, Sqoop, Flume, Eclipse, Hue, MySQL, Java, Shell, Linux.
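Generating data cubes, as in the Hive queries above, means aggregating a measure over every subset of the grouping dimensions (Hive expresses this as `GROUP BY ... WITH CUBE`). A small Python sketch of that aggregation, with hypothetical sales rows and dimension names:

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows; real data cubes run over Hive tables.
rows = [
    {"region": "east", "product": "a", "sales": 10},
    {"region": "east", "product": "b", "sales": 5},
    {"region": "west", "product": "a", "sales": 7},
]

def cube(rows, dims, measure):
    # Aggregate the measure over every subset of the grouping
    # dimensions, mirroring GROUP BY ... WITH CUBE: the empty subset
    # is the grand total, full subset the finest grouping.
    totals = defaultdict(int)
    for r in range(len(dims) + 1):
        for subset in combinations(dims, r):
            for row in rows:
                key = tuple((d, row[d]) for d in subset)
                totals[key] += row[measure]
    return dict(totals)

result = cube(rows, ["region", "product"], "sales")
```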
Confidential
Java Developer
Responsibilities:
- Developed code for handling exceptions and converting them into Action Messages.
- Used JavaScript for validations and other checking functionality for the UI screens.
- Involved in Struts-based validation.
- Involved in rmation module.
- Designed and developed the user interface layer using JSP, Java Script, Ajax, HTML, CSS.
- Used HTML to control the display, position HTML elements and to handle events in the user interface.
- Used JavaScript objects to handle events on text boxes, forms to call business logic.
- Involved in resolving business technical issues.
- Involved in writing code to invoke web services in other applications based on the WSDL files.
- Used Hibernate ORM to interact with the Oracle database to retrieve, insert and update data.
- Wrote JUnit test cases for the functionalities.
- Developed and tuned the database SQL queries.
- Used Eclipse IDE and Tomcat 5.5 web application server in development.
- Used CVS for version control and ClearQuest for bug tracking.
Environment: Java/J2EE, Spring 3, Oracle 10g, JavaScript, CSS, AJAX, JUnit, Log4j, SOAP web services, RESTful web services, Eclipse IDE.
Confidential
JAVA Developer
Responsibilities:
- Responsible for gathering and analyzing requirements and converting them into technical specifications
- Used Rational Rose for creating sequence and class diagrams
- Developed presentation layer using Java, HTML and JavaScript
- Used Spring Core Annotations for Dependency Injection
- Performed performance tuning activities using SQL scripts and was involved in script preparation using SQL. Designed and developed the Hibernate configuration and a session-per-request design pattern for database connectivity and for accessing the session during database transactions. Used SQL for fetching and storing data in databases.
- Participated in the design and development of database schema and Entity-Relationship diagrams of the backend Oracle database tables for the application
- Implemented web services with Apache Axis
- Designed and Developed Stored Procedures, Triggers in Oracle to cater the needs for the entire application. Developed complex SQL queries for extracting data from the database
- Handled all types of issues for the PRIME and ONLINE applications, such as interest levied wrongly, transaction-related issues, installment plans, statements not generating, reward points not posting properly, EMI conversion and product changes.
Environment: Apache Axis, Rational Rose XDE, Spring 2.5, Notepad++, Eclipse, JavaScript, HTML, Oracle Database 11g, Log4j
Confidential
JAVA Developer
Responsibilities:
- This project covered hospital functions, management activities and decision-making, providing all-round support for a modern hospital.
- Worked on out Confidential t registration module and emergency registration module
- Ensured all requests and processes were controlled by the system.
- Used an interface-oriented programming style, improving the flexibility and extensibility of the system.
- Built the database and tables according to client requirements.
- Debugged and fixed the problems that were found during the different phases of the project
- Maintained the database and systems, and updated the system from time to time.
Environment: Java, JDBC, Servlets, JSP, HTML, JavaScript, Eclipse, Windows 2000, Oracle Database, Microsoft Excel.