Sr. Java Developer/Hadoop Developer Resume
Irvine, CA
PROFESSIONAL SUMMARY:
- Over 9 years of experience in Big Data technologies; working knowledge of Hadoop, including developing, implementing, configuring and testing Hadoop ecosystem components and the surrounding stack, along with big data analytics; expertise in application design and development in various domains with an emphasis on data warehousing tools, using industry-accepted methodologies and procedures
- 3+ years of experience in Hadoop, MapReduce, YARN, Hive, Pig, HBase, Sqoop, Flume, Impala, Drill and Oozie, and four years of Java application development
- Experienced in installing, configuring and administering Hadoop clusters on the Cloudera distribution
- Good experience working with the CDH3 and CDH4 Cloudera distributions
- Well versed and experienced in data architecture, including data ingestion pipeline design and Hadoop information architecture
- Sound knowledge of Scala programming and its ecosystem
- Involved in implementing Spark SQL
- Extensive work experience in the areas of Banking, Finance, Insurance and Marketing Industries
- Experienced Hadoop developer with a strong background in distributed file systems in the big data arena; understands the complex processing needs of big data and has experience developing code and modules to address those needs
- RDBMS experience includes programming with PL/SQL and SQL
- Experience working with the Hive data warehouse system, developing data pipelines, implementing complex business logic and optimizing Hive queries
- Expertise in using the Talend tool for ETL purposes (data migration, cleansing)
- Good knowledge of Hadoop architecture and its components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, ResourceManager, NodeManager and ApplicationMaster, as well as MapReduce concepts
- Experience in installing, configuring and administering Hadoop clusters for distributions such as Cloudera and Hortonworks
- Good Experience on NOSQL databases like MongoDB and HBase
- Experience in using Pig, Hive, Sqoop, Oozie, Ambari and Cloudera Manager
- Knowledgeable in Spark and Scala, including framework exploration for the transition from Hadoop/MapReduce to Spark
- Experience in using Apache Flume for collecting, aggregating and moving large amounts of data from application servers
- Experience in using Flume to load log files into HDFS
- Hands-on experience with message brokers such as Apache Kafka, IBM WebSphere
- Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa
- Extensive experience in automation testing using Quick Test Professional (QTP 11.0, QTP 10.0), covering development, execution and maintenance of QTP scripts, actions, test entities, checkpoints, synchronization points, test parameterization using data tables, and descriptive programming
- Developed Java APIs for retrieval and analysis of data in NoSQL databases such as HBase and Cassandra (see the sketch at the end of this summary)
- Analyzed data on Cassandra cluster by running queries for searching, sorting and grouping
- Hands on experience in RDBMS, and Linux shell scripting
- Extended Hive and Pig core functionality by writing custom UDFs
- Experience in analyzing data using HiveQL, Pig Latin and Map Reduce
- Developed Map Reduce jobs to automate transfer of data from HBase
- Knowledge in job work-flow scheduling and monitoring tools like Oozie and Zookeeper
- Knowledge of data warehousing and ETL tools like Informatica and Pentaho
- Knowledge of coding in Kafka and Spark
- Expertise in RDBMSs such as MS SQL Server and DB2, and in NoSQL databases such as MongoDB and HBase
- Knowledge of job workflow scheduling and monitoring, and of administrative tasks such as installing Hadoop, commissioning and decommissioning nodes, and managing ecosystem components such as Flume, Oozie, Hive, Pig, Sqoop, Impala, Drill and HBase
- Knowledge in Software Development Life Cycle (Requirements Analysis, Design, Development, Testing, Deployment and Support)
- Experience in writing complex SQL queries involving multiple tables with inner and outer joins
- Experience in optimizing queries by creating various clustered and non-clustered indexes and indexed views, and in applying data modelling concepts; experience with Oracle 9i PL/SQL programming and SQL*Plus
- Experience in Object Oriented Analysis, Design (OOAD) and development of software using UML Methodology
- Techno-functional responsibilities include interfacing with users, identifying functional and technical gaps, estimates, designing custom solutions, development, leading developers, producing documentation, and production support
- An easygoing, hardworking, reliable team member and a good communicator who can translate complex information into easy-to-understand terms, with problem-solving and leadership skills
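The following is a minimal, illustrative sketch of the kind of Java API work mentioned above for reading a row from HBase; the table name, row key, column family and qualifier are hypothetical placeholders.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class CustomerLookup {
    public static void main(String[] args) throws Exception {
        // Picks up hbase-site.xml from the classpath for ZooKeeper quorum details
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("customers"))) { // hypothetical table
            // Fetch a single row by key and read one column from the "profile" family
            Result result = table.get(new Get(Bytes.toBytes("cust-1001")));
            byte[] name = result.getValue(Bytes.toBytes("profile"), Bytes.toBytes("name"));
            System.out.println("Customer name: " + Bytes.toString(name));
        }
    }
}
```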
TECHNICAL SKILLS:
Hadoop EcoSystem: MapReduce, Hive, PIG, Sqoop, Flume, Oozie, Impala, Drill, Spark, Storm, Kafka, ZooKeeper
Hadoop Architectures: MR1 and MR2 (YARN)
Distributions: Cloudera, Hortonworks, MapR
DBMS: MySQL, Oracle, HBase, Cassandra, MongoDB
Languages: Java, HQL, PIG LATIN, SQL, Python, Shell Scripting, Scala
OS and Servers: Linux, Windows and Apache Tomcat
Web Technologies: HTML, JavaScript, XML, jQuery, JDBC, JSP
IDEs and Tools: Eclipse, NetBeans, WinSCP, SQL Developer, Sqoop, PuTTY
Cloud Platforms: AWS, Azure
Virtualization: VMware, Virtual Box
PROFESSIONAL EXPERIENCE:
Confidential, Irvine, CA
Sr. Java Developer/Hadoop Developer
Responsibilities:
- Involved in transferring data from RDBMS to HDFS using Sqoop.
- Wrote MapReduce jobs according to the analytical requirements.
- Developed Java programs to clean and preprocess the large datasets.
- Responsible for creating Pig scripts and performing analysis over the large datasets.
- Implemented Spark using Scala and Spark SQL for faster testing and processing of data; used Spark Streaming to consume topics from the distributed messaging source Kafka and periodically push batches of data to Spark for processing.
- Involved in parsing JSON data into a structured format and loading it into HDFS/Hive using Spark Streaming (see the sketch after this section).
- Worked with different kinds of files, such as text and XML data.
- Worked with the Spark ecosystem, using Spark SQL and Scala queries on different data formats such as text and CSV files.
- Deeply involved in writing complex Spark Scala scripts: wrote UDFs, used the Spark context and Cassandra SQL context, and applied multiple APIs and methods supporting DataFrames, RDDs, DataFrame joins and Cassandra table joins, finally writing/saving the DataFrames/RDDs to the Cassandra database.
- Actively involved in tuning SQL queries for better performance.
- Used Oracle SQL Developer for writing queries and procedures in SQL.
- Used JDBC with the Oracle thin (Type 4) driver to access the database for application optimization and efficiency.
- Used Log4J for extensible logging, debugging and error tracing.
- Used Struts tag libraries as well as the Struts Tiles framework.
- Used the Data Access Object pattern to make the application more adaptable to future and legacy databases.
- Involved in developing Pig scripts.
- Interacted with the BI department and reported the fetched results to them.
- Involved in coding JSP pages for data presentation in the view layer of the MVC architecture.
- Involved in requirements gathering, analysis and development of the Insurance Portal application.
- Used J2EE design patterns like Factory Methods, MVC, and Singleton Pattern that made modules and code more organized, flexible and readable for future upgrades.
- Worked with JavaScript to perform client side form validations.
- Wrote generic functions to call Oracle stored procedures, triggers, functions.
- Used JUnit for testing the application on the test servers.
- Providing support for System Integration Testing & User Acceptance Testing.
- Involved in resolving the issues routed through trouble tickets from production floor.
- Participated in Technical / Functional Reviews.
- Involved in Performance Tuning of the application.
- Discussed new developments and errors with the client and the project manager.
- Involved in Production Support and Maintenance.
Environment: JDK, UML, JSP, JDBC, XHTML, JavaScript, MVC, XML, XML Schema, Tomcat, Eclipse, CDH, Hadoop, HDFS, Pig, AWS, RDBMS and MapReduce
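A minimal sketch of the JSON parsing and Hive loading described in this role, using the Spark 2.x Java API; the HDFS path, filter column and Hive table name are hypothetical.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class JsonToHiveLoader {
    public static void main(String[] args) {
        // Hive support lets saveAsTable() write managed tables into the warehouse
        SparkSession spark = SparkSession.builder()
                .appName("JsonToHiveLoader")
                .enableHiveSupport()
                .getOrCreate();

        // Read semi-structured JSON from HDFS; Spark infers the schema
        Dataset<Row> events = spark.read().json("hdfs:///data/raw/events/*.json"); // hypothetical path

        // Keep only well-formed records and persist them as a Hive table
        events.filter("eventId IS NOT NULL")           // hypothetical column
              .write()
              .mode("overwrite")
              .saveAsTable("analytics.events_parsed"); // hypothetical table

        spark.stop();
    }
}
```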
Confidential, Menlo Park, CA
Hadoop Developer
Responsibilities:
- Designed and Implemented the ETL Process using Informatica power center.
- Developed ETL flows from source to stage, stage to work tables, and stage to target tables.
- Facilitated knowledge transfer sessions.
- Involved in loading data from UNIX file system to HDFS.
- Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
- Importing and exporting data into HDFS and Hive using Sqoop.
- Responsible for managing data coming from different sources.
- Loaded and transformed large sets of structured, semi-structured and unstructured data.
- Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
- Involved in review of functional and non-functional requirements.
- Experienced in defining job flows.
- Wrote a recommendation engine using Mahout.
- Designed and presented a plan for a POC on Apache Storm.
- Extracted data from CouchDB through Sqoop, placed it in HDFS and processed it.
- Worked with AWS cloud services, including Redshift and EC2.
- Supported MapReduce programs running on the cluster.
- Used Apache Kafka for the durable messaging layer and Apache Storm for real-time transaction scoring (see the producer sketch after this section).
- Gained very good business knowledge on health insurance, claim processing, fraud suspect identification, appeals process etc.
- Developed a custom file system plugin for Hadoop so it can access files on the Data Platform; this plugin allows Hadoop MapReduce programs, HBase, Pig and Hive to work unmodified and access files directly.
- Tuned mappings and mapplets for best performance on the ETL side, and created indexes and analyzed tables periodically on the database side.
- Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
- Extracted feeds from social media sites such as Facebook and Twitter using Python scripts.
- Setup and benchmarked Hadoop/HBase clusters for internal use.
- Set up a Hadoop cluster on Amazon EC2 using Whirr for a POC.
Environment: Informatica PowerCenter 9.1, Oracle 11g/10g, flat files, TOAD 9.6, SQL, PL/SQL, SQL Workbench, SQL*Plus, PuTTY, Java 6 (JDK 1.6), Eclipse, Subversion, Hadoop (Hortonworks and Cloudera distributions), HDFS, MapReduce, Hive, Storm, Kafka, HBase, Spark, DataStax, AWS, IBM DataStage 8.1, Linux, Windows NT, UNIX shell scripting.
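A minimal sketch of a Kafka producer of the kind used for the durable messaging layer mentioned above; the broker addresses, topic and payload are hypothetical.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TransactionEventProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092,broker2:9092"); // hypothetical brokers
        props.put("acks", "all"); // wait for all in-sync replicas, for durability
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Keying by account keeps related transactions on the same partition,
            // so a downstream Storm scoring topology sees them in order
            producer.send(new ProducerRecord<>("transactions", "account-42",
                    "{\"amount\": 125.50, \"type\": \"debit\"}"));
        }
    }
}
```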
Confidential, Bloomington, IL
Hadoop Developer
Responsibilities:
- Worked on Hive queries to categorize data of different claims.
- Integrated the Hive warehouse with HBase.
- Developed MapReduce programs to transform the raw data on which core processing takes place.
- Developed multiple MapReduce jobs in java for data cleaning and preprocessing for this purpose.
- Involved in loading data from LINUX file system to HDFS.
- Wrote custom Hive UDFs in Java where the required functionality was too complex for built-in functions (see the sketch after this section).
- Used Hive to manipulate data in Cloudera big data platform.
- Implemented Partitioning, Dynamic Partitions, Buckets in Hive.
- Designed and created Hive external tables using dynamic partitioning and buckets.
- Responsible for managing the test data coming from different sources; reviewed peer table creation in Hive, data loading and queries.
- Monitored system performance and logs and responded to any warning or failure conditions.
- Designed the technical solution for real-time analytics using Kafka, Storm and HBase.
- Working knowledge of real-time and batch processing.
- Gained experience in managing and reviewing Hadoop log files.
- Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs; performed unit, interface, system and user acceptance testing of the workflow tool.
Environment: Hadoop, Cloudera, HDFS, Hive, MapReduce, Core Java (JDK 1.6), Pig, XML, JSON, Tableau, Oracle 11g/10g, PL/SQL, SQL*Plus, Toad 9.6, Windows NT, UNIX shell scripting.
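A minimal sketch of a custom Hive UDF in Java of the kind described above; the class name and the normalization logic are hypothetical.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF that normalizes claim identifiers (trim + upper-case)
public final class NormalizeClaimId extends UDF {
    public Text evaluate(Text claimId) {
        if (claimId == null) {
            return null; // pass NULLs through unchanged
        }
        return new Text(claimId.toString().trim().toUpperCase());
    }
}
```

Such a UDF would typically be packaged in a JAR, added to the Hive session with ADD JAR, and registered with CREATE TEMPORARY FUNCTION before being used in queries.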
Confidential, Westbrook, ME
ETL Developer
Responsibilities:
- Responsible for the definition, development and testing of processes/programs necessary to extract data from operational databases, transform and cleanse the data, and load it into the data warehouse using Informatica PowerCenter.
- Created users, user groups and their access profiles in the Repository Manager.
- Extracted data from different sources such as SQL Server 2005 and flat files, and loaded it into the Oracle data warehouse.
- Created complex mappings in Power Center Designer using Expression, Filter, Sequence generator, Update Strategy, Joiner and Stored procedure transformations.
- Created connected and unconnected Lookup transformations to look up the data from the source and target tables.
- Wrote SQL, PL/SQL, stored procedures for implementing business rules and transformations.
- Used data miner to process raw data from flat files.
- Used the update strategy to effectively migrate data from source to target.
- Responsible for writing SQL queries, stored procedures, views, triggers.
- Developed and documented Mappings/Transformations, Audit procedures and Informatica sessions.
- Used Informatica Power Center Workflow to create sessions and run the logic embedded in the mappings.
- Created test cases and completed unit, integration and system tests for Data warehouse.
Environment: Informatica PowerCenter 8.1.1, Oracle, flat files, SQL Server 2005, SQL, PL/SQL, Windows 2003.
Confidential
Java/J2EE developer
Responsibilities:
- Responsible for gathering and analyzing requirements and converting them into technical specifications.
- Used Rational Rose for creating sequence and class diagrams.
- Developed presentation layer using JSP, Java, HTML and JavaScript.
- Used Spring core annotations for dependency injection (see the sketch after this section).
- Designed and developed 'Convention Based Coding', utilizing Hibernate's persistence framework and O-R mapping capability to enable dynamic fetching and display of various table data with JSF tag libraries.
- Designed and developed the Hibernate configuration and the session-per-request design pattern for database connectivity and for accessing sessions for database transactions.
- Used HQL and SQL for fetching and storing data in databases.
- Participated in the design and development of database schema and Entity-Relationship diagrams of the backend Oracle database tables for the application.
- Implemented web services with Apache Axis.
- Designed and developed stored procedures and triggers in Oracle to cater to the needs of the entire application; developed complex SQL queries for extracting data from the database.
Environment: Java, JDK 1.5, Ajax, Oracle 10g, Eclipse, Apache Axis, Apache Ant, WebLogic Server, JavaScript, HTML, CSS, XML
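A minimal sketch of Spring annotation-based dependency injection as mentioned above; the service and DAO classes are hypothetical.

```java
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Repository;
import org.springframework.stereotype.Service;

// Hypothetical DAO; in the real application this would wrap Hibernate session calls
@Repository
class PolicyDao {
    long countActivePolicies() {
        return 0L;
    }
}

@Service
public class PolicyService {
    private final PolicyDao policyDao;

    // Constructor injection: the Spring container supplies the PolicyDao bean
    @Autowired
    public PolicyService(PolicyDao policyDao) {
        this.policyDao = policyDao;
    }

    public long activePolicyCount() {
        return policyDao.countActivePolicies();
    }
}
```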
Confidential
Java Developer
Responsibilities:
- Participated in various phases of the Software Development Life Cycle (SDLC)
- Developed user interfaces using the JSP framework with AJAX, JavaScript, HTML, XHTML and CSS
- Performed the design and development of various modules using CBD Navigator Framework
- Deployed J2EE applications on WebSphere Application Server by building and deploying EAR files using Ant scripts.
- Created tables and stored procedures in SQL for data manipulation and retrieval (see the sketch after this section).
- Used technologies like JSP, JavaScript and Tiles for Presentation tier.
- Used CVS for version control of code and project documents.
Environment: JSP, JDK, JDBC, XML, JavaScript, HTML, Spring MVC, JSF, Oracle 8i, Sun Application Server, UML, JUnit, JTest, NetBeans, Windows 2000.
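A minimal sketch of calling an Oracle stored procedure from Java over JDBC, as referenced above; the connection details and procedure name are hypothetical.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Types;

public class OrderCountLookup {
    public static void main(String[] args) throws Exception {
        // Explicit driver registration (needed for pre-JDBC-4 Oracle drivers)
        Class.forName("oracle.jdbc.OracleDriver");

        // Hypothetical connection details and stored procedure
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:oracle:thin:@dbhost:1521:orcl", "app_user", "secret");
             CallableStatement call = conn.prepareCall("{call get_order_count(?, ?)}")) {
            call.setString(1, "OPEN");                   // IN parameter: order status
            call.registerOutParameter(2, Types.INTEGER); // OUT parameter: count
            call.execute();
            System.out.println("Open orders: " + call.getInt(2));
        }
    }
}
```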