Hadoop Developer Resume
NY
SUMMARY
- 7+ years of experience in analysis, design, development and implementation of large-scale web-based applications using Big Data, Hadoop, Spark, MapReduce, Storm, Hive, Scala, HBase, Core Java, J2EE and related technologies.
- Good exposure to Apache Hadoop MapReduce programming, Hive, Pig scripting and HDFS.
- Strong knowledge of Big Data concepts, with Hadoop solutions delivered for related use cases.
- Hands-on experience importing and exporting data using Sqoop, the Hadoop data transfer tool.
- Strong experience in writing MapReduce programs and Hive and Pig scripts for data analysis.
- Experienced in writing Spark SQL scripts to optimize query performance.
- Hands-on experience in writing custom partitioners for MapReduce (see the sketch after this summary).
- Excellent understanding and knowledge of NoSQL databases like HBase.
- Experience with installation, configuration, backup, recovery, DR and development on multiple Hadoop distribution platforms, including the Hortonworks Data Platform (HDP) and the Cloudera Distribution for Hadoop (CDH).
- Experience integrating HDP and CDH with data integration tools and BI tools.
- Experience with HDP security implementations.
- Experienced in working with Struts, Hibernate and Spring MVC frameworks.
- Extensively worked with JBoss and IBM WebSphere application servers and the Tomcat web server.
- Experience working with JAVA, J2EE, JDBC, ODBC, JSP, Java Beans, EJB, Servlets, Java Web Services and related technologies using IDEs like IntelliJ, Eclipse and NetBeans.
- Development experience in DBMSs such as Oracle, MS SQL Server and MySQL.
- Experience in writing, diagnosing and tuning performance-critical queries in MySQL and DB2, and in writing stored procedures in MySQL.
- Exposure to working on Oracle 10.x databases using SQL (DML and DDL queries).
- Experience in implementing projects using Waterfall, RUP, Agile Methodologies and exposure to SCRUM project implementation methodology.
- Hands-on design and development of projects, along with technical mentoring of teams while working on applications.
- Extensive experience working with CVS, ClearCase, SVN and Git for source control.
- Vast experience working on all phases of System Development Life Cycle (SDLC) including, but not limited to Design, Development, Testing, Implementation and Post Production Support.
- Quick learner of business processes with excellent and proven analytical, trouble shooting and problem solving skills.
- Very good at development-driven testing, with working experience alongside QA teams.
- Strong ability to handle multiple tasks and to work independently as well as in a team.
- Strong analytical and decision-making skills, with the ability to follow project standards.
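A minimal sketch of the kind of custom MapReduce partitioner mentioned above, written in Scala against the Hadoop API; the key/value types and the routing rule are hypothetical placeholders, not taken from any specific project.

```scala
import org.apache.hadoop.io.{IntWritable, Text}
import org.apache.hadoop.mapreduce.Partitioner

// Routes each record to a reducer based on the first letter of its key,
// so reducer output files come out grouped alphabetically.
class FirstLetterPartitioner extends Partitioner[Text, IntWritable] {
  override def getPartition(key: Text, value: IntWritable, numPartitions: Int): Int = {
    val first = key.toString.headOption.getOrElse('_').toLower
    if (first.isLetter) (first - 'a') % numPartitions else 0
  }
}

// Registered in the job driver with:
//   job.setPartitionerClass(classOf[FirstLetterPartitioner])
```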
TECHNICAL SKILLS
ETL Tools: Sqoop, Flume, Pig, Hive, Spark, Hadoop Streaming, HBase and Cassandra
Databases: Oracle 11g, IBM DB2 UDB 9.0 (MVS), PL/SQL, Teradata SQL Assistant V13
Reporting Tools: Tableau
Operating Systems: Linux (Red Hat) AS4, Unix, Windows 95/98/2000/NT/XP/Vista/8/10, MS-DOS
Languages/Utilities: SQL, Scala, SQL Developer, UNIX shell scripts (Korn), HTML, C/C++, Visual Basic
PROFESSIONAL EXPERIENCE
Confidential, AZ
Senior Hadoop Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Involved in Design and Development of technical specification documents using Hadoop.
- Managed and reviewed Hadoop log files.
- Migrated data from Oracle and IBM-ICM into HDFS using Sqoop, and imported flat files in various formats into HDFS.
- Experienced in loading and transforming large sets of structured and semi-structured data from HBase through Sqoop into HDFS for further processing.
- Monitored Hadoop scripts which take the input from HDFS and load the data into Hive.
- Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala (sketched after this list).
- Developed MapReduce programs to parse the raw data, populate tables and store the refined data in partitioned tables.
- Worked on improving the performance and optimization of existing algorithms in Hadoop using Spark context, Spark SQL, DataFrames, RDDs and Spark on YARN.
- Optimized Map/Reduce Jobs to use HDFS efficiently by using various compression mechanisms.
- Wrote Apache Pig and Hive scripts to implement business transformations and process HDFS data.
- Installed Oozie workflow engine to run multiple Hive and Pig jobs.
- Defined job workflows as per their dependencies in Oozie.
- Maintained system integrity of all sub-components related to Hadoop.
- Worked on Apache and HDP clusters and integrated with BI tools.
- Loaded log data directly into HDFS using Flume.
- Managed and monitored Apache Hadoop clusters using Apache Ambari.
- Developed industry-specific UDFs (user-defined functions).
- Migrated ETL processes from Oracle to Hive to evaluate ease of data manipulation.
- Implemented test scripts to support test-driven development and continuous integration.
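As a rough illustration of the Hive-to-Spark conversions mentioned in this list, here is a minimal Scala sketch; the table and column names are hypothetical, and the SparkSession entry point shown is the modern API rather than necessarily the exact one used on the project.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("HiveToSparkSketch")
  .enableHiveSupport()
  .getOrCreate()

// Original HiveQL:
//   SELECT category, COUNT(*) FROM sales GROUP BY category
// The same aggregation as Spark SQL over the Hive metastore table:
val byCategory = spark.sql(
  "SELECT category, COUNT(*) AS cnt FROM sales GROUP BY category")

// And again as RDD transformations, for jobs that need the lower-level API:
val counts = spark.table("sales").rdd
  .map(row => (row.getAs[String]("category"), 1L))
  .reduceByKey(_ + _)

byCategory.show()
```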
Environment: Apache Hadoop, HDFS, Spark, Scala, Hive, MapReduce, Java, Eclipse, Pig, Avro, Oozie, IBM-ICM.
Confidential, TX
Hadoop Developer
Responsibilities:
- Lead and managed team during Design, Development and Implementation phase of the application.
- As a Developer, worked directly with business partners discussing the requirements for new projects and enhancements to the existing applications.
- Wrote Java code to process streams for risk management analysis.
- Wrote extensive shell scripts to run the appropriate programs.
- Wrote multiple queries to pull data from HBase.
- Reporting on the project based on Agile-Scrum Method. Conducted daily Scrum meetings and updated JIRA with new details.
- Wrote Java code to pull related data from HBase (a comparable sketch, in Scala, follows this list).
- Developed a custom file system plugin for Hadoop to give it access to files on the data platform. The plugin allows Hadoop MapReduce programs, HBase, Pig and Hive to work unmodified and access files directly.
- Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
- Involved in review of functional and non-functional requirements.
- Facilitated knowledge transfer sessions.
- Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Importing and exporting data into HDFS and Hive using Sqoop.
- Experienced in defining job flows.
- Worked on HDP security implementations.
- Wrote Pig Scripts to perform ETL procedures on the data in HDFS.
- Analyzed the data by performing Hive queries and running Pig scripts and Python scripts.
- Used Hive to partition and bucket data.
- Used Tableau for Data Visualization.
- Loaded and transformed large sets of structured, semi-structured and unstructured data.
- Responsible for managing data coming from different sources.
- Gained good experience with NoSQL databases.
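A minimal sketch of pulling rows from HBase as described above, written in Scala for consistency with the other examples; the table, column family and qualifier names are hypothetical, and the HTable-based client API matches the HBase 0.94 line listed in this environment.

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.{HTable, Scan}
import org.apache.hadoop.hbase.util.Bytes
import scala.collection.JavaConverters._

val conf = HBaseConfiguration.create()
val table = new HTable(conf, "risk_events") // hypothetical table name

// Scan a single column so only the data needed for the analysis moves.
val scan = new Scan()
scan.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("exposure"))

val scanner = table.getScanner(scan)
try {
  for (result <- scanner.iterator().asScala) {
    val rowKey   = Bytes.toString(result.getRow)
    val exposure = Bytes.toString(
      result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("exposure")))
    println(s"$rowKey -> $exposure")
  }
} finally {
  scanner.close()
  table.close()
}
```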
Environment: Java 1.6, Hadoop 2.2.0 (YARN), MapReduce, HBase 0.94.8, Storm 0.9.1, Linux CentOS 5.x, Agile, SVN, Maven, Jira.
Confidential, NY
Hadoop Developer
Responsibilities:
- Analyzed large data sets by running Hive queries and Pig scripts
- Worked with the Data Science team to gather requirements for various data mining projects
- Involved in creating Hive tables and loading and analyzing data using Hive queries (see the sketch after this list)
- Developed Simple to complex MapReduce Jobs using Hive and Pig
- Involved in running Hadoop jobs for processing millions of records of text data
- Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing
- Involved in loading data from Linux/UNIX file systems to HDFS
- Responsible for managing data from multiple sources
- Extracted files from CouchDB through Sqoop, placed them in HDFS and processed them
- Experienced in running Hadoop streaming jobs to process terabytes of XML-format data
- Loaded and transformed large sets of structured, semi-structured and unstructured data
- Assisted in exporting analyzed data to relational databases using Sqoop
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts
- Provided cluster coordination services through ZooKeeper.
- Experience in managing and reviewing Hadoop log files.
- Managed jobs using the Fair Scheduler.
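A minimal sketch of the Hive table creation and analysis work described in this list, using the Hive JDBC driver from Scala; the host, table and column names are hypothetical placeholders.

```scala
import java.sql.DriverManager

Class.forName("org.apache.hive.jdbc.HiveDriver")
val conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "", "")
val stmt = conn.createStatement()

// External table over data already landed in HDFS (e.g. via Sqoop or Flume).
stmt.execute(
  """CREATE EXTERNAL TABLE IF NOT EXISTS events (
    |  customer_id BIGINT,
    |  event_type  STRING,
    |  payload     STRING)
    |ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    |LOCATION '/data/events'""".stripMargin)

// A typical analysis query over the new table.
val rs = stmt.executeQuery(
  "SELECT event_type, COUNT(*) AS cnt FROM events GROUP BY event_type")
while (rs.next()) {
  println(s"${rs.getString("event_type")}: ${rs.getLong("cnt")}")
}

conn.close()
```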
Environment: Hadoop, HDFS, Pig, Hive, MapReduce, Sqoop, LINUX, Big Data, Zookeeper, Cloudera Distribution for Hadoop (CDH)
Confidential
J2EE Developer
Responsibilities:
- Implemented J2EE Design Patterns such as Business Delegate, Front Controller, MVC, Session Facade, Value Object, DAO, Service Locator, Singleton.
- Led root cause analysis activities to successfully identify the root causes of incidents.
- Assessed and estimated changes to be made in the application.
- Implemented the database connectivity with JDBC to the database on DB2.
- Implemented the server side processing using Java Servlets.
- Implemented LDAP and role-based application security.
- Maintained and supported the existing application.
- Provided production support using Remedy tickets.
- Created development and test environment in WebSphere 5.1 and Apache Tomcat 4.1 web server.
- Actively involved in the integration of different use cases, code reviews etc.
- Interacted with clients to gather requirements from end users.
- Customized the application on J2EE to suit user requirements.
- Performed enhancements on the application based on user requirements.
- Maintained the application using J2EE components like WebSphere.
- Queried the database using SQL for related results.
Environment: J2EE (Java 1.4, JSP), LDAP, DB2, WSAD 5.1
Confidential
J2EE Developer
Responsibilities:
- Involved in translating business domain concepts into Use Cases, Sequence Diagrams, Class Diagrams, Component Diagrams and Implementation Diagrams.
- Implemented various J2EE Design Patterns such as Model-View-Controller, Data Access Object, Business Delegate and Transfer Object.
- Responsible for analysis and design of the application based on MVC Architecture, using open source Struts Framework.
- Involved in configuring Struts, Tiles and developing the configuration files.
- Developed Struts Action classes and Validation classes using Struts controller component and Struts validation framework.
- Developed and deployed UI layer logic using JSP, XML, JavaScript and HTML/DHTML.
- Used Spring Framework and integrated it with Struts.
- Involved in configuring web.xml and struts-config.xml according to the Struts framework.
- Designed a lightweight model for the product using Inversion of Control principle and implemented it successfully using Spring IOC Container.
- Used transaction interceptor provided by Spring for declarative Transaction Management.
- Managed dependencies between classes through Spring's Dependency Injection to promote loose coupling between them (illustrated after this list).
- Developed DAOs using Spring's JdbcTemplate to run performance-intensive queries.
- Developed ANT script for auto generation and deployment of the web service.
- Wrote stored procedures and used Java APIs to call them.
- Developed various test cases such as unit tests, mock tests and integration tests using JUnit.
- Experience in writing Stored Procedures, Functions and Packages.
- Used log4j to perform logging in the applications.
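To illustrate the constructor-injection pattern behind the Spring IoC work above, here is a small sketch, kept in Scala for consistency with the other examples; the class names are hypothetical, and in the actual application the wiring was declared in Spring configuration rather than done by hand.

```scala
// The service depends only on the trait, never on a concrete DAO,
// so implementations can be swapped without touching the service.
trait AccountDao {
  def balance(accountId: Long): BigDecimal
}

class JdbcAccountDao extends AccountDao {
  // In the real application this ran a query via Spring's JdbcTemplate.
  override def balance(accountId: Long): BigDecimal = BigDecimal(0)
}

class AccountService(dao: AccountDao) { // dependency supplied from outside
  def statement(accountId: Long): String =
    s"account $accountId balance: ${dao.balance(accountId)}"
}

// The IoC container performs this wiring from configuration;
// done by hand, it is just constructor injection:
val service = new AccountService(new JdbcAccountDao)
println(service.statement(42L))
```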
Environment: Java, J2EE, Struts MVC, Tiles, JDBC, JSP, JavaScript, HTML, Spring IOC, Spring AOP, JAX-WS, Ant, WebSphere Application Server, Oracle, JUnit, Log4j, Eclipse.