Hadoop Developer Resume
Herndon, VA
PROFESSIONAL SUMMARY:
- 7+ years of IT experience, including 2 years on Big Data and the technologies in its ecosystem.
- 5 years of experience developing enterprise web applications using Java-related technologies.
- Hands-on experience in Big Data development using technologies such as HDFS, MapReduce, Pig, Hive, HBase, Sqoop and Flume.
- Experienced in ingesting data from RDBMS systems into Hadoop using Sqoop.
- Experienced in using Flume to ship flat files and log data from various streaming sources into Hadoop.
- Used Oozie extensively to create workflows for data movement and scheduling.
- Experienced in sizing Hadoop clusters based on workloads.
- Experienced in building custom tools and methods to validate ingested data.
- Expertise in Hive storage formats (Avro, Parquet, SequenceFile) and their performance characteristics.
- Performed benchmarking of HDFS and the YARN ResourceManager using TestDFSIO and TeraSort.
- Working experience ingesting data into and querying data out of HBase.
- Experienced in troubleshooting performance issues in YARN and HDFS.
- Experienced in configuring the YARN resource scheduler to control resource allocation across groups in the organization.
- Very good experience with Teradata data import/export processes and performance-tuning techniques.
- Good hands-on experience developing Hadoop applications on Spark, using Scala as a functional and object-oriented programming language.
- Experience implementing Spark with Scala and Spark SQL for faster analysis and processing of data.
- Experience in Spark, with in-depth knowledge of Spark SQL, RDDs, lazy transformations, and actions.
- Very good experience with UNIX commands and shell scripting.
- Expertise in using Core Java, Spring Core, Spring JDBC and Spring MVC to build end-to-end web applications.
- Worked on relational databases such as MySQL, Oracle and Teradata.
- Hands-on experience with Agile and Waterfall software development methodologies.
- Good team player with strong problem-solving, task organization and prioritization, and communication skills.
TECHNICAL SKILLS:
Hadoop / Big Data: HDFS, MapReduce, YARN, Hive, HBase, Pig, Sqoop, Flume, Oozie, Tez, ZooKeeper, Kafka, Solr, Spark Streaming, Avro, RC, ORC, Parquet and Ambari
NoSQL Databases: HBase
Languages: Java, Scala
Scripting/Query: Shell Scripting, SQL
IDEs: Eclipse, Spring Tool Suite, IntelliJ IDEA
Web / Application Servers: Apache Tomcat, WebSphere, WebLogic and JBoss EAP
Version Control: GIT, SVN
Database: Oracle 10g/9i/8i, DB2, MySQL, Teradata
Operating Systems: Windows, Linux, UNIX family, Mac OS X
PROFESSIONAL EXPERIENCE:
Hadoop Developer
Confidential, Herndon, VA
Environment: Hortonworks Data Platform 2.3, HDFS, MapReduce, YARN, Pig, Hive, Tez, Sqoop, Flume, Spark, ZooKeeper, Oozie, Ambari, Scala, Oracle, Hue, Beeline, HBase, Java, Tableau, Linux Shell Scripting, JIRA, Maven, SVN, JUnit, MRUnit, Eclipse, Windows and UNIX.
Responsibilities:
- Participated in brainstorming sessions with teams to understand the ingestion requirements.
- Worked with data analyst teams to understand the data scrubbing needs for security reasons.
- Designed initial-load and incremental-load strategies to pull data from RDBMS systems.
- Worked on Sqoop scripts to pull data from RDBMS systems and made them ready for production rollout (see the Sqoop sketch after this list).
- Participated in brainstorming sessions to understand the log ingestion requirements.
- Participated in the design of Flume source/sink usage patterns for data ingestion.
- Worked on Flume to bring flat files and server logs into the data lake (see the Flume sketch after this list).
- Participated in the evaluation of Hive storage formats.
- Worked extensively on Hive data stores in text, Avro and RCFile storage formats.
- Worked on populating analytical data stores for the data science team.
- Created Hive external tables on top of flat files stored in HDFS.
- Created tools in Java for performing balance tests on the ingested data.
- Worked with architects to build efficient Oozie workflows with coordinators.
- Performance-tested source systems while the ingestion process was running.
- Worked on performance tuning of Hive queries through partitioning and bucketing (see the Hive sketch after this list).
- Used Spark SQL to process large volumes of structured data, and implemented Spark RDD transformations and actions to migrate MapReduce algorithms.
- Used Tableau for data visualization and report generation.
- Developed Spark code using Scala and Spark SQL for faster testing and data processing (see the Spark sketch after this list).
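For illustration, a minimal sketch of the initial-load/incremental-load Sqoop pattern described above; the connection string, credentials, table and column names are all hypothetical:

    # Initial full load of a hypothetical ORDERS table into HDFS as Avro
    sqoop import \
      --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
      --username etl_user --password-file /user/etl/.db.pwd \
      --table ORDERS \
      --target-dir /data/raw/orders \
      --as-avrodatafile -m 4

    # Saved incremental job: Sqoop tracks the last UPDATED_AT value itself
    sqoop job --create orders_incr -- import \
      --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
      --username etl_user --password-file /user/etl/.db.pwd \
      --table ORDERS \
      --target-dir /data/raw/orders \
      --incremental lastmodified \
      --check-column UPDATED_AT \
      --merge-key ORDER_ID \
      -m 4
    sqoop job --exec orders_incr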
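A minimal Flume agent sketch in the spirit of the flat-file/log ingestion above; the agent name, directories and roll settings are hypothetical:

    # agent1: spool a landing directory into HDFS through a durable file channel
    agent1.sources = src1
    agent1.channels = ch1
    agent1.sinks = sink1

    agent1.sources.src1.type = spooldir
    agent1.sources.src1.spoolDir = /var/spool/flume/incoming
    agent1.sources.src1.channels = ch1

    agent1.channels.ch1.type = file
    agent1.channels.ch1.checkpointDir = /var/flume/checkpoint
    agent1.channels.ch1.dataDirs = /var/flume/data

    agent1.sinks.sink1.type = hdfs
    agent1.sinks.sink1.channel = ch1
    agent1.sinks.sink1.hdfs.path = /data/raw/logs/%Y-%m-%d
    agent1.sinks.sink1.hdfs.useLocalTimeStamp = true
    agent1.sinks.sink1.hdfs.fileType = DataStream
    agent1.sinks.sink1.hdfs.rollInterval = 300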
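A minimal HiveQL sketch of the external-table and partitioning/bucketing work above; the table names, columns and bucket count are illustrative:

    -- External table over raw delimited files landed in HDFS
    CREATE EXTERNAL TABLE orders_raw (
      order_id    BIGINT,
      customer_id BIGINT,
      amount      DECIMAL(12,2),
      updated_at  STRING
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    STORED AS TEXTFILE
    LOCATION '/data/raw/orders';

    -- Partitioned, bucketed analytical store: partition pruning and bucketing
    -- limit the data scanned per query
    CREATE TABLE orders (
      order_id    BIGINT,
      customer_id BIGINT,
      amount      DECIMAL(12,2)
    )
    PARTITIONED BY (load_date STRING)
    CLUSTERED BY (customer_id) INTO 32 BUCKETS
    STORED AS ORC;

    SET hive.exec.dynamic.partition = true;
    SET hive.exec.dynamic.partition.mode = nonstrict;
    INSERT OVERWRITE TABLE orders PARTITION (load_date)
    SELECT order_id, customer_id, amount, to_date(updated_at)
    FROM orders_raw;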
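A minimal Scala sketch, consistent with the Spark 1.x releases bundled with HDP 2.3, of querying the Hive store via Spark SQL and re-expressing a MapReduce-style aggregation as RDD transformations; the table and column names are hypothetical:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    object OrderStats {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("OrderStats"))
        val hc = new HiveContext(sc)

        // Spark SQL against the Hive store
        val orders = hc.sql(
          "SELECT customer_id, amount FROM orders WHERE load_date = '2015-10-01'")

        // Lazy RDD transformations replacing a MapReduce sum-per-key job
        val totals = orders.rdd
          .map(r => (r.getLong(0), r.getDecimal(1).doubleValue))
          .reduceByKey(_ + _)

        // Actions trigger execution
        totals.take(10).foreach(println)
        sc.stop()
      }
    }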
Hadoop Developer
Confidential, Kansas City, MO
Environment: Hadoop, MapReduce, HDFS, Hive, Java (JDK 1.6), Hortonworks Hadoop distribution, PL/SQL, SQL, Toad 9.6, Windows NT, UNIX Shell Scripting.
Responsibilities:
- Worked on Hadoop cluster implementation and data integration as part of developing large-scale system software.
- Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
- Captured data from existing databases that provide SQL interfaces using Sqoop.
- Worked extensively with Sqoop to import and export data between HDFS and the Oracle relational database, loading data into HDFS.
- Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
- Enabled speedy reviews and first-mover advantages by using Oozie to automate data loading into HDFS and Pig to pre-process the data.
- Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX systems, NoSQL stores and a variety of portfolios.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing (see the MapReduce sketch after this list).
- Handled structured and unstructured data and applied ETL processes.
- Developed Oozie workflows to automate loading data into HDFS and pre-processing it with Pig (see the workflow sketch after this list).
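A minimal sketch of a map-only data-cleaning job of the kind described above, in Java; the record layout, field count and paths are illustrative:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class CleanRecordsJob {

      // Drops malformed rows; passes trimmed records straight through
      public static class CleanMapper
          extends Mapper<LongWritable, Text, NullWritable, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
            throws IOException, InterruptedException {
          String[] fields = value.toString().split(",", -1);
          if (fields.length != 4 || fields[0].length() == 0) {
            return; // skip bad records
          }
          ctx.write(NullWritable.get(), new Text(value.toString().trim()));
        }
      }

      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "clean-records");
        job.setJarByClass(CleanRecordsJob.class);
        job.setMapperClass(CleanMapper.class);
        job.setNumReduceTasks(0); // map-only job
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }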
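A minimal Oozie workflow sketch chaining an HDFS load with a Pig pre-processing step, as described above; the property names, paths and Pig script are hypothetical:

    <!-- workflow.xml: move landed files into the raw zone, then clean with Pig -->
    <workflow-app name="daily-ingest" xmlns="uri:oozie:workflow:0.4">
      <start to="load-to-hdfs"/>

      <action name="load-to-hdfs">
        <fs>
          <move source="${stagingDir}/orders" target="${rawDir}/orders"/>
        </fs>
        <ok to="pig-preprocess"/>
        <error to="fail"/>
      </action>

      <action name="pig-preprocess">
        <pig>
          <job-tracker>${jobTracker}</job-tracker>
          <name-node>${nameNode}</name-node>
          <script>clean_orders.pig</script>
          <param>INPUT=${rawDir}/orders</param>
          <param>OUTPUT=${cleanDir}/orders</param>
        </pig>
        <ok to="end"/>
        <error to="fail"/>
      </action>

      <kill name="fail">
        <message>Ingest failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
      </kill>
      <end name="end"/>
    </workflow-app>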
Sr. Java Developer
Confidential, Dallas, TX
Environment: Java 1.5, JDK, Servlets, Applets, JavaScript, JDBC, HTML, Oracle 9i, WebSphere
Responsibilities:
- Designed and developed Java applets to download logs.
- Developed a file upload utility using Servlets and HTML to upload files into the database.
- Designed and developed a web-based application to upload users' login information.
- Wrote JavaScript extensively for form and other validations, as well as for populating data.
- Used JDBC and SQL to access the database (see the JDBC sketch after this list).
- Involved in developing stored procedures and SQL queries to interact with the backend.
- Involved in integration and system testing.
- Involved in various software life-cycle activities, from requirements analysis to releasing software to production servers.
- Used Rational ClearCase for version control.
- Supported manual testing teams and fixed issues identified by them.
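A minimal sketch of the JDBC/stored-procedure access pattern above, in Java 1.5 style (no try-with-resources); the DataSource, procedure name and parameters are hypothetical:

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.SQLException;
    import java.sql.Types;
    import javax.sql.DataSource;

    public class LoginDao {

      private final DataSource dataSource; // container-managed pool (e.g. WebSphere)

      public LoginDao(DataSource dataSource) {
        this.dataSource = dataSource;
      }

      // Calls a hypothetical CHECK_USER procedure and reads its OUT parameter
      public boolean isValidUser(String userId) throws SQLException {
        Connection con = dataSource.getConnection();
        try {
          CallableStatement cs = con.prepareCall("{call CHECK_USER(?, ?)}");
          cs.setString(1, userId);
          cs.registerOutParameter(2, Types.INTEGER);
          cs.execute();
          return cs.getInt(2) > 0;
        } finally {
          con.close(); // returns the connection to the pool
        }
      }
    }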
Java Developer
Confidential
Environment: HTML, CSS, JavaScript, Servlets, JSPs, Spring Core, Spring JDBC, Spring MVC, POJO, SQL, JDBC, CVS, Bugzilla, VersionOne, JBoss AS, Eclipse IDE, Windows, Linux.
Responsibilities:
- Worked on the presentation layer using the Spring MVC framework.
- Used JSP pages to build dynamic web pages.
- Used Spring Core to wire beans via dependency injection.
- Developed POJOs and Java beans to implement business logic.
- Managed data to and from the database through JDBC connections with Spring's JDBC abstraction.
- Used Spring JDBC to write DAO classes that interact with the database to access account information (see the DAO sketch after this list).
- Involved in creation of tables and indexes and wrote complex SQL queries.
- Worked on tuning query performance based on performance test results.
- Used Git as version control system to manage the progress of the project.
- Used the JUnit framework for unit testing of the application.
- Handled requirements and worked in an agile process.
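A minimal sketch of a Spring JDBC DAO of the kind described above; the accounts table, its columns and the Account POJO are hypothetical:

    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.List;
    import javax.sql.DataSource;

    import org.springframework.jdbc.core.JdbcTemplate;
    import org.springframework.jdbc.core.RowMapper;

    public class AccountDao {

      private final JdbcTemplate jdbc;

      public AccountDao(DataSource dataSource) {
        this.jdbc = new JdbcTemplate(dataSource);
      }

      // Maps each row of a parameterized query onto an Account POJO
      public List<Account> findByCustomer(long customerId) {
        return jdbc.query(
            "SELECT account_id, balance FROM accounts WHERE customer_id = ?",
            new Object[] { Long.valueOf(customerId) },
            new RowMapper<Account>() {
              public Account mapRow(ResultSet rs, int rowNum) throws SQLException {
                return new Account(rs.getLong("account_id"), rs.getDouble("balance"));
              }
            });
      }
    }

    // Simple POJO carrying one account row
    class Account {
      private final long accountId;
      private final double balance;

      public Account(long accountId, double balance) {
        this.accountId = accountId;
        this.balance = balance;
      }

      public long getAccountId() { return accountId; }
      public double getBalance() { return balance; }
    }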