Hadoop Developer Resume
Winston Salem, NC
SUMMARY:
- 8+ years of experience in the software development life cycle, including design, development, implementation, testing, and deployment of Data Warehouse and Big Data processing applications.
- 2+ years of experience in deploying Hadoop ecosystem components such as MapReduce, YARN, Sqoop, Flume, Pig, Hive, HBase, ZooKeeper, Oozie, Spark, Storm, Impala, and Kafka.
- Performed Extract, Transform and Load (ETL) using Hive on large volumes of data in the Data Lake.
- Managed big data ingestion and transformation using Pig, Hive, and HBase.
- Expertise in analyzing data using Hive and writing custom UDFs in Java to extend Hive and Pig core functionality (see the UDF sketch after this list).
- Involved in creating Hive managed and external tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Knowledge of Storm, Scala, and Spark.
- Knowledge of developing Apache Spark programs in Scala for large-scale data processing, using in-memory computing with Spark Core and Spark SQL for faster processing.
- Worked extensively with Pig Latin and Hive queries to generate daily, weekly, monthly, and ad hoc reports on large data sets; integrated a BI tool with Hive tables to visualize reports.
- Experienced in writing MapReduce programs to process semi-structured data using Pig and Java.
- Implemented Sqoop to transfer large data sets to Hadoop from Oracle and Teradata databases.
- Proficient in SQL on Oracle, DB2, and SQL Server; also experienced with the MicroStrategy BI reporting tool.
- Hands-on experience in designing and coding web applications using Core Java.
- Worked closely with data scientists to find solutions to business problems using predictive analytics, pattern matching, and data classification.
- Extensively used ETL methodology to support data extraction, transformation, and loading, using Sqoop.
- Excellent technical and analytical skills with a clear understanding of ETL design and project architecture based on reporting requirements.
- Worked as a Data Analyst and liaison between BA, development, and test teams, helping them understand the data.
- Excellent track record of working closely with multiple groups and interacting with clients.
- Excellent communication, decision-making, and organizational skills, along with strong analytical and problem-solving skills for challenging assignments. Able to work well independently and in a team, helping to troubleshoot technology and business problems.
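A minimal, hypothetical sketch of the kind of custom Hive UDF in Java mentioned above; the class name and the normalization rule are illustrative assumptions, not code from a specific project.

```java
// Hypothetical Hive UDF sketch: strips non-digit characters so values such as
// phone numbers compare consistently in Hive queries. Class name and rule are
// assumptions for illustration only.
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class NormalizePhone extends UDF {
    // Hive calls evaluate() once per row; null input yields null output.
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        String digitsOnly = input.toString().replaceAll("[^0-9]", "");
        return new Text(digitsOnly);
    }
}
```

A UDF like this would typically be packaged as a JAR, added to the Hive session with ADD JAR, and registered with CREATE TEMPORARY FUNCTION before being used in queries.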
TECHNICAL SKILLS:
Big Data: Hadoop, HDFS, MapReduce, HBase, Pig, Hive, Sqoop, Flume, Oozie, ZooKeeper, Spark, Storm, Impala, and Kafka.
Operating Systems: Windows NT/XP, UNIX, LINUX.
Programming Languages: SQL, PL/SQL, C, C++, Linux Shell scripts.
Java & J2EE Technologies: Core Java, Servlets, JSP, JDBC.
Databases: Oracle 10g, MySQL, DB2, MS-SQL Server, Teradata.
ETL/BI Tools: Informatica, MicroStrategy 9.6
PROFESSIONAL EXPERIENCE:
Confidential, Winston Salem, NC
Hadoop developer
Responsibilities:
- Involved in the complete software development life cycle (SDLC) to develop the application.
- Involved in end-to-end data processing, including ingestion, processing, quality checks, and splitting.
- Worked on analyzing the Hadoop cluster and different big data analytics tools, including Pig, HBase, Hive, MapReduce, and Sqoop.
- Involved in importing data from the Linux file system into HDFS.
- Managed and reviewed Hadoop log files.
- Imported and exported data into HDFS and Hive using Sqoop.
- Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop.
- Created Pig Latin scripts to sort, group, join, and filter the enterprise-wide data.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing (a minimal sketch follows this list).
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Supported MapReduce programs running on the cluster.
- Analyzed large data sets by running Hive queries and Pig scripts.
- Worked on tuning the performance of Pig queries.
- Mentored the analyst and test teams in writing Hive queries.
- Used Kafka to collect website activity data and for stream processing.
- Used Flume and Kafka to aggregate log data into HDFS.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Installed and configured Hadoop and the Hadoop ecosystem.
- Installed the Oozie workflow engine to run multiple MapReduce jobs.
- Worked with application teams to install operating system and Hadoop updates, patches, and Kafka version upgrades as required.
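A minimal sketch of the kind of data-cleaning MapReduce job referenced in this role; the delimiter, expected column count, and job/class names are assumptions for illustration.

```java
// Hypothetical map-only cleaning pass: drops malformed pipe-delimited records
// and trims whitespace from each field before writing the record back out.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CleanRecordsJob {

    public static class CleanMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\|", -1);
            // Keep only rows with the expected number of columns (assumed 8 here).
            if (fields.length != 8) {
                return;
            }
            StringBuilder cleaned = new StringBuilder();
            for (int i = 0; i < fields.length; i++) {
                if (i > 0) {
                    cleaned.append('|');
                }
                cleaned.append(fields[i].trim());
            }
            context.write(NullWritable.get(), new Text(cleaned.toString()));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "clean-records");
        job.setJarByClass(CleanRecordsJob.class);
        job.setMapperClass(CleanMapper.class);
        job.setNumReduceTasks(0); // map-only cleaning pass, no shuffle needed
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```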
Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Sqoop, Linux, Java, Oozie, HBase, Kafka, NoSQL, Eclipse, Oracle, Teradata, Data Lake, EDW.
Confidential, Columbus, GA
Hadoop Data Analyst/Developer
Responsibilities:
- Involved in end-to-end data processing, including ingestion, processing, quality checks, and splitting.
- Brought data into the Data Lake using Pig, Sqoop, and Hive.
- Wrote a MapReduce job for change data capture (CDC) on HBase (see the sketch after this list).
- Created Hive ORC and external tables.
- Refined terabytes of data from different sources and created Hive tables.
- Developed MapReduce jobs for data cleaning and preprocessing.
- Imported and exported data between HDFS/Hive and Oracle and Teradata databases using Sqoop.
- Responsible for managing data coming from different sources.
- Monitored running MapReduce jobs on the cluster using Oozie.
- Responsible for loading data from UNIX file systems into HDFS.
- Installed and configured Hive and wrote Hive UDFs.
- Wrote Pig scripts to process unstructured data and create structured data for use with Hive.
- Wrote the Oozie workflow to coordinate the Hadoop jobs.
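A hypothetical sketch of the CDC-to-HBase pattern described above: delta records landed in HDFS are parsed and upserted into an HBase table as Puts. The table name, column family, and pipe-delimited record layout are illustrative assumptions.

```java
// Hypothetical change-data-capture load: a map-only job that converts delta
// records into HBase Puts, routed to the target table via TableOutputFormat.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class HBaseCdcLoad {

    public static class CdcMapper
            extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
        private static final byte[] CF = Bytes.toBytes("d"); // assumed column family

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Assumed delta layout: rowKey|lastUpdated|payload
            String[] parts = value.toString().split("\\|", 3);
            if (parts.length < 3) {
                return; // skip malformed delta records
            }
            byte[] rowKey = Bytes.toBytes(parts[0]);
            Put put = new Put(rowKey);
            put.addColumn(CF, Bytes.toBytes("last_updated"), Bytes.toBytes(parts[1]));
            put.addColumn(CF, Bytes.toBytes("payload"), Bytes.toBytes(parts[2]));
            context.write(new ImmutableBytesWritable(rowKey), put);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = Job.getInstance(conf, "hbase-cdc-load");
        job.setJarByClass(HBaseCdcLoad.class);
        job.setMapperClass(CdcMapper.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        // Write mapper output straight into the target HBase table; no reducer needed.
        TableMapReduceUtil.initTableReducerJob("customer_cdc", null, job);
        job.setNumReduceTasks(0);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Because HBase Puts are idempotent on row key and column, re-applying the same delta simply overwrites the cell, which is what makes this pattern convenient for CDC upserts.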
Environment: Sqoop, Pig, Hive, MapReduce, Java, Oozie, Eclipse, Linux, Oracle, Teradata.
Confidential, Charlotte, NC
EDW Data Analyst
Responsibilities:
- Worked as a Data Analyst: collected data from upstream systems, then identified data issues, mappings, and scenarios for loading into the target database.
- Worked with business users and business analysts on requirements gathering and business analysis.
- Converted business requirements into high-level and low-level designs.
- Analyzed and understood business users' requirements and provided solutions.
- Designed data model changes for the target tables.
- Extracted data from upstream systems, then transformed and loaded it into target tables using the DataStage ETL tool.
- Coordinated with the technical team and the business users.
- Helped BA, development, and QA teams understand the requirements and the nature of the data.
- Distributed work among offshore team members and tracked development progress.
- Discussed the technical approach to development with the onsite and offshore team leads.
- Reviewed code developed offshore.
Environment: DataStage, UNIX, Oracle, Teradata, FitNesse automation tool, Teradata BTEQ automation.
Confidential, Reston, VA
Data Analyst
Responsibilities:
- Responsible for designing, building, and supporting the components of the data warehouse, such as ETL processes, databases, and reporting environments.
- Designed dimensional models with conformed dimensions, following the business's processes.
- Debugged the legacy application and provided mapping documents to the ETL development and test teams.
- Analyzed data and trends and provided data patterns to the data modeler.
- Assisted the QA teams in verifying that the code met requirements and helped them with data research issues using SQL queries.
- Created processes for capturing and maintaining metadata.
- Understood the company's business processes and applications, including their effect on data and reporting, and applied that knowledge in designing data warehouses and reports.
- Identified and documented all requirements for both new and existing data warehouse components and reports by working with end users.
Environment: DataStage, UNIX, DB2, Teradata, MicroStrategy 9i, Windows, WinSCP, Toad.
Confidential
Oracle PL/SQL Developer
Responsibilities:
- Reviewed and analyzed functional specifications.
- Interacted with the functional specification owner on various functional issues during development.
- Developed technical specifications.
- Developed the Bank Statements Load Interface program.
- Developed validation scripts in PL/SQL for importing invoices from the legacy system into the Oracle Payables module.
- Developed the GL Daily Conversion Rates program per client requirements.
- Developed and customized reports.
- Prepared test cases for developed components.
- Wrote validation scripts in PL/SQL for importing item category assignments from the legacy system into the Oracle Inventory module.
- Developed the Item Outbound Interface per client requirements.
- Wrote validation scripts in PL/SQL for importing invoices from the legacy system into the AP module.
- Registered concurrent programs for developed reports and interfaces.
- Created request sets for developed interfaces.
- Built technical design documents.
- Developed an interface to load data from legacy systems into the Oracle 11i system.
Environment: Oracle 10g, SQL Developer, UNIX, Windows, Winscp.