
Hadoop Developer Resume Profile


SUMMARY:

  • IT professional with over 8 years of experience in system analysis, design, and development, including extensive experience with IBM DataStage v8.7/8.1/8.0/7.x components such as Administrator, Manager, Designer, and Director.
  • Over seven years in data warehousing, data integration, and data migration using IBM WebSphere DataStage, Oracle, PL/SQL, DB2 UDB, SQL Server 2000/2005, SQL procedural language, and shell scripts.
  • Around seven years of experience with ETL methodologies across all phases of the data warehousing life cycle.
  • Experience in data ingestion into HDFS using Hadoop ecosystem tools (Sqoop, Flume) and in data transformation/analysis using Pig and Hive.
  • In-depth knowledge of data warehousing and business intelligence concepts, with emphasis on ETL and full life-cycle development including requirements analysis, design, development, testing, and implementation.
  • Expertise in all phases of the system development life cycle (SDLC) using methodologies such as Agile and Waterfall.
  • Strong grasp of data warehousing fundamentals, proven ability to implement them, and fluency in ETL processes.
  • Worked with SQL, SQL*Plus, Oracle PL/SQL, stored procedures, table partitions, triggers, SQL queries, PL/SQL packages, and loading data into data warehouses/data marts.
  • Excellent experience with major RDBMSs, including Oracle 10g/9i/8.x, SQL Server 7.0/6.5, and DB2 8.1/9.0.
  • Extensively used DataStage Designer to design and develop server and PX jobs that migrate data from transactional systems (Sybase, DB2 UDB) into the data warehouse.
  • Extensively used DataStage Manager to export/import DataStage job components and to import plug-in table definitions from DB2 UDB, Oracle, and Sybase databases.
  • Designed server jobs, job sequencers, batch jobs, and parallel jobs; handled multiple work streams within a project.
  • Experience in writing UNIX shell scripts for purposes such as file validation, ETL process automation, and job scheduling via crontab (see the sketch after this summary).
  • Designed parallel jobs using stages such as Join, Merge, Lookup, Remove Duplicates, Filter, Data Set, Lookup File Set, Modify, Aggregator, CFF, Transformer, XML, and MQ plug-in stages.
  • Good experience with extraction, transformation, and loading (ETL) processes using the DataStage ETL tool, Parallel Extender, MetaStage, QualityStage, and ProfileStage.
  • Developed server jobs using stage types such as Sequential File, ODBC, Hashed File, Aggregator, Transformer, Sort, Link Partitioner, and Link Collector.
  • Experience integrating data sources such as Oracle, Teradata, DB2, SQL Server, MS Access, and flat files into the staging area; worked extensively with materialized views and TOAD.
  • Proven track record of troubleshooting DataStage jobs and addressing production issues, including performance tuning and enhancements.
  • Excellent knowledge of analyzing data dependencies using metadata stored in the repository and of preparing batches for existing sessions to facilitate scheduling multiple sessions.
  • Excellent analytical, problem-solving and communication skills
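
A minimal sketch of the crontab scheduling pattern referenced in the summary above; the script paths, run times, and log locations are illustrative placeholders, not actual project values:

    # Hypothetical crontab entries: nightly ETL wrapper at 2:30 AM and a
    # file-validation check at 6:00 AM, with output appended to log files.
    30 2 * * * /apps/etl/bin/run_nightly_load.sh >> /var/log/etl/nightly.log 2>&1
    0 6 * * * /apps/etl/bin/validate_files.sh >> /var/log/etl/validate.log 2>&1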

TECHNICAL SKILLS:

  • Data Warehousing: IBM DataStage 8.7/8.1/8.0/7.5.3/7.5.2/7.5.1 (Designer, Director, Manager, Administrator), Parallel Extender, Server Edition, MVS Edition, QualityStage, ETL, OLAP, OLTP, SQL*Plus, Business Glossary, FastTrack, Information Analyzer, Metadata Workbench
  • Business Intelligence: OBIEE, Crystal Reports, Hyperion, Cognos 7.0
  • Databases: Oracle 10g/9i/8i/8.0/7.0, DB2 UDB 7.2/8.1/9.0, Mainframe, Teradata V2R6/13, MS SQL Server 2005/2008
  • Tools: SQL*Loader, data version control, Autosys, Excel, Control-M, TOAD, SQL Navigator 6.5
  • Programming: SQL, PL/SQL, C, C++, VB, XML, Java, J2EE, DOS, COBOL, UNIX shell scripting (Korn), Perl scripting, Python
  • Analysis & Design: Agile, Rational Unified Process (RUP), UML, Waterfall
  • Others: HTML 4.0, MS Excel, MS Office
  • Environments: Sun Solaris, IBM AIX 5.3/5.2/4.2, MS-DOS 6.22, Windows 2000, Windows NT 4.0, Windows XP

PROFESSIONAL EXPERIENCE:

Confidential

Sr. ETL Developer/Hadoop Developer

Responsibilities:

  • Created a process to pull data from existing applications and land it on Hadoop.
  • Used Sqoop to pull data from source databases such as the Oracle RMS database and the DB2 ECOM database (see the sketch after this list).
  • Created Hive tables on top of the data extracted from the source systems.
  • Partitioned the Hive tables according to the load type.
  • Created Hive tables presenting the current snapshot of the source data.
  • Created DataStage jobs to load data from the ECOM database into the ODS and on to the business intelligence layer.
  • Developed a solution of generic DataStage jobs to load 300 source tables into the current ODS layer (Netezza).
  • Created reusable DataStage components to pull data from different source systems into the ODS.
  • Developed jobs to load 29 dimension and 10 fact tables related to ECOM into the business intelligence layer.
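
A minimal sketch of this Sqoop-to-Hive ingestion pattern, assuming an Oracle source and an external, date-partitioned landing table; the connection string, table names, and HDFS paths are hypothetical:

    #!/bin/sh
    # Pull one RMS table from Oracle into HDFS, then register the landed
    # files as a new partition of an external Hive table.
    LOAD_DT=$(date +%Y-%m-%d)

    sqoop import \
      --connect jdbc:oracle:thin:@//oracle-host:1521/RMS \
      --username etl_user --password-file /user/etl/.oracle_pwd \
      --table RMS.ITEM_MASTER \
      --target-dir /data/landing/rms/item_master/load_dt=${LOAD_DT} \
      -m 4

    hive -e "ALTER TABLE landing.item_master
             ADD IF NOT EXISTS PARTITION (load_dt='${LOAD_DT}')
             LOCATION '/data/landing/rms/item_master/load_dt=${LOAD_DT}';"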

Environment: Hadoop, MapReduce, HDFS, Hive, Java (JDK 1.6), IBM BigInsights, IBM InfoSphere DataStage 8.7, UC4, shell scripts, Windows XP, UNIX, Netezza, Oracle, SQL Server 2008, PL/SQL.

Confidential

Hadoop Developer

Responsibilities:

  • Created a process to pull data from existing applications and land it on Hadoop.
  • Worked in an Agile environment; participated in sprint planning, grooming, and daily stand-up meetings.
  • Met with application owners to define and plan the Sqoop extractions from source systems.
  • Used Sqoop to pull data from source databases such as Teradata, DB2, and MS SQL Server.
  • Created Hive tables on top of the data extracted from the source systems.
  • Created Hive and Pig UDFs in Java for data transformations and date conversions.
  • Partitioned the Hive tables according to the load type.
  • Worked with Avro and SequenceFile formats.
  • Created MapReduce programs for data transformations.
  • Created Pig scripts for data transformations.
  • Created Datameer links for data visualization.
  • Assisted the business in validating and analyzing the data.
  • Created shell wrapper scripts for the Sqoop, Hive, and MapReduce jobs (see the first sketch after this list).
  • Deployed and scheduled the tested Sqoop, Hive, and Datameer jobs using Autosys.
  • Managed and reviewed Hadoop log files.
  • Created workflows using Oozie (see the second sketch after this list).
  • Good understanding of Hadoop architecture and of NoSQL databases (Cassandra and HBase).
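
A minimal sketch of the shell wrapper pattern for the Sqoop and Hive steps, with basic failure handling; the Teradata connection string (which assumes the Teradata JDBC driver is on the Sqoop classpath), table names, paths, and log locations are hypothetical:

    #!/bin/sh
    # Wrapper: import one table from Teradata, then add the landed files
    # as a Hive partition; on any failure, report and exit non-zero.
    TBL=$1
    LOAD_DT=$(date +%Y-%m-%d)
    LOG=/var/log/etl/${TBL}_${LOAD_DT}.log

    {
      sqoop import \
        --connect jdbc:teradata://td-host/DATABASE=EDW \
        --username etl_user --password-file /user/etl/.td_pwd \
        --table "${TBL}" \
        --target-dir /data/landing/edw/${TBL}/load_dt=${LOAD_DT} -m 8 &&
      hive -e "ALTER TABLE landing.${TBL}
               ADD IF NOT EXISTS PARTITION (load_dt='${LOAD_DT}')
               LOCATION '/data/landing/edw/${TBL}/load_dt=${LOAD_DT}';"
    } >> "${LOG}" 2>&1 || { echo "${TBL} load failed, see ${LOG}"; exit 1; }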
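
And a sketch of submitting one of the Oozie workflows from the shell; the Oozie URL, HDFS application path, and host names are hypothetical:

    # Write a minimal job.properties; Oozie resolves ${nameNode} itself,
    # so the heredoc is quoted to prevent shell expansion.
    cat > job.properties <<'EOF'
    nameNode=hdfs://nn-host:8020
    jobTracker=jt-host:8021
    oozie.wf.application.path=${nameNode}/apps/etl/sqoop_hive_wf
    EOF

    oozie job -oozie http://oozie-host:11000/oozie -config job.properties -run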

Environment: Hadoop, MapReduce, HDFS, Hive, Java (JDK 1.6), Pig, Datameer, UNIX, shell scripting, Teradata, DB2, MySQL, Autosys, Oozie.

Confidential

Sr. DataStage Consultant / Hadoop Developer

Hadoop Experience:

  • Imported and exported data into HDFS and Hive using Sqoop.
  • Managed and reviewed Hadoop log files.
  • Created Hive/Pig components for converting fixed-length ASCII files into Hive tables (see the sketch after this list); loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Managed data coming from different sources.
  • Supported MapReduce programs running on the cluster.
  • Gained an understanding of cluster coordination services through ZooKeeper.
  • Loaded data from the UNIX file system into HDFS.
  • Good understanding of Hive installation and configuration, as well as Hive UDFs.
  • Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
  • Automated all jobs that pull data from the FTP server and load it into Hive tables, using Oozie workflows.
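
A minimal sketch of exposing a fixed-length ASCII file as a Hive table, assuming Hive's built-in RegexSerDe is used to slice the fixed-width record; the field widths, column names, and paths are hypothetical:

    #!/bin/sh
    # Land the extract on HDFS, then define an external table whose SerDe
    # splits each record into 10-, 8-, and 12-byte fields.
    hadoop fs -put /staging/acct_extract.dat /data/raw/accounts/

    hive -e "
      CREATE EXTERNAL TABLE IF NOT EXISTS raw.accounts (
        acct_id STRING,
        open_dt STRING,
        balance STRING
      )
      ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
      WITH SERDEPROPERTIES ('input.regex' = '(.{10})(.{8})(.{12}).*')
      STORED AS TEXTFILE
      LOCATION '/data/raw/accounts';"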

ETL Responsibilities:

  • Involved in requirements gathering, analysis, and study of existing systems.
  • Prepared technical designs/specifications for data extraction, transformation, and loading.
  • Led a team of four developers, participated in daily scrum meetings, and created ETL solutions for complex business requirements.
  • Wrote stored procedures, functions, and packages to modify and load data and to create extracts.
  • Wrote Teradata MultiLoad, FastLoad, FastExport, and BTEQ scripts for loading, modifying, and exporting data (see the first sketch after this list).
  • Extensively used DataStage Designer to develop jobs that extract, cleanse, transform, integrate, and create extract files as needed.
  • Developed complex Teradata SQL involving many tables and calculating summary values as needed.
  • Used Information Analyzer for column analysis and wrote data rules for quality checks.
  • Used Business Glossary and FastTrack for ETL mapping and to link business terms with technical terms and solutions.
  • Used Metadata Workbench for impact analysis of the existing data model.
  • Worked as a part-time administrator: DataStage configuration, ODBC connection creation, assigning user roles, monitoring the system, and killing processes when needed.
  • Performed general cleanup and maintenance of the DataStage server.
  • Developed a generic shell script to initiate WMQ FTE (wmqfte) file transfers between two servers (see the second sketch after this list).
  • Scheduled the jobs with the Autosys scheduler, which triggered ETL jobs and invoked the WMQ FTE shell scripts to initiate file transfers between the two servers.
  • Wrote JIL scripts to create Autosys jobs that trigger the ETL jobs and shell scripts.
  • Created technical specification documents for the DataStage jobs, developed several test plans, and maintained error logs/audit trails.
  • Implemented performance-tuning techniques across the stages of the ETL process.
  • Followed up the deployment process of DataStage code migration across the development, test, and production environments with the admin team.
  • Coordinated with client managers, business architects, and data architects for sign-offs on data models, ETL design docs, testing docs, migrations, and end-user review specs.
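
A minimal sketch of the Teradata load pattern, using a FastLoad script driven from a shell heredoc; the host, credentials, file name, and table names are hypothetical:

    #!/bin/sh
    # FastLoad a pipe-delimited extract into an empty staging table,
    # with two error tables for rejected rows.
    fastload <<'EOF'
    LOGON tdprod/etl_user,etl_password;
    SET RECORD VARTEXT "|";
    DEFINE acct_id (VARCHAR(10)),
           open_dt (VARCHAR(8)),
           balance (VARCHAR(12))
           FILE = /staging/acct_extract.dat;
    BEGIN LOADING STG.ACCT_STG ERRORFILES STG.ACCT_ERR1, STG.ACCT_ERR2;
    INSERT INTO STG.ACCT_STG (acct_id, open_dt, balance)
      VALUES (:acct_id, :open_dt, :balance);
    END LOADING;
    LOGOFF;
    EOF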
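
And a sketch of the generic WMQ FTE transfer wrapper plus the Autosys JIL that schedules it; the agent, queue manager, host, job, and path names are hypothetical:

    #!/bin/sh
    # mqfte_xfer.sh: hand a file to WMQ FTE and wait for the transfer
    # to complete (-w) before returning an exit status to the scheduler.
    SRC_FILE=$1
    DEST_DIR=$2
    fteCreateTransfer -sa ETL_AGENT -sm QM_ETL \
                      -da DWH_AGENT -dm QM_DWH \
                      -dd "${DEST_DIR}" -w "${SRC_FILE}"

    # JIL (fed to the jil utility) defining the Autosys command job that
    # runs the wrapper once the upstream extract job succeeds.
    jil <<'EOF'
    insert_job: DWH_FILE_XFER   job_type: c
    command: /apps/etl/bin/mqfte_xfer.sh /staging/extract.dat /landing
    machine: etlhost01
    condition: s(DWH_EXTRACT_JOB)
    EOF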
