ETL Resume
NY
Professional Summary:
- Over 9 years of experience in system analysis, design, development and implementation of relational database and data warehousing systems using IBM DataStage 8.1/7.x/6.x/5.x (Information Server, WebSphere, Ascential DataStage).
- Experienced in Designing, Developing, Documenting, Testing of ETL jobs and mappings in Server and Parallel jobs using DataStage to populate tables in Data Warehouse and Data marts.
- Proficient in developing strategies for Extraction, Transformation and Loading (ETL) mechanism.
- Expert in designing Parallel jobs using various stages like Join, Merge, Lookup, Remove Duplicates, Filter, Change Capture, Complex Flat File, Modify, Aggregator and XML.
- Expert in designing and developing backend PL/SQL packages in database layer, functions and triggers.
- Experienced in integration of various data sources (DB2-UDB, SQL Server, Sybase, Oracle, Teradata, XML and MS-Access) into data staging area.
- Expert in working with DataStage Manager, Designer, Administrator, and Director.
- Expert in using Information Analyzer for Column Analysis, Primary Key Analysis and Foreign Key Analysis.
- Expert in working with QualityStage for data profiling, standardization, matching and survivorship.
- Experienced with Microsoft DTS packages and TOAD.
- Skilled in studying data dependencies using metadata stored in the repository and preparing batches for existing sessions to facilitate scheduling of multiple sessions.
- Proven track record in troubleshooting DataStage jobs and addressing production issues such as performance tuning and enhancement.
- Expert in working on various operating systems like UNIX AIX 5.2/5.1, Sun Solaris V8.0 and Windows.
- Proficient in writing, implementation and testing of triggers, procedures and functions in PL/SQL.
- Experienced in database programming for data warehouses (schemas); proficient in dimensional modeling (Star Schema and Snowflake modeling).
- Experienced in Data Modeling as well as reverse engineering using tools ERwin, Oracle Designer and MS Visio.
- Expert in unit testing, system integration testing, implementation and maintenance of database jobs.
- Effective in cross-functional and global environments, managing multiple tasks and assignments concurrently with strong communication skills.
Education Qualifications:
MS in Computer Engineering
Skillset:
IBM Information Server 8.1 (DataStage, Information Analyzer, QualityStage, MetaStage), Oracle Designer, ERwin, ER/Studio, TOAD 8.6, Oracle 10g/9i/8i/7.x, SQL Server 7.x/2000/2005/2008, SQL*Loader, Crystal Reports, Business Objects, MS Access, IBM AIX 5.2/5.1, Sun Solaris V8.0, HP-UX V11.0, Windows XP/NT/2000/98, C, C++, Autosys, SQL, PL/SQL, Teradata, DB2, Sybase, Mercury TestDirector, Core Java Development.
Projects Summary:
Confidential, NY 08/2009-Present
Sr. ETL Developer/Lead
Confidential is one of the largest insurance companies in the tri-state area. GHI (Group Health Incorporated) and HIP (Health Plan of New York) merged to form EmblemHealth, which serves 3.4 million members and provides affordable, quality health care solutions. The purpose of the project was to integrate GHI (Amysis) data into Qcare (HIP) to provide customer care assistants with a single CRM solution, and to build a centralized data warehouse by integrating the GHI and HIP databases to support decision support systems.
Hardware/Software:
IBM InfoSphere Information Server DataStage 8.1, Oracle 10g, DB2 v 8.1.7.445, ERwin 7.0, TOAD 9.1, Windows XP, Autosys, XML files, COBOL files, UNIX Shell Scripting, SQL*Loader, Sun Solaris 8, IBM InfoSphere Information Analyzer, Mercury TestDirector.
Amysis to Qcare Integration:
ETL Lead
The purpose of the project was to integrate the Amysis (GHI) system with the Qcare (HIP) system and to provide data to supporting web-based feeds.
Confidential
Sr. ETL Developer
The purpose of the project was to integrate the GHI and HIP systems to provide data to the new EmblemHealth support system.
Responsibilities:
- Involved in status meetings and interacted with the business user to get the business rules.
- Involved in developing business required documents along with business analyst.
- Involved in defining the ETL best practices document and development standards document for DataStage ETL jobs.
- Extracted data from various transactional data sources residing on DB2, Oracle, Complex Flat Files and loaded into ODS.
- Developed incremental extract jobs for daily updates.
- Created ETL jobs for processing Complex Flat Files from mainframe COBOL files.
- Used Information Analyzer for Column Analysis, Primary Key Analysis and Foreign Key Analysis.
- Designed Parallel jobs using various stages like Join, Merge, Lookup, Change Data Capture, Remove Duplicates, Filter, Modify, Aggregator and Funnel.
- Designed and developed backend PL/SQL packages in database layer, stored procedures, functions and triggers.
- Wrote triggers and procedures for database security and backup requirements.
- Developed numerous programs using PL/SQL and many packages to do the validations as per requirements.
- Wrote complex SQL queries using joins, sub queries and correlated sub queries.
- Involved in data loading and data migration. Used SQL*Loader to load data from Excel file into temporary table and developed PL/SQL program to load data from temporary table into base Tables.
- Imported and exported Repositories across DataStage projects.
- Involved in the migration of DataStage jobs from development to production environment.
- Extensively used Autosys to automate scheduling of UNIX shell script jobs on a daily, weekly, and monthly basis with proper dependencies.
- Performed Unit testing and System Integration testing.
- Involved in creating technical documentation for source to target mapping procedures to facilitate better understanding of the process and incorporate changes as and when necessary.
- Worked on troubleshooting, performance tuning and performance monitoring for enhancement of DataStage jobs.
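The SQL*Loader-to-PL/SQL load described above (raw data into a temporary staging table, then a validated merge into base tables) follows a common staging pattern. A minimal sketch in Python with SQLite; table and column names here are illustrative assumptions, not the project's actual schema:

```python
import sqlite3

def load_via_staging(conn, rows):
    """Load raw rows into a staging table, then merge validated rows
    into the base table (staging -> validate -> merge pattern)."""
    cur = conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS stg_member (id INTEGER, name TEXT)")
    cur.execute("CREATE TABLE IF NOT EXISTS member (id INTEGER PRIMARY KEY, name TEXT)")
    cur.execute("DELETE FROM stg_member")  # truncate staging before each load
    cur.executemany("INSERT INTO stg_member VALUES (?, ?)", rows)
    # Merge step: only rows with a non-null key and name reach the base table.
    cur.execute("""
        INSERT OR REPLACE INTO member (id, name)
        SELECT id, name FROM stg_member
        WHERE id IS NOT NULL AND name IS NOT NULL
    """)
    conn.commit()
    return cur.execute("SELECT COUNT(*) FROM member").fetchone()[0]
```

Keeping the raw load separate from the validated merge means a bad input file never touches the base tables directly and can be re-run after correction.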
Confidential, Indianapolis, IN 2/2008 -07/2009
Sr. DataStage Developer
Confidential is one of the largest insurance companies in the United States, providing medical insurance plans to individuals, families and companies. The purpose of the project was to create a centralized data warehouse by integrating its policy and claims databases to better support the organization's decision support systems.
Hardware/Software:
IBM Information Server 8.0.1 (DataStage, QualityStage, Information Analyzer), Oracle 10g, MS SQL Server 2005, MS Access, Flat files, ERwin 7.1, SQL, PL/SQL, Toad, Tidal scheduler.
Responsibilities:
- Extensively analyzed and designed ETL processes.
- Identified and documented data sources and transformation rules required to populate and maintain data warehouse. Created DataStage parallel jobs to load data from sequential files, flat files and MS SQL Server.
- Extensively used DataStage Designer for creating new job categories, metadata definitions and data elements, import/export of projects, jobs and DataStage components, viewing and editing the contents of the repository as well as writing routines and transforms.
- Fine-tuned application logic for better performance. Developed complex queries, functions, stored procedures and triggers using PL/SQL.
- Created stored sub programs with dynamic SQL at both client and server side.
- Wrote complex SQL queries using joins, sub queries and correlated sub queries.
- Created parameter sets to group DataStage and QualityStage job parameters and store default values in files to make sequence jobs and shared containers faster and easier to build.
- Defined the data definitions and created the target tables in the database.
- Wrote transformation routines to transform the data according to business requirements.
- Mapped the source and target databases by studying the specifications and analyzing the required transforms.
- Tuned stored procedures and developed business logic in database layer using PL/SQL.
- Worked on PL/SQL packages, procedures, objects, varrays, functions, and ref cursors.
- Used the DataStage Director and its run-time engine to schedule and run the parallel jobs, testing and debugging its components and monitoring the resulting executable versions on an ad hoc or scheduled basis.
- Used SQL*Loader and dynamic SQL to extract, manipulate, and transform data and load it into the warehouse.
- Unit tested the mappings to check for expected results.
- Documented the purpose of mapping to facilitate better understanding of the process and to incorporate the changes as and when necessary.
- Extensively used Toad for analyzing data, writing SQL and PL/SQL scripts, and performing DDL operations.
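The correlated sub queries mentioned above typically answer "latest or extreme row per group" questions. A small, runnable illustration against SQLite; the claim schema is hypothetical, chosen only to show the shape of the query:

```python
import sqlite3

# Correlated subquery: for each policy, keep only the most recent claim.
# The inner query re-evaluates per outer row via the c2.policy_id = c.policy_id link.
LATEST_CLAIM_SQL = """
    SELECT c.policy_id, c.amount
    FROM claim c
    WHERE c.claim_date = (SELECT MAX(c2.claim_date)
                          FROM claim c2
                          WHERE c2.policy_id = c.policy_id)
"""

def latest_claims(rows):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE claim (policy_id INTEGER, claim_date TEXT, amount REAL)")
    conn.executemany("INSERT INTO claim VALUES (?, ?, ?)", rows)
    return sorted(conn.execute(LATEST_CLAIM_SQL).fetchall())
```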
Confidential, Madison, NJ 1/2007 - 2/2008
ETL Lead
Confidential is a pharmaceutical company involved in the research, development, manufacturing and marketing of pharmaceutical products. The aim of the project was to create a new Oracle data warehouse by integrating data scattered across many systems, providing the marketing department with data to analyze sales and profits of different pharmaceutical products by category.
Hardware/Software:
DataStage 7.5.2 (Designer, Director, Manager, Parallel Extender), Oracle 10g, DataStage BASIC, HP-UX, MS Access, XML files, ERwin 7.0, Windows XP, Crontab.
Responsibilities:
- Interacted with source system users and business users to get business requirements.
- Modeled the Star Schema Data Marts by identifying the Fact and Dimension tables using ERwin Data modeling tool.
- Created DataStage jobs to extract, transform and load data into data marts according to the required provision from various sources like legacy systems, SQL Server 2000, flat files.
- Worked with IBM support to solve critical DataStage issues.
- Used version control tools like Version Management (Clear Case and Quest) and DataStage Version Control.
- Designed and developed backend PL/SQL packages in database layer, stored procedures, functions and triggers.
- Wrote ad-hoc SQL for data analysis.
- Tuned stored procedures and developed business logic in database layer using PL/SQL.
- Created Error Files and Log Tables containing data with discrepancies to analyze and re-process the data.
- Used FTP stage to fetch the data and to deliver the data to target Database.
- Imported and exported repositories across projects using DataStage Manager.
- Tested and modified jobs running in the production environment with minimal downtime.
- Developed PL/SQL for higher performing activities at database level.
- Interacted with the team lead to standardize the jobs according to business requirements.
- Created shell scripts to invoke DataStage jobs and pre/post processing files and schedule under Crontab.
- Extensively worked on error handling, cleansing of data, creating Hash files and performing lookups for faster access of data.
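The error files and log tables above follow a reject-link pattern: rows failing validation are diverted with a discrepancy reason so they can be analyzed and re-processed. A hedged sketch with hypothetical field names and rules:

```python
# Split incoming records into a clean stream and an error stream.
# The validation rules and field names are illustrative assumptions.
def split_rejects(records):
    clean, errors = [], []
    for rec in records:
        if rec.get("product_id") is None:
            errors.append({**rec, "reason": "missing product_id"})
        elif rec.get("qty", 0) <= 0:
            errors.append({**rec, "reason": "non-positive qty"})
        else:
            clean.append(rec)
    return clean, errors
```

Capturing the failure reason alongside the rejected row is what makes later re-processing practical: the error file explains itself without re-running the job.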
Confidential, Minneapolis, MN 6/2005 - 12/2006
ETL Consultant
Confidential is a Fortune 500 company headquartered in Minneapolis, MN. It is the fifth largest retailer in the United States, offering its services through a large chain of retail stores and online services. The aim of the project was to migrate its Enterprise Data Warehouse (EDW) from Oracle 8i to Oracle 10g for better performance and maintenance, and to provide more reliable data to decision makers.
Hardware/Software:
Ascential DataStage 7.5 (Designer, Manager, Director), Oracle 10g/8i, MS SQL Server 2005, ERwin 4.3, Sun Solaris UNIX, Windows NT, Teradata, QualityStage, MetaStage, Business Objects.
Responsibilities:
- Involved in design phase meetings for Business Analysis and Requirements gathering.
- Worked with business functional lead to review and finalize requirements and data profiling analysis.
- Developed conversion and migration strategy in collaboration with functional business lead.
- Designed various Mappings (Source-to-Target) using DataStage to link between different source systems and Data warehouse for loading data into Warehouse.
- Created Hash tables referencing the Surrogate Key tables for quicker and more efficient lookup to Dimensions.
- Extracted data from source systems SQL Server and Sequential files, Flat files.
- Created DataStage jobs, batches and job sequences and tuned them for better performance.
- Extensively used Built-in (Sort, Merge, Oracle, Aggregator, DB2 Stages), Plug-in Stages for extraction, transformation and loading of the data.
- Extensively used parameters in job properties for entering default values while loading.
- Used MetaStage for synchronization and integration of metadata from various data warehouse related tools, and for automatically gathering process data from operational systems.
- Used QualityStage stages such as Investigate, Standardize, Match and Survive to address data quality and data profiling issues during design.
- Used Shared Containers for code reuse and implementing complex business logic.
- Developed test data and conducted performance testing on the developed modules.
- Took regular backups of the developed jobs using the DataStage Manager export/import utility.
- Worked on performance tuning and enhancement of DataStage job transformations.
- Participated in weekly status meetings, conducted internal and external reviews and formal walkthroughs among various teams, and documented the proceedings.
- Involved in creation and maintenance of several custom reports for various clients using Business Objects.
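The hash tables referencing surrogate key tables above act as in-memory natural-key-to-surrogate-key lookups, so dimension keys resolve in constant time during the load. A minimal sketch; the class and key names are illustrative:

```python
# In-memory surrogate-key lookup: natural keys map to warehouse
# surrogate keys, with a new key assigned the first time a natural
# key is seen. This mirrors the hash-table-over-key-table idea.
class SurrogateKeyLookup:
    def __init__(self):
        self._keys = {}
        self._next = 1

    def key_for(self, natural_key):
        if natural_key not in self._keys:
            self._keys[natural_key] = self._next
            self._next += 1
        return self._keys[natural_key]
```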
Confidential, Hartford, CT 5/2004 - 6/2005
DataStage Consultant
Confidential is one of the largest investment and insurance companies in the United States, and a leading provider of investment products, life insurance, group and employee benefits, automobile and homeowners products, and business insurance. The purpose of the project was to design and implement a new data mart for investment services. Data is extracted from various operational systems to provide a single source of integrated and historical data for end-user reporting and analysis, and to improve client service.
Hardware/Software:
DataStage 7.0/6.1 Parallel Extender (Designer, Director, Manager), DataStage BASIC, ERwin 4.0, UNIX, Windows NT 4.0, Sun Solaris 2.6/2.7, IBM AIX 4.2/4.1, Oracle 8i, SQL Server 2000, IBM DB2, Sybase, PL/SQL, dBASE III files, UNIX sequential files, MS Access.
Responsibilities:
- Prepared the Dimension models and prototyped the queries.
- Analyzed and designed ETL processes.
- Used DTS to import, export and transform heterogeneous data between data sources.
- Identified/documented data sources and transformation rules required to populate and maintain data warehouse content.
- Involved in developing Data Marts for the user reporting requirements.
- Created DataStage Server jobs to extract data from sequential files, flat files, MS Access and SQL Server.
- Used DataStage Manager for importing metadata from repository, new job categories and creating new data elements.
- Used the DataStage Designer to design and develop jobs for extracting, cleansing, transforming, integrating, and loading data into different Data Marts.
- Defined data definitions, and created the target tables in the database.
- Wrote routines to schedule batch jobs to obtain data overnight from various locations.
- Mapped the source and target databases by studying the specifications and analyzing the required transforms.
- Troubleshot jobs using the debugging tool.
- Analyzed and enhanced the performance of the jobs.
- Standardized the nomenclature used to define the same data by users from different business units.
- Created job sequences and schedules for automation.
- Used the DataStage Director and its run-time engine to schedule running the solution, testing and debugging its components, and monitoring the resulting executable versions (on an ad hoc or scheduled basis).
- Used DataStage to transform the data to multiple stages, and prepared documentation.
- Created hash tables for referential integrity checks while transforming the data, so that only valid information was loaded.
- Created ETL execution scripts for automating jobs.
- Wrote routines to read parameters from a hash file at runtime.
- Unit tested the mappings to check for the expected results.
- Documented the source to target mappings to facilitate changes when necessary.
- Used various stages in Parallel Extender like Processing Stages, File Stages, Development Debugger and Database Stages.
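Reading job parameters from a hash file at runtime, as in the routines above, amounts to parsing a small key/value store before the job starts. A sketch assuming a simple key=value text format (an assumption for illustration, not the actual hash-file layout):

```python
# Parse runtime job parameters from key=value lines, skipping blanks
# and comments. Returns a plain dict of string parameters.
def read_params(lines):
    params = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # ignore blank lines and comments
        key, _, value = line.partition("=")
        params[key.strip()] = value.strip()
    return params
```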
Confidential, Warren, NJ 2/2002 - 4/2004
DataStage Developer
Confidential offers a wide range of checking, savings, CD and retirement products, all with many free services including online banking, convenient account access, and 24/7 support. The aim of the project was to build an enterprise reporting system to support the portfolio management and performance analysis of the credit card business with various reward offerings. The application comprised an ETL process for transactional data, a filter process to identify valid transactions, a calculation process for transaction and reward fee calculation, and data integration into the data warehousing system.
Hardware/Software:
DataStage 5 (Designer, Manager, Director), Windows XP, Oracle 9i/8i, SQL, PL/SQL, ERwin 4.0, TOAD, UNIX, MS Access, Sybase, DB2.
Responsibilities:
- Responsible for understanding the requirements, analysis, design, development and implementation into the system.
- Involved in designing of data models using ERwin.
- Developed SQL Scripts to modify the existing database objects and packages.
- Created PL/SQL procedures, triggers and functions for processing portfolio management information.
- Executed complex SQL statements using joins, date functions, inline functions, and sub-queries to generate reports.
- Identified source systems, connectivity, related tables and fields, and ensured that data was suitable for mapping.
- Involved in testing the forms against a checklist, preparing problem reports, and fixing problems to ensure correct functionality of all forms.
- Created jobs in DataStage to extract from heterogeneous data sources like DB2, Sybase, Access, Flat files and Text files to Oracle.
- Scheduled and monitored the jobs in DataStage Director to load data into target database.
- Used server components (DataStage Repository, DataStage Server, and DataStage Package Installer).
- Evaluated the consistency and integrity of the model and repository. Created DataStage scripts for data transformation, conditioning, validation and loading.
- Developed, automated and scheduled load processes using UNIX shell scripting.