ETL Developer Resume
Greenville, SC
PROFESSIONAL SUMMARY:
- 7+ years of IT experience in the development, implementation, and testing of database/data warehousing applications for the financial industry, using data extraction, data transformation, data loading, and data analysis.
- 6+ years of Data Warehousing experience with Ascential DataStage, Quality Stage, Profile Stage and Audit Stage.
- One year of experience in DataStage Administration.
- Experience in integrating various data sources such as XML, mainframe COBOL files, flat files, Oracle, SQL Server, Teradata, and DB2 UDB EEE into the data warehouse.
- Extensive experience in loading high-volume data and in performance tuning.
- Experience with UNIX shell scripting for file validation.
- Experienced with Teradata utilities such as FastLoad, TPT, MultiLoad, and TPump.
- Experience in 24/7 production support for various projects.
- Experienced with all phases of the software development life cycle; involved in business analysis, design, development, implementation, and support of software applications.
- Involved in the design and implementation of data warehouse fact and dimension tables (star schema), including identification of measures and hierarchies.
- Highly adaptable team player with a proven ability to work in fast-paced environments and excellent communication skills.
TECHNICAL SKILLS:
ETL Tools:
Information Server 8.5/8.0.1/8.1, Ascential DataStage 7.5/7.1/6.0 XE/5.1, Parallel Extender (Orchestrate), Quality Stage, Information Analyzer (Profile Stage), FastTrack, Audit Stage, Business Glossary and Metadata Workbench.
Databases:
Oracle 10g/9i/8i/8.0/7.0, SQL Server 2000/2005/2008, Teradata V2R5.0, DB2 UDB 9.5/9.1/8.2/8.1 EEE, MS Access 97/2000
Languages:
SQL, Transact-SQL (T-SQL), SQL*Plus, PL/SQL, UNIX Shell Scripts, C, C++, HTML, XML, VB 5.0/6.0, .NET
Data Modeling:
Erwin 3.5.1/4.2, Power Designer 6.0/9.5, MS Visio
Other Software:
TOAD 7.3, MS Office, PRO*C, BTEQ, Teradata SQL Assistant 6.1.0
Scheduling Tools:
Autosys
Operating Systems:
IBM AIX 5.2, HP-UX 10.2, Windows 9x/NT/2000/XP, Windows Server 2003/2008, Solaris 2.8/SunOS 5.8, Red Hat Linux AS
EDUCATION:
- Master of Science in Electrical Engineering.
EMPLOYMENT HISTORY:
- Confidential, Greenville, SC 02/2010 – Present
Title: ETL Developer
US Food Service is a food distribution company with 250,000 customers and 70 distribution centers across the US. This project provided a complete data integration solution for US Food Service by building an Enterprise Data Warehouse (EDW) and reporting solution. The EDW initiative helped US Food Service centralize its information resources and gain quick access to key information for decision making.
Responsibilities:
- Designed, developed, and implemented ETL jobs to load internal and external data into the data mart.
- Worked closely with the business team to gather requirements.
- Developed the source-to-target mapping for each dimension and fact table.
- Developed and modified ETL jobs to meet monthly and quarterly reporting needs.
- Developed ETL processes to load data into fact tables from multiple sources like Files, Oracle, Teradata and SQL Server databases.
- Wrote UNIX scripts for preliminary file checks and for extracting data from vendors (a brief sketch follows this job's Environment line).
- Developed processes to schedule and control ETL production runs using Control-M.
- Hands-on experience with the HP OpenView tool for migration requests from Dev to Test and Production.
- Developed Visio process flow diagrams to improve understanding of the process.
- Performed ETL tuning to improve performance
- Wrote archival scripts for extracted source data
- Used a CVS repository to track ETL changes.
- Created test cases and performed unit and system testing for ETL jobs.
- Worked closely with the testing team to rectify defects and documented them in the ClearQuest defect tracker.
Environment: IBM InfoSphere Information Server 8.5, HP-UX, Oracle 9i/10g, C/C++, Erwin 4.0, Cognos, Control-M.
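The preliminary file checks and archival steps referenced above followed a common pattern; a minimal sketch is below, assuming pipe-delimited vendor files with one header row and a TRL|<count> trailer record. Directory names and the trailer layout are illustrative, not taken from the project.

#!/bin/ksh
# Sketch of a preliminary file check: confirm the vendor file exists and is non-empty,
# compare the detail row count against the trailer record, then archive the extract.
SRC_DIR=/data/inbound            # hypothetical landing directory
ARC_DIR=/data/archive            # hypothetical archive directory
FILE_NAME=$1

[ -s "$SRC_DIR/$FILE_NAME" ] || { echo "ERROR: $FILE_NAME missing or empty"; exit 1; }

# Trailer is assumed to be the last record, formatted as TRL|<detail row count>
expected=$(tail -1 "$SRC_DIR/$FILE_NAME" | cut -d'|' -f2)
actual=$(( $(wc -l < "$SRC_DIR/$FILE_NAME") - 2 ))   # exclude the header and trailer rows

if [ "$expected" -ne "$actual" ]; then
    echo "ERROR: row count mismatch (trailer=$expected, file=$actual)"
    exit 2
fi

# Archive the validated extract with a date stamp so the load can be re-run if needed
gzip -c "$SRC_DIR/$FILE_NAME" > "$ARC_DIR/${FILE_NAME}.$(date +%Y%m%d).gz"
echo "OK: $FILE_NAME validated and archived"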
- Confidential, Minneapolis, MN 07/2008 – 01/2010
Project: Timeline Reduction Bank Acquisition
Title: ETL Specialist
The current process for analyzing, mapping, and transforming acquired bank data is very time-consuming. Several factors contribute to the lengthy process. One key factor is the quantity of manual analysis used to understand the data. This analysis includes multiple communication sessions for discovery and mapping, scheduled over several weeks and involving the business lines, acquired bank personnel, and technology representatives. Other factors are the manual coding required for the analysis of the data and, finally, the coding for the transformation of the data.
This initiative (Conversion Initiative # 2), Data Analysis, Quality and ETL Tools, will look at improving the process of analyzing, mapping, and transforming data. It will use the ETL tools to understand the acquired bank's data more quickly and at a deeper level. Processes will be streamlined, and redundant analysis coding, mapping documentation, and transformation coding will be reduced or eliminated.
Roles Performed:
- Worked with business users and ETL leads from different teams to implement an ETL framework using a combination of DataStage Server and PX jobs.
- Sourced data from DB2 UDB, flat files, and CSV files.
- Designed jobs using different parallel job stages such as Join, Merge, Lookup, Remove Duplicates, Filter, Dataset, Lookup File Set, Change Data Capture, Switch, Modify, Aggregator, DB2 Enterprise, and DB2 API.
- Developed DataStage Designer Server and PX jobs for extracting, cleansing, transforming, integrating, and loading data into the data warehouse.
- Developed user-defined subroutines using UniVerse BASIC to implement some of the complex transformations, date conversions, code validations, and calculations, using various DataStage-supplied functions and routines.
- Developed job sequencers with restart capability for the designed jobs using Job Activity, Execute Command, and E-Mail Notification activities and triggers (a command-line run sketch follows this job's Environment line).
- Extensively designed, developed, and implemented Parallel Extender jobs using parallel processing (pipeline and partition parallelism) and restartability techniques to improve job performance when working with bulk data sources.
- Created projects using DataStage Administrator.
- Changed user group assignments.
- Unlocked jobs using DataStage Administrator and Director.
- Extensively used DataStage Director to monitor jobs and check their run statistics.
- Extensively used DataStage Manager to Export/import DataStage components.
- Extensively used SQL tuning techniques to improve database read performance through DataStage jobs, and used the framework approach to improve transformation and loading steps.
- Involved in Unit Testing, System Testing, Integration and Performance Testing of the jobs.
- Involved in the creation and execution of test plans, test scripts, and job flow diagrams.
- Worked closely with data quality analysts and business users to verify data accuracy and consistency after table loads.
Environment: IBM InfoSphere Information Server 8.1, DB2 UDB 9.1 Enterprise Edition, Red Hat Linux, Autosys 4.5, Connect:Direct, PuTTY, Microsoft Visio, Microsoft Project Server, Microsoft Portal, ClearQuest, Microsoft Office (Excel, Word and PowerPoint), Acrobat Distiller, ClearCase.
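As an illustration of how a restartable sequence run like the one referenced above could be driven from the command line (for example by an Autosys wrapper), a minimal sketch using the standard DataStage dsjob client is below; the install path, project, sequence, and parameter names are assumptions, not actual project objects.

#!/bin/ksh
# Sketch of a command-line wrapper (e.g., called from Autosys) that runs a DataStage
# sequence and translates its finishing status; project and parameter names are made up.
DSHOME=${DSHOME:-/opt/IBM/InformationServer/Server/DSEngine}
. $DSHOME/dsenv                      # source the DataStage engine environment

PROJECT=DW_PROJECT
SEQUENCE=Seq_Load_Acquired_Bank
RUN_DATE=$1

# -jobstatus waits for completion and makes dsjob exit with the job's finishing status
$DSHOME/bin/dsjob -run -mode NORMAL \
    -param pRunDate="$RUN_DATE" \
    -jobstatus "$PROJECT" "$SEQUENCE"
rc=$?

# 1 = finished OK, 2 = finished with warnings; anything else is treated as a failure
if [ $rc -eq 1 ] || [ $rc -eq 2 ]; then
    echo "Sequence $SEQUENCE completed (status $rc)"
    exit 0
else
    echo "Sequence $SEQUENCE failed (status $rc)"
    $DSHOME/bin/dsjob -logsum "$PROJECT" "$SEQUENCE"   # dump the log summary for triage
    exit 1
fi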
- Confidential, Pleasanton, CA 09/2006 – 07/2008
Title: Sr. ETL Designer/DataStage Developer
Kaiser Permanente, an integrated managed care organization, is the largest health care organization in the United States. The Health Plan and Hospitals operate under state and federal non-profit tax status, while the Medical Groups operate as for-profit partnerships or professional corporations in their respective regions.
The project was to design, develop, and maintain a data warehouse for their vendors' data and internal reference data, and to work with their DBA to ensure that the physical build adhered to the model blueprint.
Responsibilities:
- Involved in understanding business processes and coordinated with business analysts to gather specific user requirements.
- Studied the existing data sources to determine whether they supported the required reporting, and generated change data capture requests.
- Used Quality Stage to check the data quality of the source system prior to ETL process.
- Worked closely with DBAs to develop the dimensional model using Erwin and created the physical model using forward engineering.
- Worked with DataStage Administrator for creating projects, defining the hierarchy of users and their access.
- Defined granularity, aggregation and partition required at target database.
- Involved in creating specifications for ETL processes, finalized requirements, and prepared the specification document.
- Used DataStage as an ETL tool to extract data from source systems and load it into the SQL Server database.
- Imported table/file definitions into the DataStage repository.
- Performed ETL coding using Hashed File, Sequential File, Transformer, Sort, Merge, and Aggregator stages; compiled, debugged, and tested the jobs. Extensively used the available stages to redesign DataStage jobs to perform the required integration.
- Extensively used DataStage Tools like InfoSphere DataStage Designer, InfoSphere DataStage Director for developing jobs and to view log files for execution errors.
- Controlled jobs execution using sequencer, used notification activity to send email alerts.
- Ensured that the data integration design aligns with the established information standards.
- Used Aggregator stages to sum the key performance indicators used in decision support systems (an equivalent SQL aggregation is sketched after this job's Environment line).
- Scheduled job runs using DataStage Director, and used DataStage Director for debugging and testing.
- Created shared containers to simplify job design.
- Performed performance tuning of the jobs by interpreting performance statistics of the jobs developed.
- Documented ETL test plans, test cases, test scripts, and validations based on design specifications for unit, system, functional, and regression testing; prepared test data and performed error handling and analysis.
Environment: DataStage 7.5.1 Enterprise Edition, Quality Stage, flat files, Teradata 6.0, DB2 8.1, SQL Server 2005/2008, Erwin 4.2, PL/SQL, UNIX, Windows NT/XP.
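A sketch of the kind of check used to confirm the Aggregator-stage KPI sums referenced above is below; it recomputes the totals in SQL through the DB2 command line and compares them with the loaded summary table. The database, schema, table, and column names are invented for illustration only.

#!/bin/ksh
# Sketch of a post-load check: recompute the KPI totals in SQL and compare them with
# the summary fact table loaded by the Aggregator job. All object names are invented.
db2 connect to EDWDB > /dev/null || exit 1

db2 -x "SELECT region_key, SUM(claim_amount) AS total_claims
        FROM stage.claims_detail
        GROUP BY region_key
        ORDER BY region_key" > /tmp/agg_from_source.txt

db2 -x "SELECT region_key, total_claims
        FROM mart.fact_claims_summary
        ORDER BY region_key" > /tmp/agg_from_mart.txt

if diff /tmp/agg_from_source.txt /tmp/agg_from_mart.txt > /dev/null; then
    echo "KPI totals match"
else
    echo "KPI totals differ - review the Aggregator job"
fi

db2 terminate > /dev/null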
- Confidential, Costa Mesa, CA 07/2005 – 08/2006
Project: Bookspan Marketing Database
Title: Senior DataStage Developer
The project was to develop the Bookspan Marketing Database for the client Bookspan, NY. The Bookspan Marketing Database allows Bookspan to have a single consolidated view of all customers in Bookspan East and West. Implementation of the project involved two stages:
Stage I: Consolidation of account data and promotion data from multiple sources, tying multiple accounts to a single individual using TruVue (Experian’s solution), and providing access to a campaign management tool.
Stage II: Incorporating Insource enhancements and TruVue change notifications, implementing ongoing feeds (Bookspan East and West delta files) and NCOA notifications from merge-purge, and providing reports and ad-hoc access via Business Objects.
Roles Performed:
- Extensively used DataStage Designer to develop various jobs to extract, cleanse, transform, integrate and load data into Bookspan Marketing Database (DB2 UDB RDBMS).
- Worked with DataStage Manager to import metadata from databases and to import/export jobs and routines between DataStage projects.
- Used DataStage Director to schedule and monitor jobs, clean up resources, and run jobs with several invocation IDs.
- Used DataStage Administrator to control purging of the repository and the DataStage client applications or jobs they run, clean up resources, execute TCL commands, and move, manage, or publish jobs from development to production status.
- Wrote batch jobs to automate the system flow using DS job controls with restartability of jobs in a batch.
- Developed user-defined routines and transforms using the DataStage BASIC language.
- Used TCL commands in DS jobs to automate key management of surrogate keys, and used DataStage Command Language 7.0.
- Developed DataStage server jobs using Sequential File, Hashed File, Sort, Aggregator, Transformer, ODBC, and Link Collector/Link Partitioner stages.
- Designed several DataStage parallel jobs using Sequential File, Lookup File Set, Join, Merge, Lookup, Change Apply, Change Capture, Remove Duplicates, Funnel, Filter, and Pivot stages.
- Extensively used Teradata load and unload utilities such as MultiLoad, FastExport, and bulk load stages in jobs for loading and extracting huge data volumes.
- Developed various SQL scripts using Teradata SQL Assistant; used some of them in DS jobs with the BTEQ utility and others in Teradata stages as SQL overrides (a brief BTEQ sketch follows this job's Environment line).
- Involved in Unit testing, System and Integration testing
- Wrote UNIX shell scripts for file validation and for scheduling DataStage jobs.
Environment: DataStage 7.1/7.5, Teradata V2R5.0, IBM AIX 4.1.8
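A minimal sketch of running a SQL script developed in Teradata SQL Assistant through BTEQ from a shell wrapper, as referenced above, is below; the logon file, database, and table names are placeholders rather than actual project objects.

#!/bin/ksh
# Sketch of a shell wrapper that runs a SQL script through BTEQ; the logon file,
# database, and table names are placeholders.
LOGON_FILE=$HOME/.tdlogon        # contains a single line: .LOGON tdprod/etl_user,password;

bteq <<EOF
.RUN FILE = $LOGON_FILE
.SET ERROROUT STDOUT

-- refresh the promotion summary used by the marketing database
DELETE FROM mktg.promo_summary;
INSERT INTO mktg.promo_summary (account_id, promo_count)
SELECT account_id, COUNT(*)
FROM mktg.promotion_detail
GROUP BY account_id;

.IF ERRORCODE <> 0 THEN .QUIT 8
.QUIT 0
EOF

rc=$?
if [ $rc -eq 0 ]; then
    echo "BTEQ load completed"
else
    echo "BTEQ load failed with return code $rc"
fi
exit $rc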