Datastage Developer Resume
AZ
SUMMARY
- Dynamic career reflecting pioneering experience and high performance in system analysis, design, development and implementation of relational database and data warehousing systems using IBM DataStage v11.7/11.5/8.5/8.1/7.x/6.x/5.x (InfoSphere Information Server, WebSphere, Ascential DataStage). 8+ years of experience in building data-intensive applications and tackling challenging architectural and scalability problems. Highly analytical and process-oriented Data Analyst turned Data Engineer with in-depth knowledge of database types, research methodologies, data manipulation and visualization, and Azure services. Extensive experience includes:
- Hands-on experience in integration and migration using IBM InfoSphere DataStage (11.7/11.5), QualityStage, SSIS, SAP BOXI, Oracle, SQL Server, Teradata, DB2, SQL and shell scripting.
- Experience in design and development of ETL Jobs using Informatica Power Center v 9.6/8.6/7.3.
- Excellent Experience in Designing, Developing, Documenting, Testing of ETL jobs and mappings in Server and Parallel jobs using Data Stage to populate tables in Data Warehouse and Data marts.
- Proficient in developing strategies for Extraction, Transformation and Loading (ETL) mechanism.
- Expert in designing Parallel jobs using various stages like Join, Merge, Lookup, remove duplicates, Filter, Dataset, Lookup file set, Complex flat file, Modify, Aggregator, XML.
- Expert in designing Server jobs using various types of stages like Sequential File, ODBC, Hashed File, Aggregator, Transformer, Sort, Link Partitioner and Link Collector.
- Experienced in integration of various data sources (DB2-UDB, SQL Server, PL/SQL, Oracle, Teradata, XML and MS-Access) into the data staging area.
- Experienced in designing and developing Oracle Application Express data entry forms, interactive reports and hypertext-linked charts, in close consultation with the business users.
- Good exposure to job scheduling and load balancing using SAS Grid and LSF.
- Expert in working with DataStage Manager, Designer, Administrator, and Director.
- Expert usage of Oracle utilities and applications (OUAF) like SQL*Loader, External Tables, export / import etc.
- Expertise in all the phases of System development life Cycle (SDLC) using different methodologies like Agile, Waterfall.
- Experience in analyzing the data generated by the business process, defining the granularity, source to target mapping of the data elements, creating Indexes and Aggregate tables for the data warehouse design and development.
- Extensively worked on BIPM- Master Data Management Framework from BIDS (BI Decision Support)
- Excellent knowledge of studying the data dependencies using metadata stored in the repository and prepared batches for the existing sessions to facilitate scheduling of multiple sessions.
- Proven track record in troubleshooting of Data Stage jobs and addressing production issues like performance tuning and enhancement.
- Experience in different Scheduling tools like AutoSys for automating and scheduling jobs run.
- Extensively used slowly changing dimension Type 2 approach to maintain history in database.
- Expert in working on various operating systems like UNIX AIX 5.2/5.1, Sun Solaris V8.0 and Windows 2000/NT.
- Proficient in writing, implementing and testing triggers, procedures and functions in PL/SQL and Oracle.
- Worked on data cleansing and standardization using the cleanse functions in IBM’s MDM.
- Good Knowledge on Account domain and Party domain in IBM MDM.
- Experienced in Database programming for Data Warehouses (Schemas), proficient in dimensional modeling (Star Schema modeling, and Snowflake modeling).
- Expertise in UNIX shell scripts using K-shell for the automation of processes and scheduling the Data Stage jobs using wrappers.
- Experience in using software configuration management tools like Rational Clear case/Clear quest for version control.
- Experienced in data modeling as well as reverse engineering using tools such as Erwin, Oracle Designer and MS Visio, and in SQL Server Management Studio, SSIS, SSRS and stored procedures.
- Expert in unit testing, system integration testing, implementation and maintenance of databases jobs.
- Effective in cross-functional and global environments, managing multiple tasks and assignments concurrently with effective communication skills.
- Involved in SAS Scripting for generating TDQ Reports.
- Experience in different Hadoop distributions like Cloudera 5.3 (CDH 4, CDH 5) and Hortonworks distributions (HDP).
- Proficient knowledge of Apache Spark and Scala programming to analyze large datasets using Spark, and of Storm and Kafka to process real-time data.
- Involved in building and evolving a reporting framework on top of the Hadoop cluster to facilitate data mining, analytics and dashboarding.
- Support a wide variety of ad hoc data needs.
- Experience with various scripting languages like Linux/Unix shell scripts and Python.
- Worked extensively in Data collection, Data Extraction, Data Cleaning, Data Aggregation, Data Mining, Data verification, Data analysis, Reporting, and data warehousing environments.
- Experience in Data Analysis, Data Validation, Data Cleansing, Data Verification and identifying data mismatch.
- Experience working on Azure Services like Data Lake, Data Lake Analytics, SQL Database, Synapse, Data Bricks, Data factory, Logic Apps and SQL Data warehouse.
- Experience in developing Spark applications using PySpark, Scala and Spark SQL for data extraction, transformation, and aggregation from multiple file formats, analyzing and transforming the data to uncover insights into customer usage patterns.
- Experienced in building modern data warehouses in Azure Cloud, including building reporting in PowerBI or Data Studio.
- Extensively used the PySpark DataFrame API, Spark SQL, Python and pandas UDFs for transformations to reproduce RDBMS functionality and rebuild ETLs with maximum efficiency with respect to cost and time.
- Researched, analyzed and developed business reports that helped in business forecasting, trend analysis, statistical modeling, and strategy development.
- Strong skills in visualization tools Power BI, Microsoft Excel - formulas, Pivot Tables, Charts and DAX Commands.
- Efficient multi-tasker: worked in real-time environments and dealt with huge databases, organizing, documenting, querying, analyzing, and visualizing data.
TECHNICAL SKILLS
ETL Tools: IBM WebSphere DataStage and QualityStage v11.7/11.5/8.5/8.1/8.0, IBM InfoSphere DataStage 8.5/11.5, IBM InfoSphere DataStage 8.1 (Parallel & Server), Ascential DataStage 7.5.2/6.0/5.1, ProfileStage 7.0, SSIS (SQL Server 2012/2008/2005), Data Integrator, Informatica v9.6/8.6/7.3
Database: Oracle 11g/10g/9i/8i/8.0/7.x, MS SQL Server (2012, 2008 R2, 2008, 2005), MS Access, MySQL, Teradata V2R6/13.0, IBM DB2, Netezza, Sybase.
Development Tools & Languages: SQL, C, C++, Unix Shell Scripting, PL/SQL, Oracle.
Data Modeling Tools: Erwin 4.0, Sybase Power Developer, SSIS, SSRS, Dimensional Data Modeling, Data Modeling, Star Schema, Snow-Flake Schema.
Operating Systems: HP-UX, IBM-AIX 5.3, Windows 95/98/2000/ NT, Sun Solaris, Red-Hat Linux, MS SQL SERVER 2000/2005/2008 & MS Access.
BI Tools: Business Objects, Brio, SSRS (SQL Server 2012/2008, 2005), SAS 9, SAS SPDS, SAS DI-STUDIO, LSF Scheduler, SAS Management Console, SAS Information MAP Studio, IBM Cognos 8 BI, Oracle Application, OBIEE, CDC, CDD.
PROFESSIONAL EXPERIENCE
Confidential, AZ
Datastage Developer
Responsibilities:
- Involved in gathering business requirements and designing technical specifications and mapping documents with transformation rules.
- Developing jobs using real time stages to fetch data from API and IIB queues.
- Developing Staging, Dimension, Fact and Extract jobs across different modules for Management Information Systems reports.
- Developed Unix Scripts to optimize Execution of DataStage Jobs.
- Experience in job scheduling using Control-M and Zeke.
- Expertise in generating test data and testing performance of complex DataStage jobs.
- Involved in Developing and Debugging Parallel Jobs and Job Sequences.
- Involved in processing stages to implement the ETL logic to achieve optimal solutions.
- Developed Structured Query Language (SQL) to optimize the execution of DataStage Jobs.
- Worked on IBM Infosphere DataStage Upgrade from v11.5 to 11.7.
- Involved in ETL DataStage Development, migration, testing, deployment and production support.
- Designed Documents for performing unit testing of developed code. Tested API data using SOAP UI and Postman.
Environment: IBM InfoSphere DataStage and QualityStage 11.7 (Administrator, Designer, Director), SAS Management Console, SAS Information MAP Studio, MDM, ORACLE 11g/10g/9i, IBM DB2, Mainframes, SQL Server, PUTTY, WinSCP.
Confidential, NJ
Datastage Developer
Responsibilities:
- Involved in primary on-site ETL development during the analysis, planning, design, development, and implementation stages of projects using IBM WebSphere software (IBM InfoSphere DataStage v8.5 and 11.5).
- Worked on Datastage Upgrade from v8.5 to 11.5.
- Worked on converting OSH scripts to DataStage jobs according to the DataStage v11.5 environment.
- Created the documentation for installs, updates and regular maintenance.
- Extensively worked on tuning of Stored procedures.
- Worked on automation of ETL processes using the DataStage job sequencer, shell scripting and PL/SQL programming.
- Worked on Mainframe related Jobs and updated the Mainframe Datasets to Unix platform as per Business Requirement.
- Designed and developed various jobs using DataStage Parallel Extender stages: OCI, Hashed File, Sequential File, Aggregator, Pivot and Sort.
- Designed summary reports with the help of various SAS procedures.
- Involved in Change data capture (CDC) & Change data delivery (CDD) ETL process.
- Maintained the Data Warehouse by loading dimensions and facts as part of the project; also worked on different enhancements in FACT tables.
- Created shell scripts to run DataStage jobs from UNIX, then scheduled these scripts to run DataStage jobs through a scheduling tool.
- Developed PL/SQL procedures & functions to support the reports by retrieving the data from the data warehousing application.
- Developed complex jobs using various stages like Lookup, Join, Transformer, Dataset, Row Generator, Column Generator, Datasets, Sequential File, Aggregator and Modify Stages.
- Worked on Data cleansing and standardization using the cleanse functions in Informatica MDM.
- Worked on updating TDQ’s using SAS, as a part of development.
- Data migration from Oracle applications to excel sheet using outbound interfaces.
- Converted complex job designs to different job segments and executed through job sequencer for better performance and easy maintenance.
- Experienced in migration of Change data delivery CDD.
- Documented ETL test plans, test cases, test scripts, and validations based on design specifications for unit testing, system testing, functional testing, prepared test data for testing, error handling and analysis.
- Generated Surrogate ID’s for the dimensions in the fact table for indexed and faster access of data.
- Created hash tables with referential integrity for faster table look-up and for transforming the data representing valid information.
- Used Cobol Copy books to import the Metadata information from mainframes.
- Used DataStage Director to execute jobs, monitor execution status and view logs; also used it for scheduling jobs and batches.
Environment: IBM InfoSphere DataStage and QualityStage v8.5/11.5, SAS Management Console, SAS Information MAP Studio, MDM, ORACLE 11g/10g/9i, IBM DB2, Sybase, Mainframes, SQL Server.
Confidential, MI
Datastage Developer
Responsibilities:
- Prepared Data Mapping Documents (DMD) and designed the ETL jobs based on the DMD with the required tables in the Dev environment.
- Actively participated in decision-making and QA meetings and regularly interacted with the Business Analysts and development team to gain a better understanding of the business process, requirements and design.
- Involved in Design, Development, and Unit Testing using Informatica Power Center v9.6.
- Worked on tuning mappings; also identified and resolved performance bottlenecks at various levels, such as sources, targets, mappings and sessions, using Informatica.
- Developed stored processes required for SAS Information Delivery portal.
- Used DataStage as an ETL tool to extract data from source systems and load the data into the Oracle database.
- Designed and Developed Datastage Jobs to Extract data from heterogeneous sources, applied transform logics to extracted data and Loaded into Data Warehouse Databases.
- Created DataStage jobs using different stages like Transformer, Aggregator, Sort, Join, Merge, Lookup, Data Set, Funnel, Remove Duplicates, Copy, Modify, Filter, Change Data Capture, Change Apply, Sample, Surrogate Key, Column Generator, Row Generator, etc.
- Extensively worked with Join, Lookup (Normal and Sparse) and Merge stages.
- Helped business partners with their POCs and gave them direction in solving the given use case scenarios in InfoSphere MDM.
- Extensively used transformations like Router, Aggregator, Joiner, Expression and lookup, Update strategy, Sequence generator and Stored Procedure in Informatica.
- Extensively worked with Sequential File, Data Set, File Set and Lookup File Set stages.
- Extensively used Parallel Stages like Row Generator, Column Generator, Head, and Peek for development and de-bugging purposes.
- Used the Data Stage Director and its run-time engine to schedule running the solution, testing and debugging its components, and monitoring the resulting executable versions on ad hoc or scheduled basis.
- Developed complex stored procedures using input/output parameters, cursors, views, triggers and complex queries using temp tables and joins.
- Implemented PL/SQL scripts in accordance with the necessary business rules and procedures.
- Used PL/SQL programming to develop Stored Procedures/Functions and Database triggers.
- Coordinated with team members and administered all onsite and offshore work packages.
- Performed performance tuning of the jobs by interpreting performance statistics of the jobs developed.
- Worked on trouble shooting MDM Performance Issues.
- Developed Test Plan that included the scope of the release, entrance and exit criteria and overall test strategy. Created detailed Test Cases and Test sets and executed them manually.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python.
- Manage and review Hadoop log files.
- Involved in analysis, design, testing phases and responsible for documenting technical specifications.
- Worked hand-in-hand with the Architect; enhanced and optimized product Spark code to aggregate, group and run data mining tasks using the Spark framework.
- Hands-on experience in joining raw data with the data using Pig scripting.
Environment: IBM WebSphere DataStage 8.5/8.1 Parallel Extender, IBM InfoSphere DataStage and QualityStage (11.5), Informatica Power Center v9.6, Web Services, QualityStage 8.1 (Designer, Director, Manager), Microsoft Visio, IBM AIX 4.2/4.1, IBM DB2 Database, SQL Server, SAS 9, SAS SPDS, SAS DI-STUDIO, LSF Scheduler, SAS Management Console, SAS Information MAP Studio, MDM, ORACLE 11g/10g/9i, Oracle App Ex, MS-SQL Server 2012/2008/2005, Queryman, Unix, Windows, 4.7, ECC 6.0, Oracle 10g, Cognos 8.4, Sun Solaris 10, CDC, CDD, Tortoise SVN, UC4, HPQC.
Confidential
ETL DataStage Developer
Responsibilities:
- Extensively used DataStage for extracting, transforming, and loading databases from sources including Oracle, DB2 and Flat files.
- Collaborated with the EDW team on high-level design documents for the extract, transform, validate and load ETL process, including data dictionaries, metadata descriptions, file layouts and flow diagrams.
- Collaborated with the EDW team on the low-level design document for mapping the files from source to target and implementing business logic.
- Generation of Surrogate Keys for the dimensions and fact tables for indexing and faster access of data in Data Warehouse.
- Tuned transformations and jobs for Performance Enhancement.
- Extracted data from flat files, then transformed it according to the requirements and loaded it into target tables using various stages like Sequential File, Lookup, Aggregator, Transformer, Join, Remove Duplicates, Change Capture, Sort, Column Generator, Funnel and Oracle Enterprise.
- Created Batches (DS job controls) and Sequences to control set of jobs.
- Extensively used DataStage Change Data Capture for DB2 and Oracle files and employed change capture stage in parallel jobs.
- Executed Pre and Post session commands on Source and Target database using Shell scripting.
- Collaborated in design testing using HP Quality Center.
- Extensively worked on Job Sequences to Control the Execution of the job flow using various Activities & Triggers (Conditional and Unconditional) like Job Activity, wait for file, Email Notification, Sequencer, Exception handler activity and Execute Command.
- Collaborated in Extraction of OLAP data from SSAS using SSIS.
- Collaborated with BI and BO teams to find how reports are affected by a change to the corporate data model.
- Collaborated with BO teams in designing dashboards and scorecards for analysis and tracking of key business metrics and goals.
- Utilized Parallelism through different partition methods to optimize performance in a large database environment.
- Developed DS jobs to populate the data into staging and Data Mart.
- Executed jobs through sequencer for better performance and easy maintenance.
- Performed unit testing for the jobs developed to ensure that they meet the requirements.
- Developed UNIX shell scripts to automate file manipulation and data loading procedures.
- Collaborated in developing Java Custom Objects to derive the data using Java API.
- Responsible for daily verification that all scripts, downloads, and file copies were executed as planned, troubleshooting any steps that failed, and providing both immediate and long-term problem resolution.
- Provided technical assistance and support to IT analysts and business community.
Environment: IBM InfoSphere DataStage and QualityStage 8.5 (Administrator, Designer, Director), IBM Information Analyzer 8.0.1a, Microsoft SQL 2005/2008, IBM DB2 9.1, AIX 6.0, Oracle 11g, Toad 9.5, MS Access, shell scripts, PUTTY, WinSCP, ERWIN 4.0, HP Quality Center, Tivoli
Confidential
ETL DataStage Developer
Responsibilities:
- Used IBM Datastage Designer to develop jobs for extracting, cleaning, transforming and loading data into data marts/data warehouse.
- Developed several jobs to improve performance by reducing runtime using different partitioning techniques.
- Used different stages of DataStage Designer like Lookup, Join, Merge, Funnel, Filter, Copy, Aggregator, Sort, etc.
- Read complex flat files from mainframe machines by using the Complex Flat File stage.
- Extensively used Sequential File, Aggregator, ODBC, Transformer, Hashed File, Oracle OCI, XML, Folder and FTP Plug-in stages to develop the server jobs.
- Used the EXPLAIN PLAN statement to determine the execution plan in Oracle Database.
- Worked on complex data coming from mainframes (EBCDIC files), with knowledge of Job Control Language (JCL).
- Used Cobol Copy books to import the Metadata information from mainframes.
- Designed DataStage jobs using QualityStage stages in 7.5 for the data cleansing and data standardization process. Implemented Survive and Match stages for data patterns and data definitions.
- Staged the data coming from various environments in the staging area before loading into Data Marts.
- Involved in writing Test Plans, Test Scenarios, Test Cases and Test Scripts, and performed Unit, Integration, System and User Acceptance Testing.
- Used stage variables for source validations and to capture rejects, and used Job Parameters for automation of jobs.
- Performed debugging and unit testing and System Integrated testing of the jobs.
- Wrote UNIX shell scripts according to the business requirements.
- Wrote customized server/parallel routines according to complexity of the business requirements.
- Created shell scripts to perform validations and run jobs on different instances (DEV, TEST and PROD).
- Created and deployed SSIS (SQL Server Integration Services) projects and schemas, and configured Report Server to generate reports through SSRS on SQL Server 2005.
- Created ad hoc reports for the business users using MS SQL Server Reporting Services.
- Used SQL Profiler to monitor the server performance, debug T-SQL and slow running queries.
- Expertise in developing and debugging indexes, stored procedures, functions, triggers, cursors using T-SQL.
- Wrote mapping documents for all the ETL Jobs (interfaces, Data Warehouse and Data Conversion activities).
Environment: IBM WebSphere DataStage and QualityStage 7.5, Ascential DataStage 7.5/EE (Parallel Extender), SQL Server 2005, Linux, Teradata 12, Oracle 10g, Sybase, PL/SQL Toad, UNIX (HP-UX), Cognos 8 BI.
Confidential
ETL DataStage Developer
Responsibilities:
- Analyzed, designed, developed, implemented, and maintained Parallel jobs using IBM InfoSphere DataStage.
- Involved in design of dimensional data model - Star schema and Snowflake Schema
- Generating DB scripts from Data modeling tool and Creation of physical tables in DB.
- Worked with SCDs to populate Type I and Type II slowly changing dimension tables from several operational source files.
- Created some routines (Before-After, transform function) used across the project.
- Experienced in PX file stages that include Complex Flat File stage, Dataset stage, Lookup File Stage, Sequential file stage.
- Implemented Shared container for multiple jobs and Local containers for same job as per requirements.
- Adept knowledge and experience in mapping source to target data using IBM DataStage 8.x
- Implemented multi-node declaration using configuration files (APT Config file) for performance enhancement.
- Used DataStage stages namely Hash file, Sequential file, Transformer, Aggregate, Sort, Datasets, Join, Lookup, Change Capture, Funnel, FTP, Peek, Row Generator stages in accomplishing the ETL Coding.
- Debug, test and fix the transformation logic applied in the parallel jobs
- Extensively used DS Director for monitoring job logs to resolve issues.
- Experienced in using SQL *Loader and import utility in TOAD to populate tables in the data warehouse.
- Involved in performance tuning and optimization of DataStage mappings using features like Pipeline and Partition Parallelism to manage very large volume of data.
- Deployed different partitioning methods like Hash by column, Round Robin, Entire, Modulus, and Range for bulk data loading and for performance boost.
- Repartitioned job flow by determining DataStage PX best available resource consumption.
- Created Universes and reports in Business object Designer.
- Created, implemented, modified, and maintained the business simple to complex reports using Business objects reporting module.
Environment: IBM InfoSphere DataStage 8.5, Oracle 11g, Flat files, UNIX, Erwin, TOAD, MS SQL Server database, Mainframe COBOL, XML files, MS Access database.