
Lead Datastage Developer Resume


Houston, TX

SUMMARY

  • 6 years of IT experience across the software development life cycle (SDLC): gathering business specifications and user requirements, confirming design decisions on data, processes, and interfaces, reviewing and auditing code, documenting, and implementing and testing ETL applications for the Retail, Financial, Healthcare, and Education industries using data extraction, transformation, loading, and analysis.
  • Extensive knowledge of data modeling/architecture and database administration, with specialization in various ETL platforms (DataStage, Informatica).
  • Experience populating and maintaining data warehouses and data marts using IBM Information Server v9.1/8.7/8.5 and Ascential DataStage v7.5.x/7.x/6.x (Administrator, Director, Manager, Designer), Server Edition and Enterprise Edition (Parallel Extender), as well as IBM InfoSphere Information Analyzer, Metadata Workbench, Business Glossary, and QualityStage.
  • Experience writing complex SQL queries involving multiple tables with inner and outer joins.
  • Strong understanding of data warehouse principles, including fact tables, dimension tables, and modeling techniques such as dimensional/star schema, snowflake modeling, and Slowly Changing Dimensions (SCD Types 1, 2, and 3).
  • Experience scheduling DataStage jobs using AutoSys, crontab, Control-M, and DataStage Director.
  • Extensively used DataStage Director for executing and monitoring jobs, analyzing logs, viewing job scores for performance improvement, and scheduling jobs.
  • Extensively worked with QualityStage for data investigation, standardization, enrichment, and probabilistic matching.
  • Experience with Hadoop, HDFS, HBase, Hive, and Pig concepts.
  • Experience designing large-scale ETL solutions integrating multiple source systems: DB2, SQL Server, Oracle, and Teradata.
  • Proven track record troubleshooting DataStage jobs and addressing production issues, including performance tuning and enhancements.
  • Excellent understanding of data warehousing concepts and best practices across the full development life cycle. Expertise in enhancements/bug fixes, performance tuning, troubleshooting, impact analysis, and research.
  • Experience with production job scheduling on UNIX through the AutoSys and Control-M scheduling tools and DataStage job sequences.
  • Extensive knowledge of pipeline/partition parallelism and partitioning techniques: Round Robin, Entire, Hash by Field, Modulus, and Range.
  • Experience developing Parallel Extender (DS-PX) parallel jobs using stages including Data Set, Aggregator, Join, Transformer, Sort, Merge, Filter, Funnel, Lookup, Hash, FTP, Difference, Change Capture, Change Apply, and Pivot.
  • Strong coding skills (stage variables, constraints, routines, and derivations) implementing business logic in the Transformer stage.
  • Strong experience in UNIX shell and Perl scripting for file manipulation, scheduling, and text processing.
  • Excellent communication, interpersonal, and analytical skills, with a strong ability to understand long-term project development issues at all levels and solve problems.
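The SCD Type 2 technique listed above can be sketched in plain Python. This is a hypothetical simplification with made-up record structures; in practice the work was done with the DataStage SCD stage or equivalent SQL.

```python
from datetime import date

# Minimal SCD Type 2 sketch (hypothetical): when a tracked attribute
# changes, expire the current dimension row and append a new version.
def apply_scd2(dimension, incoming, today=date(2024, 1, 1)):
    """dimension: list of dicts with keys key, city, eff_date, end_date, current."""
    for row in incoming:
        current = next((d for d in dimension
                        if d["key"] == row["key"] and d["current"]), None)
        if current is None:
            # brand-new business key: insert first version
            dimension.append({**row, "eff_date": today, "end_date": None,
                              "current": True})
        elif current["city"] != row["city"]:
            current["end_date"] = today      # expire the old version
            current["current"] = False
            dimension.append({**row, "eff_date": today, "end_date": None,
                              "current": True})
    return dimension

dim = [{"key": 1, "city": "Houston", "eff_date": date(2020, 1, 1),
        "end_date": None, "current": True}]
dim = apply_scd2(dim, [{"key": 1, "city": "Dallas"},
                       {"key": 2, "city": "Austin"}])
# key 1 now has two versions (Houston expired, Dallas current); key 2 is new.
```

A Type 1 dimension would instead overwrite the attribute in place, and Type 3 would keep the prior value in a separate column.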

TECHNICAL SKILLS

ETL Tools: DataStage, Informatica

Programming Languages: Python, R, MATLAB, C, SQL

Web technologies: JavaScript, HTML, CSS

Relational Databases: MySQL, SQL Server, Oracle

Non-relational Database: MongoDB

Cloud Technologies: AWS, Microsoft Azure

Operating System: Linux (Ubuntu), Windows

Version Control: Git, GitHub

Visualization: Tableau, D3.js

Containers: Docker

PROFESSIONAL EXPERIENCE

Confidential, Houston, TX

Lead DataStage Developer

Responsibilities:

  • Led coordination with the business team to understand requirements (sources, targets, ETL logic, etc.) and provided the best possible ETA based on team bandwidth.
  • Worked with the development and quality teams to ensure delivery on or before the ETA (including performing tasks myself, assigning tasks, and providing knowledge transfer when needed).
  • Designed, developed, and implemented processes and controls for incremental loads using IBM DataStage 11.5 and 11.7.
  • Extracted data from disparate source systems such as Oracle, Hive, Snowflake, and CSV files.
  • Applied transformations to the extracted data per business requirements and added fields for valuation and load timestamps.
  • Used stages such as Transformer, Oracle Connector, Snowflake Connector, Join, and Lookup for loading and transformations.
  • Wrote scripts to invoke the sequence job controls and load files for delivery to the output environment.
  • Designed AutoSys jobs to trigger the controls and automate the process.
  • Loaded JSON data into Snowflake using the COPY statement for bulk loads.
  • Worked with Snowpipe to pull data from different sources into the Snowflake staging area.
  • Built controls to verify that data flows as expected through all stages (landing, staging, and mart).
  • Currently migrating from Oracle to Snowflake; designed jobs that extract data from the Oracle DB and load it into Snowflake.
  • Imported data stored in Amazon Web Services (S3) into HDFS.
  • Handled file movements between HDFS and AWS S3 and worked extensively with S3 buckets in AWS.
  • Developed Scala code for data transformation.
  • Coordinated with the offshore team to provide updates and communicate the requirements to be accomplished.
  • Worked with Hive queries and Spark/Scala to store source data in Hive.
  • Migrated code to production, drafted the run book for the developed process, and coordinated with the support team to run the process successfully.
  • Followed Agile methodology and attended daily scrum meetings.
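The landing/staging/mart controls described above amount to reconciling row counts between adjacent layers and flagging any mismatch. A minimal sketch, with hypothetical table names and SQLite standing in for Oracle/Snowflake:

```python
import sqlite3

# Hypothetical control check: reconcile row counts between the landing,
# staging, and mart layers; SQLite stands in for the real databases.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
for layer in ("landing_orders", "staging_orders", "mart_orders"):
    cur.execute(f"CREATE TABLE {layer} (order_id INTEGER, amount REAL)")

rows = [(1, 10.0), (2, 20.5), (3, 7.25)]
cur.executemany("INSERT INTO landing_orders VALUES (?, ?)", rows)
cur.executemany("INSERT INTO staging_orders VALUES (?, ?)", rows)
cur.executemany("INSERT INTO mart_orders VALUES (?, ?)", rows[:2])  # one row lost

def control_check(cur, source, target):
    """Return (source_count, target_count, ok) for one control point."""
    src = cur.execute(f"SELECT COUNT(*) FROM {source}").fetchone()[0]
    tgt = cur.execute(f"SELECT COUNT(*) FROM {target}").fetchone()[0]
    return src, tgt, src == tgt

checks = {(s, t): control_check(cur, s, t)
          for s, t in [("landing_orders", "staging_orders"),
                       ("staging_orders", "mart_orders")]}
# landing -> staging passes; staging -> mart flags the dropped row.
```

In the actual pipeline the same idea would run as a DataStage job or wrapper script and raise an alert on a failed control rather than just returning a flag.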

Environment: IBM Information Server 11.5 and 11.7 (Administrator, Designer, Director), AWS, Snowflake, AutoSys, SQL Client, Oracle, Visual Studio, UNIX, ALM defect management tool, MSTR and Power BI reporting tools.

Confidential, Cincinnati, Ohio

Datastage Developer

Responsibilities:

  • Helped define the best-practices and development-standards document for DataStage jobs.
  • Gathered and analyzed business requirements from clients and end users.
  • Designed and developed server jobs and sequences using the Designer.
  • Defined and designed processes for extracting, transforming, and loading (ETL) data from various source systems into the data warehouse.
  • Played a proactive role in creating and using DataStage shared and local containers for DS jobs and retrieving error log information.
  • Guided and helped other team members in the development process.
  • Deployed solutions that maximize the consistency and usability of the data.
  • Designed simple and complex data flows for incremental loads of different ETL interfaces.
  • Created standard rule sets accommodating various country codes, addresses, and currencies.
  • Extensively used the Slowly Changing Dimensions stage for loading dimension and fact tables.
  • Developed data extractions, transformations, and routines.
  • Used the Teradata utilities FastExport, FastLoad, and MultiLoad in DataStage.
  • Created parameter sets, UNIX shell scripts, and routines to read parameter files from database tables and pass these values to jobs at runtime.
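The runtime-parameter pattern in the last bullet can be sketched as follows. Table and parameter names are hypothetical, and SQLite stands in for the actual parameter store; the real jobs received these values through DataStage parameter sets.

```python
import sqlite3

# Hypothetical sketch: read job parameters from a database table at runtime
# and hand them to a job as a dict, mimicking a DataStage parameter set.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE etl_params "
            "(job_name TEXT, param_name TEXT, param_value TEXT)")
cur.executemany("INSERT INTO etl_params VALUES (?, ?, ?)", [
    ("load_orders", "SRC_DIR", "/data/in"),
    ("load_orders", "TGT_SCHEMA", "mart"),
    ("load_customers", "SRC_DIR", "/data/cust"),
])

def load_params(cur, job_name):
    """Build the runtime parameter dict for one job."""
    cur.execute("SELECT param_name, param_value FROM etl_params "
                "WHERE job_name = ?", (job_name,))
    return dict(cur.fetchall())

params = load_params(cur, "load_orders")
# params == {"SRC_DIR": "/data/in", "TGT_SCHEMA": "mart"}
```

Keeping parameters in a table rather than hard-coding them lets the same job design run against different environments without recompilation.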

Environment: IBM Information Server 8.1, 8.5, 8.7, and 9.1 (Administrator, Designer, Director), Oracle 11g, Python, SQL Server, SQL Client, DB2, UNIX, Windows 7, CA7, PuTTY, TFS, Power BI

Confidential, Las Vegas, NV

DataStage Developer

Responsibilities:

  • One of my key responsibilities is understanding requirements from different departments and solving them in the most efficient way possible.
  • Designed the schema and created DDL scripts for ETL metadata tables that store the runtime metrics of DataStage jobs.
  • Configured Talend jobs to load VSAM files into Hive and Snowflake databases.
  • Worked with different internal marketing teams to understand complex marketing techniques and provide robust solutions.
  • Converted existing Teradata stored procedures to DataStage jobs using the standard framework.
  • Performed code reviews and tuned implemented jobs using the Resource Estimation wizard.
  • Organized technical discussions to work through cumbersome existing use cases and find optimal solutions.
  • Pitched new approaches built around new technologies, which was a challenging task.
  • Met with data modelers and business analysts to understand the business logic of the EDW Mart Build (an internal Confidential process).
  • Worked with outside vendors (BCG, Bain, TravelClick, Prolifics, etc.) to ensure outbound and inbound data is delivered and loaded in the most efficient way.
  • Created and enhanced the migration checklist for the DataStage upgrade to ensure a smooth transition.
  • Wrote Pig scripts to analyze and process large datasets and ran the scripts on a Hadoop cluster.
  • Converted Hive/SQL queries into Scala/Spark transformations using Spark RDDs.
  • Wrote Hive queries for data analysis to meet business requirements.
  • Ran Hadoop streaming jobs to process large amounts of data in different formats.
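The ETL metadata tables mentioned above can be sketched with hypothetical DDL. Column and table names are illustrative, and SQLite syntax stands in for the warehouse DDL the actual scripts targeted:

```python
import sqlite3
from datetime import datetime

# Hypothetical run-metrics table for DataStage jobs; SQLite stands in
# for the actual warehouse DDL.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""
    CREATE TABLE etl_job_run_metrics (
        run_id       INTEGER PRIMARY KEY,
        job_name     TEXT NOT NULL,
        start_ts     TEXT NOT NULL,
        end_ts       TEXT,
        rows_read    INTEGER,
        rows_written INTEGER,
        status       TEXT CHECK (status IN ('RUNNING', 'OK', 'FAILED'))
    )
""")

def record_run(cur, job_name, rows_read, rows_written, status):
    """Insert one finished-run metrics row and return its run_id."""
    now = datetime(2024, 1, 1, 2, 0).isoformat()
    cur.execute(
        "INSERT INTO etl_job_run_metrics "
        "(job_name, start_ts, end_ts, rows_read, rows_written, status) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (job_name, now, now, rows_read, rows_written, status))
    return cur.lastrowid

run_id = record_run(cur, "edw_mart_build", 120000, 119998, "OK")
status = cur.execute("SELECT status FROM etl_job_run_metrics WHERE run_id = ?",
                     (run_id,)).fetchone()[0]
```

A table like this makes job health queryable (failed runs, row-count drift between reads and writes) instead of buried in Director logs.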

Environment: IBM Information Server 9.1, 11.5, and 11.7 (Administrator, Designer, Director), Talend, Python, SQL Client, Teradata, Hadoop, AutoSys, UNIX, Windows 7.
