Lead - Data Integration ETL Resume
Los Angeles, CA
PROFESSIONAL SUMMARY:
- 10+ years of IT experience in Business Requirements Analysis, Application Design, Database Design, Data Modeling in Data Warehousing, Development, Data Migration, Testing, Implementation, SQL Performance Tuning and ETL processes for the Retail, Media, Insurance and Agriculture & Forestry industries.
- Experience working on Cloud-Based Technologies - AWS, Python and the ASCEND tool.
- Experience in Design, Development and Implementation of Data Warehousing using Informatica ETL tools with Redshift, Netezza, Oracle, DB2, SQL Server, Teradata, Matrix and Vector databases on Windows and UNIX platforms.
- Experienced working with Informatica Big Data Edition with Hadoop - Hortonworks.
- Experienced working with SSIS to extract data from OData sources and load it into SQL Server.
- Experience in Code Migration, Data Migration, Data Integration and Data Warehousing using the ETL tool Informatica PowerCenter 9.x/8.x (Source Analyzer, Mapping/Mapplet Designer, Sessions/Tasks, Worklets/Workflow Manager).
- Expert in designing and developing complex mappings to extract data from diverse sources, including flat files, XML, RDBMS tables and legacy system files.
- Experienced working with big-volume data on projects combining Informatica with Hadoop and AWS (Amazon Web Services).
- Proficient in implementing complex business rules by creating Mappings/Mapplets and Workflows/Worklets; experienced in working with SCD (Slowly Changing Dimensions), CDC (Change Data Capture), Data Marts, OLAP and OLTP.
- Strong expertise in the design and development of database systems such as Redshift, Netezza, Oracle, SQL Server and DB2, using DB-Visualizer, Netezza Aginity, TOAD, SQL*Plus and SQL*Loader; implemented the Netezza migration from a TwinFin-12 database to Striper.
- Involved in different phases of the SDLC, from requirements and design through development, testing (Informatica Data Validation Option - DVO), rollout to field users and production support.
- Thorough knowledge of database modeling and management concepts, including conceptual, logical and physical data modeling in Data Warehousing applications.
- Experienced in the use of Agile approaches, including sprint planning, daily stand-up meetings, reviews, retrospectives, release planning, demos, Test-Driven Development and Scrum.
- Experienced in understanding business requirements/processes, converting business specifications into technical documents, and developing and integrating solutions to meet requirements per industry standards.
TECHNICAL SKILLS:
ETL: Informatica Power Center 10.1.1, 9.6, 9.1, 8.x, 7.x, Informatica MDM, Informatica DVO, Informatica IDQ 9.6x, Informatica Data Director
BI Tools: Qlikview, Tableau, Business Objects 6.5/XI R2
Big Data Ecosystems: Hadoop - Hortonworks Distribution, HDFS, Hive, Pig, Sqoop, Oozie, Spring XD
Operating Systems: Windows 95/98/2000/2003 Server/NT Server Workstation 4.0, UNIX
Scripting Languages: Python, Java, Shell Scripting, Windows Batch Scripting, EJB, JMS/MQ, R, PIG, SQL, PL/SQL, HTML, XML, Mainframes - REXX, COBOL, JCL, TSO, ENDEVOR
Other Tools: DB-Visualizer, Cloudberry, PyCharm, Atom, Bit Bucket, GitHub, Eclipse, Autosys
Methodologies: Agile, E-R Modeling, Star Schema, Snowflake Schema
Data Modeling Tool: Erwin, Sybase - Power Designer
Databases: Redshift, Netezza (Twinfin-6-12, Striper), Oracle 11g/10g/9i/8x, MS SQL Server 2008
PROFESSIONAL EXPERIENCE:
Confidential, Los Angeles, CA
Lead - Data Integration ETL
Responsibilities:
- Analyzed source data coming from different systems and worked with business users and developers to design the data integration models/patterns.
- Integrated FreeWheel Monetization Rights Management (MRM) and Revenue & Payments Management (RPM) log and analytics data feeds into the Data Lake using AWS Glue and Python.
- Integrated Google DoubleClick for Publishers (DFP) API impressions into the Data Lake and Redshift using Java and Python APIs.
- Integrated the Sports Illustrated (SI) email feed into AWS Redshift using Python.
- Integrated Neustar (DataSong), Adobe, Nat Geo, MOAT and Warren Lamb data feeds into AWS Redshift.
- Integrated Adobe Omniture analytics impressions into the Data Lake and AWS Redshift using AWS Glue and Python (a Glue job-trigger sketch appears after this list).
- Involved in building the ETL architecture and Source-to-Confidential mappings to load data into the Data Warehouse.
- Used various transformations like Filter, Expression, Sequence Generator, Update Strategy, Joiner, Stored Procedure, and Union to develop robust mappings in the Informatica Designer.
- Developed complex Informatica Mappings, Mapplets and Reusable Transformations for different types of studies, for daily and monthly data loads using Change Data Capture (CDC).
- Involved in Dimensional modeling (Star and Snowflake schema) of the Data warehouse and used Erwin to design the business process, dimensions and measured facts.
- Wrote UNIX scripts to call the Nielsen API using the NDS utility and to perform file/data processing manipulations.
- Created Table pairs and executed tests using Informatica Data Validation Option tool.
- Used Eclipse and RAD (Rational Application Developer) tools to code the Java programs.
- Tuned the Source and Confidential systems based on performance details; once both were optimized, sessions were rerun to determine the impact of the changes.
- Performed Source data systems analysis and created logical and physical data modeling for a Data Warehouse.
- Created, documented and maintained logical and physical database models in compliance with enterprise standards and maintained corporate metadata definitions for enterprise data stores within a metadata repository.
- Formulated comprehensive data migration plans with different conversion strategies and detailed object and field mappings, including transformations and business rules.
- Involved in performance tuning at the source, Confidential, mapping, session and system levels.
- Created ingestion procedures using Python for the Nielsen AMRLD, Nielsen MIT, Nielsen MRI FUSION, COMSCORE, CROSSIX, IQS, NBI, NCS, SMI and Palantir feeds into the Data Lake.
- Wrote Python scripts to create DDLs, audit-control tables and job schedules based on user inputs.
- Wrote Python scripts to replicate table data between environments - DEV, QA and PROD (a sketch of this pattern appears after this list).
- Developed Windows Batch Scripts to extract the data from Sybase tables.
- Developed SSIS packages to fetch data from various sources (OData, SharePoint), stage it, perform calculations and deploy the results.
- Deployed SSIS packages to remote servers and worked with SQL Server Agent to schedule the jobs.
- Worked on the ASCEND tool using Google BigQuery and AWS.
- Worked with Hive, Sqoop, Pig and Python as middle layers between Informatica and the database.
- Performed data profiling using tools such as Datawatch Monarch.
- Created Dashboards and Visualizations in Qlikview, Qlik Sense and Tableau for NEO and Nielsen Projects.
- Documented ETL test plans, test cases, test scripts, test procedures, assumptions and validations based on design specifications for unit and system testing; prepared and loaded test data, defined expected results, and performed error handling and analysis.
- Prepared migration documents to move the mappings from development to testing and then to production repositories.
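As a hedged illustration of the Glue-based feed loads above, the snippet below shows how such a job can be triggered from Python with boto3; the job name and argument keys are hypothetical placeholders, not the actual project's.

```python
import boto3  # AWS SDK for Python

def run_feed_job(job_name, feed_date):
    """Kick off a Glue ETL job for one day's feed and return the run id;
    the job name and argument keys are illustrative placeholders."""
    glue = boto3.client("glue")
    response = glue.start_job_run(
        JobName=job_name,
        Arguments={"--feed_date": feed_date})  # forwarded to the Glue script
    return response["JobRunId"]

# Example (hypothetical job name):
# run_feed_job("mrm_log_feed_to_datalake", "2018-06-01")
```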
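The cross-environment table replication noted above can be outlined as follows; this is a minimal sketch assuming Redshift clusters reachable through psycopg2, where the S3 staging path, IAM role and connection strings are hypothetical placeholders rather than actual project values.

```python
import psycopg2  # assumed driver; Redshift speaks the Postgres wire protocol

# Hypothetical staging location and role, not actual project values.
S3_STAGE = "s3://example-staging-bucket/replicate"
IAM_ROLE = "arn:aws:iam::123456789012:role/ExampleRedshiftCopyRole"

def replicate_table(src_dsn, tgt_dsn, table):
    """UNLOAD a table from the source cluster to S3, then COPY it into
    the target cluster (e.g. DEV -> QA)."""
    prefix = "{}/{}/".format(S3_STAGE, table)
    with psycopg2.connect(src_dsn) as src:
        with src.cursor() as cur:
            # UNLOAD writes gzipped slices of the table under the S3 prefix.
            cur.execute(
                "UNLOAD ('SELECT * FROM {}') TO %s IAM_ROLE %s "
                "GZIP ALLOWOVERWRITE".format(table),
                (prefix, IAM_ROLE))
    with psycopg2.connect(tgt_dsn) as tgt:
        with tgt.cursor() as cur:
            cur.execute("TRUNCATE {}".format(table))
            # COPY loads every slice under the prefix into the target table.
            cur.execute(
                "COPY {} FROM %s IAM_ROLE %s GZIP".format(table),
                (prefix, IAM_ROLE))

# Example: replicate_table(DEV_DSN, QA_DSN, "ad_impressions_fact")
```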
Environment: Informatica Power Center 10.1.1/9.6, Informatica DVO, Datawatch Monarch, Python 2.7.5 and 3.4.8, AWS, Redshift, Netezza, CA-Workstation, Netezza Administrator, SQL Server, Matrix, Vector, ESP, REXX, UNIX, ERWIN, FileZilla, Putty, Hive, Big Query, WINSCP, Autosys, Visual Studio, Attunity-MFT, Attunity DB Replicator, GitHub, Bit Bucket, DB-Visualizer.
Confidential, Moline, IL
Lead Informatica Developer
Responsibilities:
- Collaborated with Business Analysts on requirements gathering, business analysis and design of the Enterprise Data Warehouse.
- Worked on CDC (Change Data Capture) to implement SCD (Slowly Changing Dimensions).
- Involved in unit testing and user acceptance testing to verify that data extracted from the different source systems loaded correctly into Confidential.
- Used transformations like Joiner, Expression, Connected and Unconnected Lookups, Filter, Aggregator, Stored Procedure, Rank, Update Strategy, Router and Sequence Generator.
- Used Workflow Manager for workflow and session management, database connection management and scheduling of jobs to run in the batch process.
- Developed complex Informatica Mappings, Mapplets and Reusable Transformations for different types of studies, for daily and monthly data loads using Change Data Capture (CDC).
- Designed and customized data models for a Data Warehouse supporting data from multiple sources in real time.
- Created mapping documents to outline data flow from sources to targets.
- Developed multiple logical and physical data models for the parolee database and the staging environments to support migration.
- Performed profiling, cleansing and auditing during migration.
- Performed match/merge and ran match rules to check the effectiveness of MDM process on data.
- Used Hierarchies tool for configuring entity base objects, entity types, relationship base objects, relationship types and profiles.
- Worked with Services Integration Framework (SIF), Web services (SOAP/REST), EJB modules and JMS/MQ.
- Used various SIF services - such as ExecuteBatchGroup, CleanTable and CleansePutRequest - to access master data from downstream applications.
- Used the Informatica Data Quality 8.6 (IDQ) toolkit for analysis, data cleansing, data matching, data conversion, exception handling, and the reporting and monitoring capabilities of IDQ 8.6.
- Profiled source data using the IDQ tool to understand source-system data representation formats and data gaps.
- Created an exception handling process and used Informatica Data Director (IDD) to view the error tables and perform data manipulations, accepting or rejecting records based on the requirements.
- Integrated Informatica Data Quality (IDQ) with Informatica Power Center; created data quality mappings in IDQ and imported them into Power Center as mappings and Mapplets.
- Worked on projects using Informatica BDE with Hadoop and Amazon Web Services (S3, Elasticsearch and Kibana).
- Worked on projects using Hive, Sqoop, Pig and Python as middle layers between Informatica and Netezza Analytics Database.
- Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
- Worked on Teradata, loading COBOL files into Teradata tables using FastLoad (FLoad).
- Knowledge of Teradata load utilities, including FastLoad, MultiLoad (MLoad), TPump and BTEQ.
- Wrote UNIX scripts to merge and process Batch files for net-change files.
- Used Python scripts to pull records from AWS S3 and to get the latest version from the SVN repository (see the sketch after this list).
- Created Table pairs and executed tests using Informatica Data Validation tool.
- Extensively worked with ESP to schedule Informatica and Hadoop jobs using REXX and JCL.
- Experienced working with Netezza servers - Twinfin-6, Twinfin-12 and Striper.
- Worked with big-volume data (terabytes, billions of rows) using Netezza features - Distribution, Organize, nz_migrate, Groom and Statistics.
- Implemented Netezza migration from Twinfin-12 Database server to Striper server.
- Experience with code migration between repositories and folders.
- Served in an Informatica Administrator role, performing code migration, deployment groups and server management.
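A rough illustration of the S3 pull noted above: list the objects under a prefix with boto3 and download the newest one. The bucket, prefix and destination names are assumptions for illustration only.

```python
import os
import boto3  # AWS SDK for Python

def pull_latest(bucket, prefix, dest_dir):
    """Download the most recently modified object under an S3 prefix;
    returns the local path, or None when nothing is found."""
    s3 = boto3.client("s3")
    listing = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
    objects = listing.get("Contents", [])
    if not objects:
        return None  # no feed files delivered yet
    latest = max(objects, key=lambda obj: obj["LastModified"])
    local_path = os.path.join(dest_dir, os.path.basename(latest["Key"]))
    s3.download_file(bucket, latest["Key"], local_path)
    return local_path

# Example (hypothetical names):
# pull_latest("example-feed-bucket", "incoming/nielsen/", "/data/landing")
```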
Environment: Informatica Power Center 9.6/9.1, Informatica MDM 9.7.1, Informatica BDE, Informatica IDD and IDQ 9.6x, Informatica DVO, Netezza, Aginity workbench, Netezza Administrator, Oracle, SIF, Web Services, Teradata, ERWIN, ER-Studio, Salesforce, FileZilla, Sybase - Power Designer, Hadoop, AWS (S3, Elastic Search and Kibana), Web-logic, Putty, Message Queues, WINSCP, Soap-UI, WebLogic, Python, TOAD, JCL, Flat Files, REXX, UNIX, ESP, DB2 and Mainframes.
Confidential
Lead Informatica Developer
Responsibilities:
- Responsible for Business Analysis and Requirements Collection.
- Worked on Informatica Power Center tools - Designer, Repository Manager, Workflow Manager and Workflow Monitor.
- Parsed high-level design specification into simple ETL coding and mapping standards.
- Designed and customized data models for a Data Warehouse supporting data from multiple sources in real time.
- Involved in building the ETL architecture, Change Data Capture (CDC) and Source to Confidential mapping to load data into Data warehouse.
- Created mapping documents to outline data flow from sources to targets.
- Involved in Dimensional modeling (Star Schema) of the Data warehouse and used Erwin to design the business process, dimensions and measured facts.
- Extracted the data from the flat files and other RDBMS databases into staging area and populated onto Data warehouse.
- Used various transformations like Filter, Expression, Sequence Generator, Update Strategy, Joiner, Stored Procedure, and Union to develop robust mappings in the Informatica Designer.
- Configured Landing, Staging and Loading processes, Trust and Validation rules, Match and Merge process using Informatica MDM.
- Performed match/merge and ran match rules to check the effectiveness of MDM process on data.
- Used Hierarchy Manager for configuring entity base objects, entity types, relationship base objects, relationship types and profiles.
- Experienced working with Services Integration Framework (SIF), EJB modules and Web services.
- Worked on Real Time Integration between MDM Hub and External Applications using Power Center and SIF API for JMS.
- Comprehensive knowledge and experience in process improvement, normalization/de-normalization, data extraction, data cleansing and data manipulation.
- Involved in Data Modeling, System/Data Analysis, Design and Development for both OLTP and Data warehousing environments.
- Worked on Data Modeling (Dimensional & Relational) concepts such as physical and logical data modeling.
- Worked with big volume data using Netezza features - Distribution, Organize, nz migrate, Groom and Statistics.
- Used transformations like Joiner, Expression, Connected and Unconnected Lookups, Filter, Aggregator, Stored Procedure, Rank, Update Strategy, Router and Sequence Generator.
- Used Workflow Manager for workflow and session management, database connection management and scheduling of jobs to run in the batch process.
- Developed a number of complex Informatica Mappings, Mapplets and Reusable Transformations for different types of studies, for daily and monthly data loads.
- Used stored procedures to drop and re-create indexes before and after loading data into the targets.
- Removed bottlenecks at source level, transformation level, and Confidential level for the optimum usage of sources, transformations and Confidential loads.
- Wrote UNIX shell scripts and pmcmd commands to FTP files from remote servers and to back up the repository and folders (a pmcmd wrapper is sketched after this list).
- Involved in code migration between repositories and folders.
- Prepared migration document to move the mappings from development to testing and then to production repositories.
- Tuned the Source and Confidential systems based on performance details; once both were optimized, sessions were rerun to determine the impact of the changes.
- Interfaced with and supported QA/UAT groups to validate functionality.
- Extensively used Eclipse tool to write the Java programs.
- Created Single Table pairs and executed test cases using Informatica Data Validation tool.
- Used Power Center sources, SQL views, join views and lookup views in Informatica Data Validation tool to test the scenarios.
- Generated various dashboards on Tableau Server using data sources such as Oracle, Netezza and DB2; created report schedules, data connections, projects and groups.
- Worked closely with business power users to create reports/dashboards using Tableau Desktop.
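The pmcmd work noted above can be scripted roughly as below: a minimal Python wrapper around the standard pmcmd startworkflow call, with the integration service, domain and credentials as hypothetical placeholders.

```python
import subprocess

# Integration service, domain and credentials below are placeholders.
PMCMD_ARGS = ["-sv", "IntSvc_Dev", "-d", "Domain_Dev",
              "-u", "etl_user", "-p", "XXXX"]

def start_workflow(folder, workflow):
    """Start an Informatica workflow via pmcmd and wait for it to finish;
    a non-zero pmcmd exit code raises CalledProcessError."""
    cmd = (["pmcmd", "startworkflow"] + PMCMD_ARGS +
           ["-f", folder, "-wait", workflow])
    subprocess.check_call(cmd)

# Example: start_workflow("DW_LOADS", "wf_daily_sales_load")
```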
Environment: Informatica Power Center 9.1, Informatica MDM 9.5, Informatica IDD and IDQ, Informatica DVO, Netezza, Aginity Workbench, Netezza Administrator, Oracle, SIF, Web Services, ERWIN, ER-Studio, Putty, ALTOVA XMLSPY, WINSCP, FileZilla, JBoss, Python, TOAD, Flat Files, UNIX, REXX, JCL, ESP, DB2, Tableau, and Mainframes.
Confidential
Informatica Developer
Responsibilities:
- Analyzed source data coming from different systems and worked with business users and developers to develop the model.
- Translated requirements into business rules & made recommendations for innovative IT solution.
- Used transformations like Joiner, Expression, Connected and Unconnected Lookups, Filter, Aggregator, Stored Procedure, Rank, Update Strategy, Router and Sequence Generator.
- Identified and tracked slowly changing dimensions and heterogeneous sources, and determined the hierarchies in dimensions.
- Used Workflow Manager for workflow and session management, database connection management and scheduling of jobs to run in the batch process.
- Developed a number of complex Informatica Mappings, Mapplets and Reusable Transformations for different types of studies, for daily, monthly and yearly data loads.
- Worked extensively on Erwin and ER Studio in several projects in both OLAP and OLTP applications.
- Involved in various projects related to Data Modeling (Physical and Logical), System/Data Analysis, Design and Development for both OLTP and Data warehousing environments.
- Used stored procedures to create a standard Time dimension and to drop and re-create indexes before and after loading data into the targets.
- Removed bottlenecks at source level, transformation level, and Confidential level for the optimum usage of sources, transformations and Confidential loads.
- Created Mappings, Mapplets and Transformations that remove duplicate records from the source.
- Extensively worked in the performance tuning of the programs, ETL Procedures and processes.
- Wrote documentation describing program development, logic, coding, testing, changes and corrections.
- Prepared ETL mapping documents for every mapping, and a data migration document for smooth transfer of the project from development to testing (DVO) and then to production.
- System support: assisted the client with User Acceptance Testing for new releases.
- Tuned the Source and Confidential systems based on performance details; once both were optimized, sessions were rerun to determine the impact of the changes.
- Used Type 1 and Type 2 SCD mappings to update Slowly Changing Dimension tables (the Type 2 pattern is sketched after this list).
- Used Debugger to test the mappings and fixed the bugs.
- Involved in performance tuning at the source, Confidential, mapping, session and system levels.
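For illustration, the Type 2 pattern noted above amounts to expiring the current row version and inserting a new one. The sketch below shows that logic in plain Python; the key and tracked-attribute names are assumptions, and the production logic lived in Informatica mappings.

```python
from datetime import date

HIGH_DATE = date(9999, 12, 31)  # open-ended expiry marking the current version

def apply_scd2(dim_rows, incoming, load_date):
    """Type 2 logic: expire the current version of a changed row and
    insert the new version; unchanged rows are left alone."""
    current = next((row for row in dim_rows
                    if row["cust_id"] == incoming["cust_id"]
                    and row["expiry_dt"] == HIGH_DATE), None)
    new_version = dict(incoming, effective_dt=load_date, expiry_dt=HIGH_DATE)
    if current is None:
        dim_rows.append(new_version)          # brand-new natural key
    elif current["address"] != incoming["address"]:
        current["expiry_dt"] = load_date      # close the old version
        dim_rows.append(new_version)          # open the new version
```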
Environment: Informatica Power Center 8.6, Power Center Designer, workflow manager, workflow monitor, Informatica DVO, Oracle 11g, Oracle, Netezza, SQL Server 2008, ERWIN, Sybase - Power Designer, Toad, SQL-Loader, XMLs, Mainframe flat files, Shell Scripting, Control-M, UNIX (AIX 5.3), Z/OS, COBOL, JCL, and DB2.
Confidential
Informatica Developer
Responsibilities:
- Understood the business requirements and enhanced the existing data warehouse architecture design for better performance.
- Participated in designing the data model using Erwin.
- Documented user requirements, translated requirements into system solutions, and developed implementation plans and schedules.
- Identified and tracked the slowly changing dimensions, heterogeneous Sources and determined the hierarchies in dimensions.
- Created users and user groups with appropriate privileges and permissions, folders and folder permissions in Repository manager.
- Created CDC (Change Data Capture) mappings to extract net-change data using mapping variables (see the high-water-mark sketch after this list).
- Extensively used Expression, Aggregator, Lookup & Update Strategy transformations.
- Developed shell scripts to monitor and maintain all Informatica Files.
- Created and Monitored Batches and Sessions using Informatica Power Center Server.
- Managed the Metadata associated with the ETL processes used to populate the data warehouse.
- Responsible for tuning ETL procedures and star schemas to optimize load and query performance.
- Wrote documentation describing program development, logic, coding, testing, changes and corrections.
- Migration of mappings, sessions and workflows from Development to Stage and Production environments.
- Produced documentation (data mapping and migration documents) for all mappings and workflows.
- Involved in unit and integration testing of Informatica sessions, batches and the Confidential data.
- Prepared ETL mapping documents for every mapping, and a data migration document for smooth transfer of the project from development to testing and then to production.
- Prepared and used test data/cases to verify the accuracy and completeness of the ETL process.
- Tested all deliverables with user support; assisted the client with User Acceptance Testing for new releases.
- Actively involved in the production support and transferred knowledge to the other team members.
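The mapping-variable CDC approach noted above is, in essence, a persisted high-water mark. Below is a hedged plain-Python sketch of the same idea; the state file, table and column names are illustrative, and the '?' placeholder assumes a DB-API driver using the qmark paramstyle.

```python
import json
from pathlib import Path

# The state file plays the role of the repository-persisted mapping variable.
STATE = Path("cdc_state.json")

def extract_net_change(conn, table, ts_col):
    """Select only rows changed since the last run, then advance the mark."""
    last = (json.loads(STATE.read_text())["last_ts"]
            if STATE.exists() else "1900-01-01 00:00:00")
    cur = conn.cursor()
    cur.execute("SELECT * FROM {} WHERE {} > ?".format(table, ts_col), (last,))
    rows = cur.fetchall()
    cur.execute("SELECT MAX({}) FROM {}".format(ts_col, table))
    newest = cur.fetchone()[0]
    if newest is not None:
        STATE.write_text(json.dumps({"last_ts": str(newest)}))  # move the mark
    return rows
```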
Environment: Informatica Power Center 8.1, Oracle9i, SQL Server, Erwin, UNIX Shell Scripting, MS Visio, TOAD, ERWIN, SQL-Loader, Mainframes, Control-M, ESP, CA-Workstation, PL/SQL Developer, UNIX, Z/OS, COBOL, JCL, DB2.
Confidential
Informatica Developer
Responsibilities:
- Gathered requirements from end users and was involved in the analysis of source systems, business requirements and identification of business rules.
- Responsible for the analysis, design and implementation of various data marts for back-office modules (Financial, Payroll, Benefits and HR) using data modeling techniques.
- Deployed Informatica objects and Business Objects universes to TEST, UAT and Production.
- Involved in the development of Informatica Mappings, CDC (Change Data Capture), Mapplets and Workflows for complex business rules.
- Various Transformations like Expression, Lookups, Filters, Sequence Generator, Joiner, and Sorter were used to handle situations depending upon the requirement.
- Implemented Type 2 slowly changing dimensions to keep track of historical data.
- Performed incremental aggregation to load incremental data into aggregate tables (see the sketch after this list).
- Made substantial contributions in simplifying the development and maintenance of ETL by creating re-usable Source, Confidential, Mapplets, and Transformation objects.
- Involved in moving Mappings, Sessions and Workflows between development and production environments. Worked on Power Exchange for change data capture (CDC).
- Involved in unit and integration testing of Informatica sessions, batches and the Confidential data.
- Migrated Informatica objects and database objects to the Integration environment.
- Performed Unit Testing and verified the data using Informatica Debugger break points.
- Tuned mappings and sessions for better performance on the data loads.
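Incremental aggregation, as noted above, folds only the new increment into an existing aggregate rather than recomputing it from scratch. Here is a minimal sketch of the idea in plain Python, with illustrative keys and measures (the actual loads used Informatica's incremental aggregation feature).

```python
def incremental_aggregate(agg, new_rows):
    """Fold today's detail rows into a running aggregate in place;
    agg maps (region, product) -> sales total, and new_rows holds
    (region, product, amount) tuples for the new increment only."""
    for region, product, amount in new_rows:
        key = (region, product)
        agg[key] = agg.get(key, 0.0) + amount
    return agg

# Example: totals = incremental_aggregate(totals, todays_rows)
```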
Environment: Informatica Power Center 7.x/8.x, Oracle, SQL Server, UNIX Shell Scripting, MS Visio, SQL Developer, Z/OS, COBOL, JCL, REXX and DB2.