ETL Talend Developer Resume
SUMMARY:
- Around 8 years of experience in the IT industry focused on ETL solution implementations, data warehousing, data quality, and development - complete SDLC involving system study, analysis, design, development, and implementation of client/server application software.
- Experience in developing ETL processes supporting data extraction, transformation, and loading using Informatica Power Center.
- Experience in working with Informatica components like workflows, mappings, mapplets, sessions, tasks, User Defined Functions (UDF), Debugger, partitioning, reusable components and extensively worked with session logs, workflow logs for error handling and troubleshooting on mapping failures.
- Developed complex mappings using different transformations like Expressions, Filters, Joiners, Routers, Union, Update Strategy, Unconnected / Connected Lookups, Normalizer and Aggregator.
- Experience in using the Informatica command line utilities like pmcmd to execute workflows in non-Windows environments (see the pmcmd sketch after this list).
- Extensively used Informatica Repository Manager to migrate Informatica Code (XML, folders, Deployments groups) across various environments.
- Experience in Performance Tuning of sources, targets, transformations and sessions.
- Experience in documenting the ETL process flow for better maintenance and analyzing the process flow.
- Worked with Stored Procedures, Triggers, Cursors, Indexes and Functions.
- Worked on HiveQL to fetch data from source systems and validate historic and incremental loads.
- Experience in UNIX shell scripting, CRON, FTP and file management in various UNIX environments.
- Experience with industry standard methodologies like waterfall, Agile within the software development life cycle.
- Worked on Netezza DB, T-SQL queries, complex Stored Procedures, User Defined Functions (UDF)
- Good understanding in database and data warehousing concepts (OLTP & OLAP).
- Experience in developing Unix Shell Scripts for automation
- Experienced in Database Installation, Configuration, Maintenance, Monitoring, Backup and Disaster Recovery procedure and Replication
- Experience in Data Warehouse applications testing using Informatica
- Involved in gathering requirements, developing Test Plans, Test Scripts, Test Cases & Test Data using specifications and design documents
- Excellent understanding of the System Development Life Cycle. Involved in analysis, design, development, testing, implementation, and maintenance of various applications.
- Performed Manual and Automated Testing on Client-Server and Web-based Applications.
- Extensive experience in drafting Test Flows, Test Plans, Test Strategies, Test Scenarios, Test Scripts, Test Specifications, Test Summaries, Test Procedures, Test cases & Test Status Reports.
- Strong troubleshooting and problem-solving skills. Strong ability to prioritize multiple tasks. Excellent communication skills, proactive, self-managing and teamwork spirits.
- Well-organized, goal-oriented, highly motivated effective team member with excellent analytical, troubleshooting, and problem-solving skills.
- Strong technical background with excellent analytical and communication skills, creativity, leadership qualities, team spirit, and above all a positive attitude to successfully shoulder any responsibility or challenge.
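A minimal sketch of the pmcmd usage mentioned above, wrapped in Python for consistency with the other examples on this page; the Integration Service, domain, folder, credentials, and workflow names are placeholders, not actual project values:

```python
# Minimal sketch: starting an Informatica workflow with pmcmd from Python.
# Service, domain, folder, and workflow names below are placeholders.
import subprocess

def start_workflow(workflow: str, folder: str = "DW_LOAD") -> int:
    """Kick off an Informatica workflow with pmcmd and wait for completion."""
    cmd = [
        "pmcmd", "startworkflow",
        "-sv", "INT_SVC",          # Integration Service name (placeholder)
        "-d", "Domain_DEV",        # domain name (placeholder)
        "-u", "etl_user",          # credentials would normally come from a vault
        "-p", "etl_password",
        "-f", folder,              # repository folder
        "-wait",                   # block until the workflow finishes
        workflow,
    ]
    result = subprocess.run(cmd, capture_output=True, text=True)
    print(result.stdout)
    return result.returncode       # non-zero return code signals a failed run

if __name__ == "__main__":
    raise SystemExit(start_workflow("wf_stg_to_dwh"))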
TECHNICAL SKILLS:
ETL Tools: Informatica Power Center 10/9.6/9.5/9.1/8.6/8.1 (Source Analyzer, Mapping Designer, Workflow Monitor, Workflow Manager, Power Connects for ERP and Mainframes, Power Plugs), Informatica Data Quality 9.5.1/9.1(Developer, Analyst), Power Exchange, Power Connect, Data Validation Option (DVO)
Databases: Oracle 12c/11g/10g, DB2, MS SQL Server 2008/2005, HiveQL, Netezza, Snowflake DB
Data Modeling: Star Schema and Snowflake Modeling, Enterprise Data Vault, FACT and Dimension Tables, Physical and Logical Data Modeling, ER Studio, Erwin
Tools: MS Office 2013/2010/2007/2003, TOAD, Microsoft Visio, Putty, Erwin, ER Model, HP Quality Center, MobaXterm
Programming Languages: Unix Shell Scripting, Oracle PL/SQL (Stored Procedures, triggers), Python 3.7
Scheduling Tools: JCL, Control-M, Tidal, Autosys
WORK EXPERIENCE:
Confidential, Plano
ETL Talend Developer
Responsibilities:
- Designed and developed end-to-end ETL processes from various source systems to the staging area, and from staging to AWS S3.
- Developed Python scripts to parse nested JSON and XML files into CSV using the pandas library (see the sketch after this list)
- Worked on shell script to extract data from AWS Egress bucket into CDW Egress based on SLA time.
- Optimized process efficiency and reduced SLAs; developed robust cross-platform quality control processes following Agile principles while partnering with business teams to build high-performance data models
- Worked on Python Script to add audit columns and then processed the data in Snowflake DB
- Designed, developed, and optimized legacy jobs to reduce run time
- Provided L3 support to business users for job failures
- Developed ETL jobs from various sources such as SQL, SAP, and other files and loaded them into the Snowflake database using the Talend ETL tool
- Built robust architectural frameworks enabling seamless integration of migrating systems into the high-volume, governed financial data within Snowflake for the data science team and BI analysts, using Talend Management Console and Talend Remote Engine in a secure, scalable AWS environment.
- Published Jobs to TMC and scheduled these jobs through Autosys
- Interacted with business and gathered requirements based on changing needs
- Worked with parallel connectors for parallel processing to improve job performance while working with bulk data
- Used AWS (Amazon Web Services) components - downloading and uploading data files (with ETL) to AWS using the Talend S3 components.
- Worked on joblets and Java routines in Talend
- Managed the script that loads data in the daily record ingestion process
- Created technical design documents for source-to-stage and stage-to-target mappings
- Created complex jobs using components like tMap, tFilterRow, tJava, tOracle, tXMLMap, the delimited file components, tLogRow, tlogback, tLogCatcher, and tStatCatcher
- Scheduled Talend jobs using Job Conductor (scheduling tool) and Autosys
- Assisted in migrating the existing data center into the AWS environment
- Designed the Process Control Table that would maintain the status of all the CDC jobs and thereby drive the load of Derived Master Tables
- Used Git for version control for Talend Jobs
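A minimal sketch of the JSON/XML-to-CSV flattening described above, assuming a recent pandas (1.0+) for json_normalize; the file names, record path, and XML tags are illustrative placeholders, not project values:

```python
# Minimal sketch: flatten a nested JSON feed and a simple XML feed to CSV.
import json
import xml.etree.ElementTree as ET
import pandas as pd

def json_to_csv(src: str, dest: str) -> None:
    with open(src) as fh:
        payload = json.load(fh)
    # json_normalize flattens nested objects into dotted column names
    df = pd.json_normalize(payload, record_path="orders", meta=["customer_id"])
    df.to_csv(dest, index=False)

def xml_to_csv(src: str, dest: str) -> None:
    root = ET.parse(src).getroot()
    rows = [
        {child.tag: child.text for child in record}
        for record in root.findall("record")
    ]
    pd.DataFrame(rows).to_csv(dest, index=False)

if __name__ == "__main__":
    json_to_csv("orders.json", "orders.csv")
    xml_to_csv("feed.xml", "feed.csv")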
Environment: Talend 7.2 Cloud Version, Snowflake, Unix, Python 3.7, Pandas, NumPy, AWS EC2, EKS, S3, Git, Docker
Confidential, MD
Informatica ETL Developer
Responsibilities:
- Developed Low-Level Design (LLD) documents capturing data element mappings and SQL queries proving relationships across source files
- Load the data into data warehouse from different sources like flat files, COBOL Files, Oracle, Salesforce and review logic that needs to be incorporated in the architecture to ensure the data meets the requirements.
- Worked on automating the ETL process by creating automation scripts for different business requirements
- Develop Informatica Mappings and Reusable Transformations to facilitate timely Loading of Data of star schema and snowflake schema.
- Worked on creating complex business logic by creating data models for business requirements and translating the models into database structures, objects, and schemas
- Developed reusable mapplets and used the logic designed in other mappings
- Developed PL/SQL procedures/packages to execute the SQL*Loader control files/procedures to load the data into Oracle.
- Develop scripts for the creation of the physical data warehouse tables, views and macros.
- Designed and Created data warehouse structure tables on database
- Perform data profiling, cleansing, validation and verification with Informatica Power center and SQL Stored Procedure
- Monitor and troubleshoot Extract-Transform-Load (ETL) and system performance, including optimization and tuning of data loads.
- Involved in creating Jobs to improve the load process of data into physical tables
- Worked closely with DBA to deploy the code and scheduling jobs on Tidal and provide support in case of any failure.
- Understand job and activity schedules to promote optimal performance, maximize availability and minimize data latency and data load duration
- Capture data changes with the help of SCD Type 1 and Type 2 mappings and utilize MD5 to accomplish dimension, fact, and error record loading (see the MD5 change-detection sketch after this list).
- Load data from different sources like Oracle, SQL Server, and flat files and store it in data warehouse tables on Netezza
- Used Snowpipe to continuously load micro-batches of data into staging tables for transformation and optimization, using automated tasks and the change data capture (CDC) information in streams.
- Used bulk loading to load batches of data from files on the local server to Snowflake cloud storage.
- Implemented Slowly Changing Dimensions in ETL to store history data
- Used various transformations like Union, Filter, Router, Sequence Generator, Lookup, Update Strategy, Joiner, Source Qualifier, Expression, Sorter, and Aggregator to design the application and load structured data
- Worked on HTTPS and Java transformations to call third-party APIs, parsing the responses through the Java transformation to load data into the data warehouse
- Worked on enhancing mappings for new business requirements
- Developed UNIX shell scripts to archive files, capture reject files, and send them to business users for validation
- Monitor production support jobs and provide fixes to cure software defects/failures
- Deploy Informatica Mappings, Unix Shell Scripts and SQLs in QA and production environments.
- Monitored ETL production batch jobs; investigated and fixed job failures.
- Designed and developed end-to-end ETL processes from various source systems to the staging area, and from staging to AWS S3.
- Analyzed the source data to assess data quality using Talend Data Quality.
- Broad design, development and testing experience with Talend Integration Suite and knowledge in Performance Tuning of mappings.
- Developed jobs in Talend Enterprise edition from stage to source, intermediate, conversion and target.
- Built a POC to populate data in the front end by calling REST APIs and creating Swagger docs
- Pushed the code to Git and created pull requests
- Used tStatCatcher, tDie, and tLogRow to create a generic joblet to store processing stats.
- Solid experience in implementing complex business rules by creating re-usable transformations and robust mappings using Talend transformations like tConvertType, tSortRow, tReplace, tAggregateRow, tUnite etc.
- Developed Talend jobs to populate the claims data-to-data warehouse - star schema
- Scheduled Jobs on TAC server for automation
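A minimal sketch of the MD5 change-detection idea referenced above for SCD loading; the key and tracked column names are placeholders, and in the actual project the comparison ran inside Informatica mappings rather than standalone Python:

```python
# Minimal sketch: hash tracked attribute columns per row and compare against
# the hash stored on the dimension to decide insert / update / no change.
import hashlib
from typing import Dict

TRACKED_COLS = ["name", "address", "status"]   # SCD Type 2 tracked attributes (placeholders)

def row_hash(row: Dict[str, str]) -> str:
    """MD5 over a pipe-delimited concatenation of the tracked columns."""
    raw = "|".join(str(row.get(col, "")) for col in TRACKED_COLS)
    return hashlib.md5(raw.encode("utf-8")).hexdigest()

def classify(incoming: Dict[str, str], existing_hashes: Dict[str, str]) -> str:
    """Return 'insert', 'update', or 'no_change' for one incoming record."""
    key = incoming["customer_key"]
    new_hash = row_hash(incoming)
    if key not in existing_hashes:
        return "insert"                 # brand-new dimension member
    if existing_hashes[key] != new_hash:
        return "update"                 # attributes changed: expire old row, insert new version
    return "no_change"

# Example: existing_hashes would normally be pulled from the dimension table.
print(classify({"customer_key": "C1", "name": "Acme", "address": "NY", "status": "A"}, {}))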
Environment: Informatica Power Center 10.2 (Source Analyzer, Warehouse Designer, Mapping Designer, Transformation Developer, Mapplet Designer, Repository Manager, Workflow Manager), Oracle 10g and Oracle 12c, PL/SQL, Netezza, Salesforce Loader, Unix Shell Scripts, Tidal, Talend Open Studio, AWS S3, TAC.
Confidential, MD
ETL developer
Responsibilities:
- Design, Development and Documentation of the ETL (Extract, Transformation & Load) strategy to populate the data store from various source systems.
- Create complex data loading processes for Operational Data Stores, Data Warehouses and Data Marts that satisfy all functional and technical requirements
- Worked closely with Data governance team and Data warehouse architect to understand the source data and need of the Warehouse.
- Involved in analysis of data in multiple source systems and worked with DGS for creating source to target mappings to load data into Data warehouse.
- Extensively used Informatica 10 to load data from flat file sources and Oracle.
- Used the Normalizer transformation while loading data from flat files into the Oracle staging area.
- Involved in design and development of complex ETL mappings to load the data from various source systems to data store and generating outbound XML files.
- Created mappings to read inbound XML files to load the error reports into data store.
- Based on the requirements, used various transformation like Source Qualifier, Normalizer, Expression, Filter, Router, Update strategy, Sorter, Lookup, Aggregator, Joiner, Stored procedure transformations in the mapping.
- Developed Informatica SCD Type-II mappings to keep track of historical data.
- Developed mapplets and worklets for reusability.
- Developed workflow tasks like reusable Email, Event wait, Timer, Command, Decision.
- Involved in performance tuning of mappings, transformations, and workflow sessions to optimize session performance.
- Used Informatica debugging techniques to debug the mappings and used session log files and bad files to trace errors occurred while loading.
- Created UNIX shell scripts, called using the command task, to FTP the outbound files (see the FTP sketch after this list).
- Developed Documentation for all the routines (Mappings, Sessions and Workflows).
- Worked on the mainframe to create JCL to run Informatica workflows in test environments and created ZMF packages to migrate and schedule the JCLs in the production environment.
- Creating Test cases and detailed documentation for Unit Test, System, Integration Test and UAT to check the data quality.
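A minimal sketch of the outbound-file FTP step referenced above, written in Python for consistency with the other examples (the actual job used a UNIX shell script invoked from an Informatica command task); the host, credentials, and paths are placeholders:

```python
# Minimal sketch: ship generated outbound XML files to a remote FTP server.
import os
from ftplib import FTP

def ftp_outbound(local_dir: str, remote_dir: str) -> None:
    with FTP("ftp.example.com", user="etl_user", passwd="etl_password") as ftp:
        ftp.cwd(remote_dir)
        for name in os.listdir(local_dir):
            if not name.endswith(".xml"):
                continue                     # only ship the generated outbound XML files
            path = os.path.join(local_dir, name)
            with open(path, "rb") as fh:
                ftp.storbinary(f"STOR {name}", fh)

if __name__ == "__main__":
    ftp_outbound("/data/outbound", "/incoming/vendor")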
Environment: Informatica Power Center 10 (Source Analyzer, Warehouse Designer, Mapping Designer, Transformation Developer, Mapplet Designer, Repository Manager, Workflow Manager), AWS, Oracle 10g, PL/SQL, SQL Server, PuTTY.
Confidential, Chicago, IL
ETL Developer/ETL Tester
Responsibilities:
- Involved in gathering Business Requirements and writing Technical specifications.
- Created technical specification for the development of Informatica extraction, transformation and loading (ETL) mappings to load data into classified tables.
- Installed Informatica tool kit (Power center, IDQ, B2B) version 9.5.1 on Unix servers in all environments.
- Worked on Logical and Physical data models using Erwin data modeler.
- Assigned tasks to junior developers in daily scrums.
- Worked extensively on ETL designs and development for Hub, Satellite, Link, and HLink ETL mappings using Power Center 9.5.1.
- Extracting data from flat file format, applying transformation logic using Data Transformation, creating B2B Data Exchange XML format and invoking B2B command utilities to store it in B2B DX repository.
- Designed and developed slowly changed dimensions (SCD) Type 2 mappings in power center designer for loading data into analytic layer.
- Created IDQ data profile and score cards for the research users to analyze the patient trends.
- Performed Code review to ensure that ETL development was done according to the company’s ETL standard and that ETL best practices were followed.
- Created Workflows with worklets, event wait, decision box, and email and command tasks using Workflow Manager and monitored them in Workflow Monitor.
- Created Pre/Post Session/SQL commands in sessions and mappings on the target instance.
- Created and executed stored procedures for the initial bulk loads.
- Designed Automation process for daily/weekly ETL routine.
- Worked on ETL strategy to store data validation rules, error handling methods to handle both expected and non-expected errors and documented it carefully.
- Responsible for migrating Informatica code across different environments (Dev, Test, QA, PROD).
- Maintained warehouse metadata, naming standards and warehouse standards for future application development.
- Conformed to project standards for unit/UAT testing. Carried out end-to-end testing and supported the UAT effort with immediate requirement document changes/fixes/resolutions for all changes/defects.
- Involved in database size estimation and in allocating the required SAN space across different layers.
- Developed UNIX shell scripts to move source files from source server to informatica server and archive files after the load process is completed.
- Created ETL design documents and flow diagrams.
- Used HP Quality center to document ETL test plans, test cases, test scripts, test procedures, assumptions, and validations based on design specifications for unit testing, system testing, preparing test data and loading for testing, error handling and analysis.
- Maintain the daily ETL schedule and recover the daily failures and generate the daily reports for users.
- Created Test plans and Test cases by understanding the BRD specifications and Mapping documents.
- Validated the data across different layers (source/file to stage, stage to ODS, ODS to target) based on the mapping document (see the validation sketch after this list).
- Validated SRC to TGT data for Bulk and Incremental loads from various front-end cross BU applications.
- Most of the source data was in the form of flat files accessed via the UNIX dev box location, and some source DBs were accessed via a Citrix server.
- Validated the (Inbound - Outbound) stages, Stage to ODS and ODS to Target data and captured counts, validated all the direct and derived columns (per ETL logic) using Oracle.
- All stages were validated for data discrepancies, record counts, completeness, table and column counts, duplicates, unique values, referential integrity, end-to-end business scenarios, initial and incremental load testing, and negative and positive test scenarios.
- Validated data from Stage to Datastore for each cycle date, verified the stage rolling window for each project.
- All defects were logged in HP ALM/QC and triaged in the daily defect tracking meeting.
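A minimal sketch of the layer-to-layer validation described above (row counts and duplicate business keys between a stage table and its ODS counterpart); `conn` is any DB-API connection (e.g. to Oracle), and the table and column names are placeholders:

```python
# Minimal sketch: compare counts and check for duplicate keys between layers.
def validate_layer(conn, src_table: str, tgt_table: str, key_col: str) -> dict:
    cur = conn.cursor()

    cur.execute(f"SELECT COUNT(*) FROM {src_table}")
    src_count = cur.fetchone()[0]
    cur.execute(f"SELECT COUNT(*) FROM {tgt_table}")
    tgt_count = cur.fetchone()[0]

    # duplicate business keys in the target indicate a bad merge/lookup
    cur.execute(
        f"SELECT COUNT(*) FROM (SELECT {key_col} FROM {tgt_table} "
        f"GROUP BY {key_col} HAVING COUNT(*) > 1)"
    )
    duplicate_keys = cur.fetchone()[0]

    return {
        "source_count": src_count,
        "target_count": tgt_count,
        "count_match": src_count == tgt_count,
        "duplicate_keys": duplicate_keys,
    }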
Environment: Informatica Power Center 9.5.1, IDQ 9.5.1, Informatica DVO 9.5.1, Erwin, UNIX, HP Quality Center, HP-ALM, SQL Server, Oracle SQL developer, B2B DT Studio.
Confidential
ETL Engineer
Responsibilities:
- Heavily involved in analyzing 16+ data sources for marketing data.
- Designed/developed/Implemented the sourcing of 16+ data sources with Informatica 9.1.
- Integrated data from EDW stage to marketing data mart with customer 360 using Informatica 9.1.
- Created Informatica power center design patterns to source multiple sources in a single mapping to reduce load time
- Used various transformations like Source Qualifier, Expression, Aggregator, Joiner, Filter, Sequence generator, Lookup, Update Strategy, Normalizer for development and optimizing the Mapping.
- Created Informatica power center mappings to load data into landing tables, stage tables and base objects in MDM and ran batch jobs which load data into base objects.
- Tuned the performance of mappings by following Informatica best practices and also applied several methods to get best performance by decreasing the run time of workflows.
- Worked extensively on Informatica Partitioning when dealing with huge volumes of data and partitioned the tables for optimal performance.
- Prepared the error-handling document to maintain the Error handling process.
- Worked with Session logs and Workflow logs for Error handling and troubleshooting during the job failures.
- Extensively worked on Performance Tuning at the mapping level, session level, source level, and the target level for the mappings which were taking long time to complete.
- Extensively worked on handling production defects and change requests.
- Analyzed source data for potential data quality issues and created exception mappings/cleanup scripts to clean up bad data in the data warehouse.
- Effectively handled the Production Support by performing bulk loads, daily loads and monthly loads.
- Developed reconciliation scripts to validate the data loaded in the tables as part of unit testing.
- Prepared SQL Queries to validate the data in both source and target databases.
- Prepared scripts to email the records that do not satisfy the business rules (error records) to the business users (see the email sketch after this list).
- Created shell scripts on the UNIX platform for running Informatica workflows, updating the Parmfiles, sending emails, and FTPing the files.
- Involved in the Repository administration like Backup and restoration process when required.
- Supported the applications in Production on rotation basis and provided solutions for failed jobs.
- Prepared Game Plans for the Deployment of Informatica Code.
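A minimal sketch of emailing error records to business users, as referenced above; the SMTP host, addresses, and reject-file path are placeholders:

```python
# Minimal sketch: mail the records rejected by business-rule checks.
import smtplib
from email.message import EmailMessage

def mail_rejects(reject_file: str, recipients: list) -> None:
    msg = EmailMessage()
    msg["Subject"] = "Daily load: records failing business rules"
    msg["From"] = "etl-noreply@example.com"
    msg["To"] = ", ".join(recipients)
    msg.set_content("Attached are the records rejected by the business-rule checks.")

    with open(reject_file, "rb") as fh:
        msg.add_attachment(
            fh.read(), maintype="text", subtype="csv", filename="rejects.csv"
        )

    with smtplib.SMTP("smtp.example.com") as server:
        server.send_message(msg)

if __name__ == "__main__":
    mail_rejects("/data/rejects/rejects.csv", ["dw-business-users@example.com"])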
Environment: Informatica Power Center 9.1, Oracle 10g, MS SQL Server, Putty, Toad, IDQ, MS Office.
Confidential
Informatica Developer
Responsibilities:
- Conducted JAD sessions with End User Meetings and Responsible for gathering Requirements
- Responsible for converting Functional Requirements into Technical Specifications
- Involved in requirement analysis, ETL design and development for extracting data from the heterogeneous source systems like MS SQL Server, Oracle, HL7 Files, flat files, XML files and loading into Staging and Enterprise Data Warehouse
- Involved in massive data profiling using IDQ (Analyst tool) prior to data staging.
- Involved in deploying applications in Informatica developer.
- Designed and developed Mappings for loading MDM Hub
- Extensively used Informatica client tools Source Analyzer, Warehouse designer, Mapping Designer, Mapplet Designer, Transformation Developer, Informatica Repository Manager and Informatica Workflow Manager.
- Designed and developed ETL routines using Informatica Power Center; within the mappings, extensively used Lookups, Aggregator, XML, Rank, mapplets, connected and unconnected stored procedures/functions/lookups, SQL overrides in lookups, source filters in Source Qualifiers, and Routers to manage data flow into multiple targets.
- Converted SQL Server SSIS Packages logic into Informatica mappings.
- Created complex mappings with shared objects/Reusable Transformations/Mapplets for staging unstructured HL7 files into Data Vault
- Created workflow design to handle various loads like Daily Loads, Weekly Loads, and Monthly files/loads using Incremental Loading Strategy
- Automated daily load reports to send messages to the concerned personnel in case of process failures and discrepancies.
- Created sequential/concurrent Sessions/ Batches for data loading process and used Pre-& Post Session SQL Script to meet business logic.
- Extensively worked on batch scripts to automate workflows and to populate parameter files (see the parameter-file sketch after this list).
- Created Stored Procedures to transform the Data and worked extensively in T-SQL for various needs of the transformations while loading the data into Data vault
- Used SQL tools like management studio to run SQL queries and validate the data.
- Tuning the Mappings for Optimum Performance, Dependencies and Batch Design.
- Designed and developed the logic for handling slowly changing dimension table loads by flagging records using the Update Strategy transformation to populate the desired records.
- Involved in unit testing and User Acceptance Testing to check whether the data extracted from different source systems was loading into the target according to the user requirements.
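A minimal sketch of populating an Informatica parameter file from a batch script, as referenced above, written in Python rather than the original batch/shell script; the folder, workflow, session, and parameter names are placeholders:

```python
# Minimal sketch: write run-specific values into an Informatica parameter file
# before the workflow starts.
from datetime import date

def write_parmfile(path: str, load_date: date) -> None:
    lines = [
        "[DW_FOLDER.WF:wf_daily_load.ST:s_m_load_claims]",   # one section per session (placeholder names)
        f"$$LOAD_DATE={load_date:%Y-%m-%d}",
        "$$SRC_SCHEMA=STG",
        "",
    ]
    with open(path, "w") as fh:
        fh.write("\n".join(lines))

if __name__ == "__main__":
    write_parmfile("/infa/parmfiles/wf_daily_load.parm", date.today())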
Environment: Informatica Power Center 8.6, Embarcadero, SQL SERVER 2008, T-SQL, Batch Scripts, Rational Suite, SVN.
Confidential
DW Developer
Responsibilities:
- Interacted with business users for requirement analysis and to define business and functional specifications.
- Documented user requirements, translated requirements into system solutions, and developed an implementation plan and schedule.
- Involved in warehouse architecture design to populate data from various sources to Teradata health care logical model.
- Extracted data from DB2, VSAM files, XML files, and flat files and populated the data into the EDW
- Experience in writing and optimizing SQL code and stored procedures in Teradata v6r12 and Sybase 8.5.
- Developed Complex transformations, Mapplets using Informatica Power Center 8.6.1/9.1.0 to Extract, Transform and load data into DataMart’s, Enterprise Data warehouse (EDW) and Operational data store (ODS).
- Excellent command of software development life cycle activities including analysis, design, development, unit and system testing and production deployment.
- Developed data Mappings, Transformations between source systems and warehouse
- Performed Type1 and Type2 slowly changing dimension mappings
- Managed the Metadata associated with the ETL processes used to populate the data warehouse.
- Implemented Aggregate, Filter, Join, Expression, Lookup and Update Strategy transformations.
- Used debugger to test the mapping and fixed the bugs.
- Implemented change data capture (CDC) using informatica power exchange to load data from clarity DB to Teradata warehouse.
- Developed multiples data marts using Teradata for meeting the reporting requirements.
- Developed processes for Teradata using shell scripting and RDBMS utilities such as MLoad and FastLoad (Teradata).
- Created sessions, sequential and concurrent batches for proper execution of mappings using server manager.
- Migrated development mappings as well as hot fixes into the production environment.
- Tuned performance of Informatica session for large data files by increasing block size, data cache size, sequence buffer length and target based commit interval.
- Organized data in the report by inserting filters, sorting, ranking, and highlighting data.
- Included data from various sources like Oracle Stored Procedures and Personal data files in the same report.
- Executed sessions, both sequential and concurrent for efficient execution of mappings and used other tasks like event wait, event raise, email, command and pre/post SQL.
- Involved in analyzing the year end process and managed the loads that have yearend aggregations.
- Used the command line program pmcmd to run Informatica jobs from the command line, and used these commands in shell scripts to create, schedule, and control workflows, tasks, and sessions.
- Responsible for daily verification that all scripts, downloads, and file copies were executed as planned, troubleshooting any steps that failed, and providing both immediate and long-term problem resolution.
- Provided detailed technical, process and support documentation like daily process rollback and detailed specifications and very detailed document of all the projects with the workflows and their dependencies.
- Involved in writing shell scripts for file transfers and file renaming and several other database scripts to be executed from UNIX (see the file-handling sketch after this list).
- Created and Documented ETL Test Plans, Test Cases, Test Scripts, Expected Results, Assumptions and Validations.
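A minimal sketch of the file renaming/archiving chores referenced above, in Python for consistency with the other examples; the directory names and the .dat extension are placeholders:

```python
# Minimal sketch: rename processed files with a timestamp and move them to archive.
import os
import shutil
from datetime import datetime

def archive_processed(inbound_dir: str, archive_dir: str) -> None:
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    os.makedirs(archive_dir, exist_ok=True)
    for name in os.listdir(inbound_dir):
        if not name.endswith(".dat"):
            continue
        src = os.path.join(inbound_dir, name)
        dest = os.path.join(archive_dir, f"{os.path.splitext(name)[0]}_{stamp}.dat")
        shutil.move(src, dest)

if __name__ == "__main__":
    archive_processed("/data/inbound", "/data/archive")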
Environment: Informatica Power Center 8.6, Sybase, SQL, PL/SQL, TOAD, SAP, Teradata v6r12, Erwin, Shell Scripts, Rational Suite, Business Objects XI R2