ETL Big Data Developer Resume
Canton, MD
EXPERIENCE SUMMARY:
- 7+ years of extensive experience with the ETL tools Informatica Developer Client (BDM), Power Center, and Power Exchange, designing and developing Mappings, Mapplets, Transformations, Workflows, and Worklets, configuring the Informatica Server, and scheduling Workflows and sessions.
- Strong database experience with Oracle 12/11g/10g and MS SQL Server; strong experience with database interfaces such as SQL*Plus and TOAD.
- Strong experience in ETL design, data conversion, Data Warehouse and Data Mart design, Star Schema and Snowflake Schema methodologies, unit testing, and documentation.
- Extensive experience in creating jobs using Big Data components such as Hadoop, Hive, HDFS, and Spark.
- Proficient in understanding business requirements & translating them into technical design.
- Implemented data warehousing methodologies for Extraction, Transformation and Loading using Informatica Designer, Repository Manager, Workflow Manager, Workflow Monitor.
- Extensive experience in designing and developing complex mappings using various transformation logic like Unconnected / Connected lookups, Source Qualifier, Update strategy, Router, Filter, Expression, Normalizer, Aggregator, Joiner, User Defined Functions.
- Worked with the Informatica Data Quality (IDQ) toolkit for analysis, data cleansing, data matching, data conversion, exception handling, reporting, and monitoring.
- Extensive ETL experience using different ETL tools, including Data Transformation Services (DTS), along with OLTP, OLAP, Erwin, Oracle, and DB2.
- Strong experience in SQL, Tables, Database, Materialized views, Synonyms, Sequences, Stored Procedures, Functions, Packages, Triggers, Joins, Cursors and indexes in Oracle.
- Experience with Slowly Changing Dimensions Type 1, Type 2, and Type 3 for inserting and updating target tables while maintaining history (see the sketch after this list).
- Experience using the Debugger to validate mappings and gather troubleshooting information about data and error conditions.
- Hands-on experience with Apache Hadoop ecosystem components such as Hadoop MapReduce, Kafka, HDFS, ZooKeeper, Oozie, Hive, Sqoop, Pig, and Flume.
- Extensively used Hadoop, Spark streaming libraries, and Apache Kafka for analyzing real-time datasets.
- Hands-on experience with tools like Address Doctor, which is used for address validation.
- Experience with HIPAA 5010 EDI X12 transaction sets such as 270/271 (health care benefits inquiry/response), 276/277 (claim status), 834 (benefit enrollment), and 837 (health care claim).
- Experience in working with UNIX Shell Scripts for automatically running sessions, aborting sessions and creating parameter files.
- Performed various testing activities such as unit testing, integration testing, system testing, regression testing, and user acceptance testing, and ensured code quality.
- Highly motivated team player with excellent analytical and problem-solving skills possessing the ability to effectively communicate with higher management and stakeholders.
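For illustration, a minimal Python sketch of the Slowly Changing Dimension Type 2 insert/update logic referenced above; in practice this was built with Informatica Lookup and Update Strategy transformations, and the table/column names here (customer_id, address, segment, effective_start/effective_end) are hypothetical.

```python
from datetime import date

HIGH_DATE = date(9999, 12, 31)  # open-ended effective end date for the current row

def apply_scd2(dim_rows, incoming, key="customer_id", tracked=("address", "segment")):
    """Expire the current dimension row and insert a new version when a
    tracked attribute changes (hypothetical column names, for illustration only)."""
    current = next((r for r in dim_rows
                    if r[key] == incoming[key] and r["effective_end"] == HIGH_DATE), None)
    if current is None:
        # brand-new key: insert the first version
        dim_rows.append({**incoming, "version": 1,
                         "effective_start": date.today(), "effective_end": HIGH_DATE})
    elif any(current[c] != incoming[c] for c in tracked):
        # change detected: close out the old row, add a new current row
        current["effective_end"] = date.today()
        dim_rows.append({**incoming, "version": current["version"] + 1,
                         "effective_start": date.today(), "effective_end": HIGH_DATE})
    return dim_rows

# example usage: the second call expires version 1 and adds version 2
dim = []
apply_scd2(dim, {"customer_id": 1, "address": "Canton, MD", "segment": "Retail"})
apply_scd2(dim, {"customer_id": 1, "address": "Omaha, NE", "segment": "Retail"})
print(len(dim))  # 2 rows: the expired version and the new current version
```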
TECHNICAL SKILLS:
Data Warehousing / ETL Tools: Informatica 10.x/9.x/8.x (Repository Admin Console, Repository Manager, Designer, Workflow Manager, Workflow Monitor, Power Exchange), Informatica IDQ, Informatica MDM, Informatica Big Data Edition (BDE/BDM) (Hive, Impala, Flume, Spark, HDFS), Oracle Warehouse Builder.
Data Modeling: Star Schema Modeling, Snowflake Modeling, MS Visio 2010, Erwin
Databases: Oracle 12/11g/10g, DB2, MS SQL Server 2014/2012, Netezza, MongoDB, Hive
Languages: SQL, Python, PL/SQL, HTML, XML, C, Unix Shell Script
Operating Systems: Windows 10/8/7/XP, UNIX & Linux
Tools: Toad, SQL*Plus, PuTTY, Tidal, Autosys, SoapUI, MicroStrategy
WORK EXPERIENCE:
Confidential - Canton, MD
ETL BIG DATA DEVELOPER
Responsibilities:
- Analyzed the BRDs, worked with the business to understand all functional requirements, and mapped the requirements in Facets.
- Worked on the Informatica Big Data tool to move data from Oracle, flat files, and JSON files to Hive (Cloudera distribution).
- Defined trust and validation rules for the base tables and created PL/SQL procedures to load data from source tables to staging tables.
- Involved in Facets 5.30/4.81/4.41, with different modules such as Subscriber/Member, Open Enrollment, Claims Processing, Network, Billing, and Providers applications.
- Involved in validations for electronic claims (EMC, UB04, and CMS 1500) for the Hospital Pended Claims Report and the Accepted Claims by Tax ID Report.
- Involved in validations for EDI 837 Institutional and Professional claims and EDI 835 Payment Advice.
- Created HIPAA-based EDI X12 (834/837) files for member enrollment and claims processing testing, including 834/837 Professional/Institutional/Dental.
- Created Parquet files on the Hadoop server by writing Python scripts (see the sketch after this list).
- Created the mappings using the Informatica Big Data Edition tool; all jobs are configured to run on the Hive, Blaze, or Spark engines.
- Created all JDBC, ODBC, and Hive connections in the Informatica Developer Client tool to import the Parquet files and the relational tables.
- Responsible for building scalable distributed data solutions using Hadoop and migrating legacy retail application ETL to Hadoop.
- Tested Claims processing flow by dropping claims through the HIPAA gateway.
- Translated, loaded, and presented disparate data sets in various formats and from sources such as JSON, text files, and Kafka queues.
- Processed enrollment and maintenance EDI data from employer groups to various health insurance payers.
- Created test data in Facets PPMO 5 based on the business requirements.
- Integrated Apache Storm with Kafka to perform web analytics and to move clickstream data from Kafka to HDFS.
- Migrated legacy SQL Agent jobs to Control-M jobs using Control-M V7 and Control-M Desktop.
- Created complex Control-M jobs based on file-watcher triggers and dependencies on the previous day's completion of all SMART tables.
- Worked closely with the client on planning the migration of the current RDBMS to Hadoop.
- Worked with the MicroStrategy Mobile team to convert the dashboard to the iPhone version.
- Configured Spark Streaming to receive real-time data from Kafka and store the stream data in HDFS using Scala.
- Migrated Informatica jobs to Hadoop Sqoop jobs and loaded the data into the Oracle database.
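A minimal sketch of the kind of Python script used to create the Parquet files mentioned above, assuming hypothetical source and HDFS paths; it converts a flat-file extract to Parquet with pyarrow and copies it to HDFS for Hive/Impala to read.

```python
import subprocess

import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq

# Source extract and HDFS target are placeholder paths
df = pd.read_csv("/staging/claims_extract.csv")
table = pa.Table.from_pandas(df)                          # convert the DataFrame to an Arrow table
pq.write_table(table, "/staging/claims_extract.parquet", compression="snappy")

# Push the file to HDFS so Hive/Impala external tables can read it
subprocess.run(["hdfs", "dfs", "-put", "-f",
                "/staging/claims_extract.parquet", "/data/claims/"], check=True)
```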
Environment: Informatica 10.1/9.6, Informatica Developer Client (BDM), Informatica Monitoring, Tomcat, Hive, Impala, Flume, Spark, Python, Parquet Files, Hue, SQL Server 2014, PL/SQL, SQL Assistant, Control-M, Oracle 12/11g, MS Office, Business Objects XIR4, Windows and Unix, Shell Scripting.
ETL DEVELOPER/ ANALYST
Confidential - Omaha, NE
Responsibilities:
- Developed ETL components as well as Oracle procedures, functions, and triggers.
- Defined trust and validation rules for the base tables and created PL/SQL procedures to load data from source tables to staging tables (see the sketch after this list).
- Created, executed, and managed ETL processes using Oracle Data Integrator (ODI) and customized ODI Knowledge Modules, such as Loading Knowledge Modules and Integration Knowledge Modules.
- Involved in implementing the Land Process of loading the customer Data Set into Informatica MDM from various source systems.
- Involved in installing and configuring the Informatica MDM Hub Console, Hub Store, Cleanse and Match Server, Address Doctor, and Informatica Power Center applications.
- Used IDQ transformations such as Labeler, Standardizer, Address Doctor (address validation), Match, and Exception for standardizing, profiling, and scoring the data.
- Performed match/merge, ran match rules to check the effectiveness of MDM on data, and fine-tuned match rules.
- Performed land process to load data into landing tables of MDM Hub using external batch processing for initial data load in hub store.
- Worked with IDD, the data governance application for the Informatica MDM Hub that enables business users to effectively create, manage, consume, and monitor master data.
- Worked on IDQ parsing, IDQ Standardization, matching, IDQ web services.
- Responsible for using Data Integration Hub (DIH) to create topics and applications that publish and subscribe data.
- Scheduled the workflows using Tidal per business needs and passed parameters to the workflows directly from Tidal to run the mappings.
- Scheduled the batch jobs in Autosys to automate the process.
- Designed and developed SSIS packages using various Control Flow and Data Flow items to transform and load data from various databases.
- Used Tidal to schedule and run Informatica jobs.
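The source-to-staging loads above were implemented as PL/SQL procedures; the following is only an illustrative Python (cx_Oracle) sketch of the same truncate-and-reload pattern, with hypothetical connection details and table/column names.

```python
import cx_Oracle

conn = cx_Oracle.connect("etl_user", "etl_pwd", "dbhost/ORCLPDB1")  # placeholder credentials/DSN
cur = conn.cursor()

cur.execute("TRUNCATE TABLE stg_member")  # full refresh of the staging table
cur.execute("SELECT member_id, first_name, last_name, plan_cd FROM src_member")
rows = cur.fetchall()

cur.executemany(
    "INSERT INTO stg_member (member_id, first_name, last_name, plan_cd) "
    "VALUES (:1, :2, :3, :4)",
    rows,
)
conn.commit()
cur.close()
conn.close()
```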
Environment: Informatica 10/9.6, IDQ 9.6, Informatica MDM 9.6, SQL 2014, PL/SQL, SQL Assistant, MongoDB, Netezza, Oracle 11g/10g, Agile, MS Office, Business Objects XIR4, Java, Windows and Unix, Shell Scripting.
ETL/Informatica Developer
Confidential - Owings Mills, MD
Responsibilities:
- Used Informatica Developer Client and Designer to create complex mappings using different transformations like Filter, Router, Connected & Unconnected lookups, Stored Procedure, Joiner, Update Strategy, Expressions, and Aggregator transformations to pipeline data to Data Warehouse.
- Interacted with business community and gathered requirements based on changing needs. Incorporated identified factors into Informatica mappings to build Data Warehouses.
- Strong HIPAA EDI 4010 and 5010 (with ICD-9 and ICD-10) analysis and compliance experience from the healthcare payer, provider, and exchange perspective, with a primary focus on coordination of benefits.
- Imported the Power Center mappings using the Informatica Big Data Edition tool.
- Developed a standard ETL framework to enable the reusability of similar logic across the board. Involved in System Documentation of Dataflow and methodology.
- Developed mappings to extract data from SQL Server, Oracle, Flat files, DB2, Mainframes and load into Data warehouse using the Power Center, Power exchange.
- Created SSIS packages to migrate slowly changing dimensions. Expert knowledge of SQL queries, triggers, PL/SQL procedures, and packages used to apply and maintain business rules.
- Conducted database testing to check constraints, field sizes, indexes, and stored procedures; created testing metrics using MS Excel.
- Actively involved in production support. Implemented fixes/solutions to issues/tickets raised by user community.
- Used IDQ to profile the project source data, define or confirm the definition of the metadata, cleanse and accuracy check the project data, check for duplicate or redundant records, and provide information on how to proceed with ETL processes.
- Designed and developed UNIX shell scripts to schedule jobs; also wrote pre-session and post-session shell scripts and PL/SQL scripts for dropping and rebuilding indexes.
- Performed Configuration Management to Migrate Informatica mappings/sessions/workflows from Development to Test to production environment.
- Wrote UNIX shell scripts to automate renaming flat files with a timestamp extension, compressing and archiving files, other file manipulations, automatic email notifications to the data owners (success/failure), and FTP to the desired locations (see the sketch after this list).
- Collaborated with the Informatica Admin on the Informatica upgrade from Power Center 9.1 to Power Center 10.1.
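The file-handling automation above was written as UNIX shell scripts; below is a hedged Python sketch of equivalent steps (timestamp rename, gzip archive, success/failure email), with placeholder paths and addresses.

```python
import glob
import gzip
import shutil
import smtplib
from datetime import datetime
from email.message import EmailMessage
from pathlib import Path

ARCHIVE_DIR = Path("/data/archive")  # placeholder archive location

def archive_flat_files(pattern="/data/outbound/*.dat"):
    ARCHIVE_DIR.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d%H%M%S")
    for src in glob.glob(pattern):
        renamed = Path(f"{src}.{stamp}")
        Path(src).rename(renamed)                          # add the timestamp extension
        with open(renamed, "rb") as f_in, \
             gzip.open(ARCHIVE_DIR / (renamed.name + ".gz"), "wb") as f_out:
            shutil.copyfileobj(f_in, f_out)                # compress into the archive directory
        renamed.unlink()

def notify(status):
    msg = EmailMessage()
    msg["Subject"] = f"Flat file archive {status}"
    msg["From"] = "etl@example.com"                        # placeholder addresses
    msg["To"] = "data-owners@example.com"
    msg.set_content(f"Archive job finished with status: {status}")
    with smtplib.SMTP("localhost") as smtp:
        smtp.send_message(msg)

try:
    archive_flat_files()
    notify("SUCCESS")
except Exception:
    notify("FAILURE")
    raise
```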
Environment: Informatica Power Center 9.6/9.1/8.6, Informatica IDQ/BDM Developer Client (10.1), SQL Server 2012/2010, DB2, Oracle 10g/11, SQL, PL/SQL, T-SQL, TOAD 9.0, Mainframes, Autosys, UNIX.
ETL/Informatica Developer
Confidential, Arlington Heights, IL
Responsibilities:
- Interacted with End Users for gathering requirements.
- Developed standard ETL framework for the ease of reusing similar logic across the board.
- Involved in data profiling, cleansing, parsing, standardization, address validation, and match/merge of the data through the Informatica Developer and Analyst tools.
- Used Repository manager to create Repository, User groups, Users and managed users by setting up their privileges and profile.
- Participated in Designing and documenting validation rules, error handling of ETL process.
- Worked extensively with Mapping Parameters, Mapping Variables, and parameter files for incremental loading (see the sketch after this list).
- Integrated IDQ mappings and mapplets into Power Center and executed them.
- Created new rules based on business requirements and implemented them through IDQ.
- Developed mappings to extract data from different sources, such as DB2 and XML files, and load it into the Data Mart.
- Created complex mappings by using different transformations like Filter, Router, Connected and Unconnected lookups, Stored Procedure, Joiner, Update Strategy, Expressions and Aggregator transformations to pipeline data to Data Mart.
- Made use of Informatica Workflow Manager extensively to create, schedule, execute and monitor sessions, Worklets and workflows.
- Developed Slowly Changing Dimensions for Type 1 and Type 2 (flag, version, and date).
- Involved in designing Logical/Physical Data Models and reverse engineering for the entire subject area across the schemas using Erwin/TOAD.
- Scheduled and automated ETL processes with the Autosys scheduling tool.
- Scheduled the workflows using shell scripts and created Informatica development standards.
- Troubleshot databases, workflows, mappings, sources, and targets to find bottlenecks and improve performance.
- Migrated Informatica mappings/sessions/workflows from Development to Test and to Production environments.
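A hedged sketch of generating a PowerCenter parameter file for the incremental loads mentioned above; the folder, workflow, session, and parameter names are hypothetical, and the [folder.WF:...ST:...] section layout follows common parameter-file conventions.

```python
from datetime import date, timedelta

def write_param_file(path, last_extract_date):
    # Section header identifies folder, workflow, and session; $$ parameters feed the mapping
    lines = [
        "[EDW_FOLDER.WF:wf_stg_claims.ST:s_m_stg_claims]",
        f"$$LAST_EXTRACT_DATE={last_extract_date:%m/%d/%Y}",
        "$DBConnection_Source=ORA_SRC",
        "$DBConnection_Target=ORA_EDW",
    ]
    with open(path, "w") as fh:
        fh.write("\n".join(lines) + "\n")

# stamp the parameter with the previous day's date before the nightly run
write_param_file("wf_stg_claims.param", date.today() - timedelta(days=1))
```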
Environment: Informatica Power Center 9.X, IDQ, Oracle 10g/11g, Putty, SQL Server, PL/SQL, TOAD, WinSCP, T-SQL, UNIX Shell Script, Tidal scheduler.
ETL/Informatica Developer
Confidential
Responsibilities:
- Involved in requirements gathering, functional/technical specification, Designing and development of end-to-end ETL process for Data Warehouse.
- Studied the existing OLTP system(s) and created facts, dimensions, and a star schema representation for the data mart.
- Used Informatica Power Center for extraction, transformation, and loading (ETL) of data from heterogeneous source systems.
- Imported Source/Target Tables from the respective databases and created reusable transformations (Joiner, Routers, Lookups, Rank, Filter, Expression, and Aggregator) in a Mapplet and created new mappings using Designer module of Informatica.
- Created Tasks, Workflows, Sessions to move the data at specific intervals on demand using Workflow Manager and Workflow Monitor.
- Extensively worked on the performance tuning of the Mappings as well as the sessions.
- Involved in writing UNIX shell scripts for the Informatica ETL tool to run the sessions (see the sketch after this list).
- Worked with database connections, SQL joins, cardinalities, loops, aliases, views, aggregate conditions, and parsing of objects and hierarchies.
- Wrote SQL, PL/SQL, and stored procedures for implementing business rules and transformations.
- Coordinated with DBA for tuning sources, targets and calculating table space and growth of DB.
- Migrated code from DEV to QA and then to PROD repositories as part of our monthly releases.
- Prepared technical documentation and SOPs for the existing system.
- Created test cases and completed unit, integration and system tests for Data Warehouse.
- Used Autosys to schedule jobs.
- Assist with migrating jobs into production.
- Responsible for problem identification and resolution of failed processes or jobs and assisted customers in troubleshooting issues with their applications in Autosys.
- Proactive in finding issues and seeking and implementing solutions.
- Performed daily recurring production support tasks: monitored the ticket queue, assigned and resolved tickets, and updated root cause and remediation.
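The session-run scripts mentioned above were UNIX shell scripts; a Python sketch of the same idea, wrapping the pmcmd command line, is shown below. The service, domain, credentials, folder, and workflow names are placeholders, and the flag usage should be verified against the installed PowerCenter version.

```python
import subprocess
import sys

def start_workflow(folder, workflow):
    cmd = [
        "pmcmd", "startworkflow",
        "-sv", "INT_SVC_DEV",        # integration service (placeholder)
        "-d", "Domain_Dev",          # Informatica domain (placeholder)
        "-u", "etl_user", "-p", "etl_pwd",
        "-f", folder,
        "-wait",                     # block until the workflow finishes
        workflow,
    ]
    result = subprocess.run(cmd)
    return result.returncode

if __name__ == "__main__":
    rc = start_workflow("EDW_FOLDER", "wf_load_dw")
    sys.exit(rc)                     # non-zero exit signals failure to the scheduler
```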
Environment: Informatica Power Center 8.6/8.1, Oracle 10g, SQL Server, flat files, DB2, Autosys, Erwin R7, SQL, PL/SQL, SQL*PLUS, Shell Scripting
ETL Developer
Confidential
Responsibilities:
- Created complex mappings using various Transformations such as the Source qualifier, Aggregator, Expression, lookup, Filter, Router, Rank, Sequence Generator, Update Strategy, Transaction control, Sorter etc. as per the business requirements
- Extracted data from heterogeneous sources like Flat files, CSV Files and Oracle database and applied business logic to load them in to the Data warehouse
- Extensively used Informatica client tools - Source Analyzer, Warehouse designer, Mapping designer, Mapplet Designer, Transformation Developer, Informatica Repository Manager.
- Designed workflows with many sessions, tasks (Decision, Assignment task, Event wait) and used Informatica scheduler to schedule jobs.
- Created workflows and worklets for designed mappings.
- Used BIDS to create parameterized reports and created several Excel reports using cube data.
- Involved in data cleansing and performance tuning of ETL mappings and sessions (using partitioning, pushdown optimization, etc.).
- Effectively used Informatica parameter files for defining mapping variables, workflow variables, FTP connections and relational connections.
- Generated Email Notifications and status reports using Email Tasks in Workflow manager.
- Performed Unit testing at various levels of the ETL and actively involved in team code reviews. Identified problems in existing production and developed one-time scripts.
- Performed Data integration techniques for data warehouse using Informatica power center.
- Created the SSIS packages to load the Staging environment. Created SSIS packages to extract/write data to Flat Files.
- Implemented incremental loads and used Event Handlers to clean the data from different data sources.
- Wrote command-line scripts and embedded them in SSIS packages to merge/zip files (see the sketch below).
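A minimal Python sketch of the merge/zip step described above, the sort of command-line utility an SSIS Execute Process task would call; the file patterns and names are placeholders.

```python
import glob
import zipfile
from pathlib import Path

def merge_and_zip(pattern="extract_*.csv", merged_name="extract_merged.csv",
                  zip_name="extract_merged.zip"):
    parts = sorted(glob.glob(pattern))
    with open(merged_name, "w") as out:
        for i, part in enumerate(parts):
            lines = Path(part).read_text().splitlines(keepends=True)
            out.writelines(lines if i == 0 else lines[1:])   # keep the header from the first file only
    with zipfile.ZipFile(zip_name, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(merged_name)                                # zip the merged file
    return zip_name

if __name__ == "__main__":
    print(merge_and_zip())
```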
Environment: Informatica, SSIS, Oracle, Toad, SQL Server, UNIX, SQL, PL/SQL, Shell Scripting