Senior Solution Architect Resume
Charlotte, NC
SUMMARY:
- 20 years of experience in data warehousing, data integration, data migration from on-premises to cloud, data ingestion, cloud data replication, data governance, data lakes, and analytics, covering on-premises and cloud integration efforts from planning through execution, implementation, and support.
- Working experience with Snowflake, SnowSQL, Snowflake streams, and Snowpipe, including streaming ingestion and data replication from on-prem or cloud sources into Snowflake.
- Working experience across the phases of a migration framework, including architecture discovery, engineering discovery, source discovery, tool and service exploration, gap analysis, re-architecture, re-platforming, and lift-and-shift.
- Proven exposure across project kickoff, resource identification, planning, task/project scope prioritization, the discovery phase, and service and cost comparison between different cloud offerings.
- Experience with the AWS data analytics stack, encompassing services such as AWS DMS, AWS SCT, VPC, IAM, RDS, S3, Lambda, AWS Glue, AWS Managed Airflow (MWAA), AWS Athena, AWS Kinesis Data Streams, AWS Kinesis Data Firehose, AWS DataSync, AWS Lake Formation, and AWS Secrets Manager; knowledge of Kafka and Snowflake/Redshift.
- Experience building event-driven, schedule-driven, and hybrid architectures that lead to stable data platforms.
- Expertise in creating functional process, landscape, data flow, and technical architecture diagrams.
- Working experience designing data ingestion pipelines from on-prem to cloud, and batch and stream processing pipelines/workloads with Spark, Scala, Python, and shell scripting, integrated with AWS services (EMR, Glue, Data Pipeline, Step Functions, etc.), Informatica, SSIS, and other open-source and ETL tools.
- Worked extensively on design/modeling and development of data warehouses, ETL architecture/flows, and ETL mappings using Informatica PowerCenter 10.x Designer (Source Analyzer, Warehouse Designer, Mapping Designer, Mapplet Designer, Transformation Developer), Workflow Manager, and Workflow Monitor.
- Worked with different frameworks and methodologies: SAFe (Scaled Agile), Kanban, Iterative, and Waterfall.
- Proficient in all phases of requirements management, including requirement gathering, gap analysis, and impact analysis.
- Proven experience leading and mentoring implementations (end-to-end project execution), applying technical and functional skills and IT industry best practices.
- Worked in various domains: insurance, banking, utilities, manufacturing, and airlines.
TECHNICAL SKILLS
Data Warehouse: Data modeling and architecture, data integration and ETL solution design, dimensional model implementation, Inmon's top-down and Kimball's bottom-up data processing approaches
Cloud Data Warehouse: Snowflake/Redshift
Data Integration: Multiple homogeneous and heterogeneous sources
Cloud Services: AWS S3, IAM, AWS Glue, AWS Managed Airflow (MWAA), Athena, Lambda, SQS, SNS, AWS Secrets Manager
ETL: Informatica PowerCenter Developer/PowerExchange/MDM, Oracle Data Integrator (ODI), Teradata utilities (Teradata Parallel Transporter (TPT), FastLoad, MultiLoad, BTEQ, FastExport), SSIS 2008 R2
Relational Databases: Oracle 12.x, DB2, SQL Server 2008 R2
Languages: Python/Spark, Unix shell, AWK, sed, Perl, SQL*Loader/SQL Execute
Others: TOAD, SQL*Loader, MicroStrategy, T-SQL, OBIEE, Erwin, Visio, CA7/AutoSys schedulers, HP ALM Quality Center, MS Office 2013/2007/2003, NDM file transfer, Harvest, GitHub, uDeploy, Jenkins, SnowSQL, Jira
PROFESSIONAL EXPERIENCE:
Confidential
Senior Solution Architect
Responsibilities:
- Leading the effort to modernize and enhance the data ingestion process post-merger
- Discovery across different components:
- Architecture: Analysis of current and target state architecture
- Engineering: extraction, onboarding, and consumption patterns
- Data Source: Identification and understanding of various data sources
- Identify work environment requirements (dev/UAT/prod, access, etc.)
- Tool / Service Discovery
- Identify current gaps in data management, monitoring, and governance
- Identification and classification of PI (personal information) data
- Architect the solution to convert schedule-driven pipelines into event-driven data pipelines and ingest data from various sources (Google Analytics/BigQuery, Salesforce, Kafka streams)
- Designing the data replication strategy from the legacy raw bucket to the new S3 bucket.
- Orchestrating tasks for data extraction, standardization, transformation, and CSV-to-Parquet file conversion using Airflow (see the first sketch after this list).
- Creating, managing, and governing the data catalog (exploring in-house solutions and third-party service options)
- Creating a metadata-driven cloud ingestion framework, with the catalog on an RDS Postgres database to govern, version, and trace/audit pipeline activity (see the catalog sketch after this list)
- Hands-on with Python, Snowflake, Airflow, PostgreSQL, Lambda, and AWS
- Architect and design data pipelines catering to batch (scheduled) as well as streaming (event-driven) workloads
- Worked on various POCs as needed.
- Presented recommendations to the client covering gaps in the existing process, the client's objectives for the effort, tool discovery and comparison, cost analysis and comparison, storage topology, functional flow, technical architecture, the data pipeline, and the metadata model for maintaining schema version logs and custom log statistics.
- Support and guide the development team as needed.
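As referenced above, a minimal sketch of the Airflow orchestration (extract, standardize, convert CSV to Parquet). The DAG id, paths, and task callables are hypothetical placeholders rather than the client's actual pipeline; it assumes Airflow 2.4+ and pyarrow.

```python
# Illustrative only: placeholder DAG id, paths, and task logic (assumes Airflow 2.4+, pyarrow).
from datetime import datetime

import pyarrow.csv as pv
import pyarrow.parquet as pq
from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(**context):
    # Placeholder: pull the raw extract for this run into local/stage storage.
    pass


def standardize(**context):
    # Placeholder: apply column renames and type coercion before conversion.
    pass


def csv_to_parquet(src_path: str, dst_path: str, **context):
    # Convert a standardized CSV file to Parquet with pyarrow.
    table = pv.read_csv(src_path)
    pq.write_table(table, dst_path)


with DAG(
    dag_id="raw_ingestion_csv_to_parquet",  # hypothetical name
    start_date=datetime(2023, 1, 1),
    schedule=None,  # triggered by an upstream event (e.g. an S3 notification) instead of a cron schedule
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_standardize = PythonOperator(task_id="standardize", python_callable=standardize)
    t_convert = PythonOperator(
        task_id="csv_to_parquet",
        python_callable=csv_to_parquet,
        op_kwargs={"src_path": "/tmp/stage/data.csv", "dst_path": "/tmp/stage/data.parquet"},
    )
    t_extract >> t_standardize >> t_convert
```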
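Also referenced above, a minimal sketch of the metadata/audit catalog on RDS Postgres that governs, versions, and traces pipeline activity. Table and column names and the connection string are illustrative assumptions, not the actual model; it uses psycopg2.

```python
# Illustrative only: placeholder tables, columns, and connection string.
import psycopg2

DDL = """
CREATE TABLE IF NOT EXISTS ingest_catalog (
    source_name    TEXT    NOT NULL,
    schema_version INTEGER NOT NULL,
    target_table   TEXT    NOT NULL,
    file_format    TEXT    NOT NULL,
    is_active      BOOLEAN DEFAULT TRUE,
    PRIMARY KEY (source_name, schema_version)
);
CREATE TABLE IF NOT EXISTS ingest_audit (
    run_id      BIGSERIAL PRIMARY KEY,
    source_name TEXT NOT NULL,
    status      TEXT NOT NULL,
    rows_loaded BIGINT,
    run_ts      TIMESTAMPTZ DEFAULT now()
);
"""


def record_run(conn, source_name: str, status: str, rows_loaded: int) -> None:
    # One audit row per pipeline run, so loads can be traced and replayed.
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO ingest_audit (source_name, status, rows_loaded) VALUES (%s, %s, %s)",
            (source_name, status, rows_loaded),
        )
    conn.commit()


if __name__ == "__main__":
    conn = psycopg2.connect("dbname=catalog user=ingest")  # placeholder connection string
    with conn.cursor() as cur:
        cur.execute(DDL)
    conn.commit()
    record_run(conn, "salesforce_accounts", "SUCCESS", 12345)
```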
Confidential, Charlotte NC
Integration/Migration Architect
Responsibilities:
- Responsible for the data migration discovery phase, planning, and execution to the AWS cloud, with infrastructure spanning DMS, SCT, S3, Snowflake, AWS Glue, Lambda, EC2, IAM, VPC, monitoring, logging, and cost management.
- On-prem Oracle to Snowflake:
- Pilot implementation during the discovery phase
- Analysis and refactoring based on the outcomes of the pilot phase
- Mapping data types for the physical schema implementation on Snowflake
- Building the schema on Snowflake and making it ingestion-ready
- Data extraction from the on-prem Oracle warehouse
- Data ingestion to Snowflake using a Python framework for parallel loads (see the parallel-load sketch after this list)
- Re-architecting, refactoring, and redesigning the existing Informatica ETL processes for data replication into Snowflake
- Data replication: processing deltas via a local bucket
- Setting up continuous data ingestion from the AWS S3 bucket to Snowflake using SQS, AWS Lambda, Snowpipe, streams, and tasks to land data in raw tables and merge it into target tables on Snowflake (see the Snowpipe sketch after this list)
- Lake setup on S3 (global bucket)
- Involved in designing the future phase that prepares the warehouse for real-time data sources for BI and ML operations using Kafka/Kinesis.
- Used SCT to assess and analyze RDS compatibility.
- Used DMS to migrate the on-prem source Oracle database to RDS.
- Set up replication to keep the on-prem database in sync.
- Involved in all phases of the data migration to RDS, from planning to implementation.
- Worked with the team to build perspective on the different migration approaches, from lift-and-shift (re-host) to re-architect.
- Worked with business owners/analysts to understand requirements, designing the ETL/data lake architecture and data model; laid out the strategy for all phases, from requirement gathering to deployment and post-production support.
- Identified the business scenarios for scaling up in Snowflake for the best cluster performance.
- Involved in data migration from on-premises to the Snowflake cloud data warehouse.
- Worked on file formats and external stages in the Snowflake database to pull data from S3 buckets; also enabled SQS to automate the Snowpipe process so files are loaded into Snowflake stages as soon as they land in the S3 bucket.
- Defined the strategy to load data from different sources (CSV, JSON, Oracle, MySQL) into the Snowflake DW and created tasks to run the process.
- Created file formats, Snowpipes, stages, and SQS/S3 event notifications for continuous data loads.
- Zero-copy cloning: cloning objects within the account and from one account to another.
- Setting up data pipelines from MS SQL and Oracle systems to Snowflake.
- Studied the source and target systems and the nature of the data load/extract; identified ETL connection requirements, privileges, and access requirements for the source/target systems.
- Architected the ETL process flow document based on the requirements and created/designed the ETL solution accordingly (highlighting data lift strategy, staging requirements if any, exception handling, performance, and reusability).
- Designed and created temporary table structures, data seed requirements, and the error/data rejection strategy.
- Designed and created SCD Type-2 and Type-1 loads to process deltas between the Oracle source and Oracle target, developing ETL Informatica mappings (involving Joiner, Lookup, Expression, Router, Rank, Source Qualifier, Filter, Sorter, and Update Strategy transformations) to transform data to the target.
- Coded PL/SQL procedures and triggers to initiate and load data into certain dimensions per business needs.
- Coded data validation PL/SQL procedures implementing the validation logic for post-load validation using MINUS queries, writing the results to validation logs.
- Defined the base objects, staging tables, foreign key relationships, static lookups, dynamic lookups, queries, packages, and query groups; configured Address Doctor for worldwide address cleansing and enhanced it with modifications during installation.
- Used the Hierarchies tool to configure entity base objects, entity types, relationship base objects, relationship types, profiles, and put/display packages, and used the entity types as subject areas in IDD.
- Coordinated with the test team to support system integration testing and to debug and fix defects reported in the SIT environment.
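As referenced in the continuous-ingestion bullets above, a minimal sketch of the Snowpipe setup: file format, external stage over S3, and an auto-ingest pipe whose SQS notification channel is attached to the bucket's event notifications. Object names, the storage integration, the bucket, and the credentials are placeholders; it uses the Snowflake Python connector.

```python
# Illustrative only: placeholder names, bucket, integration, and credentials.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="...",   # placeholder credentials
    warehouse="LOAD_WH", database="RAW", schema="LANDING",
)
cur = conn.cursor()

# Landing table whose columns mirror the incoming file (placeholder schema).
cur.execute("CREATE TABLE IF NOT EXISTS raw_events (event_id STRING, payload STRING, src_file STRING)")

cur.execute("""
CREATE FILE FORMAT IF NOT EXISTS ff_csv
  TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY = '"' SKIP_HEADER = 1
""")

cur.execute("""
CREATE STAGE IF NOT EXISTS stg_raw
  URL = 's3://example-raw-bucket/landing/'          -- placeholder bucket
  STORAGE_INTEGRATION = s3_int                      -- assumes a pre-created storage integration
  FILE_FORMAT = (FORMAT_NAME = 'ff_csv')
""")

cur.execute("""
CREATE PIPE IF NOT EXISTS pipe_raw AUTO_INGEST = TRUE AS
  COPY INTO raw_events FROM @stg_raw FILE_FORMAT = (FORMAT_NAME = 'ff_csv')
""")

# SHOW PIPES returns the pipe's notification_channel (an SQS ARN); that queue is
# wired to the S3 bucket's event notifications so new files load automatically.
cur.execute("SHOW PIPES LIKE 'pipe_raw'")
print(cur.fetchall())
```

Downstream, a stream and task can pick up the raw rows and merge them into the target table, matching the raw-to-target flow described above.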
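And a minimal sketch of the parallel-load idea from the Python-framework bullet: one COPY INTO per table fanned out across a thread pool, each worker on its own connection. Table names, stage paths, and credentials are illustrative.

```python
# Illustrative only: placeholder tables, stage, and credentials.
from concurrent.futures import ThreadPoolExecutor, as_completed

import snowflake.connector

TABLES = ["CUSTOMER", "ACCOUNT", "TRANSACTIONS"]  # placeholder table list


def copy_table(table: str) -> str:
    # Each worker opens its own connection; Snowflake runs the COPY statements concurrently.
    conn = snowflake.connector.connect(
        account="my_account", user="my_user", password="...",  # placeholder credentials
        warehouse="LOAD_WH", database="EDW", schema="STG",
    )
    try:
        conn.cursor().execute(
            f"COPY INTO {table} FROM @stg_raw/{table.lower()}/ "
            "FILE_FORMAT = (FORMAT_NAME = 'ff_csv') ON_ERROR = 'ABORT_STATEMENT'"
        )
        return f"{table}: loaded"
    finally:
        conn.close()


with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(copy_table, t) for t in TABLES]
    for f in as_completed(futures):
        print(f.result())
```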
Confidential, St Louis, MO
Integration Lead
Responsibilities:
- Involved in creating the data model, defining facts and dimensions.
- Studied the source and target systems and the nature of the data load/extract; identified ETL connection requirements, privileges, and access requirements for the source/target systems.
- Created/designed the ETL solution document mapping the entire extract, transform, and load approach.
- Designed data seed requirements and the error/data rejection strategy.
- Designed and created SCD Type 1/Type 2 and hybrid loads to process deltas (see the SCD Type-2 sketch after this list).
- Developed ETL Informatica mappings (XML Parser, XML Generator, PL/SQL transformation, SQL transformation, Joiner, Lookup, Expression, Router, Rank, Source Qualifier, Filter, Sorter, and Update Strategy transformations) to transform data to the target.
- Wrote shell scripts, SQL*Loader control files, and SQL Execute scripts to run the Informatica PowerCenter workflows.
- Wrote wrapper functions to handle most of the connection and error-handling tasks.
- Wrote shell scripts to parse metadata and intelligently create and load the target system, and to create automated queries via the application.
- Created DDL and DML scripts (create table, alter indexes, grant privileges) to implement the data model in all environments.
- Scheduled jobs using AutoSys JIL, with dependencies, triggers, and file watchers.
- Created Informatica data validation objects and scheduled post-load jobs for validation.
- Performed unit testing, reviewed the UTR with the review panel, and supported SIT post-deployment.
- Participated in data governance meetings, suggested improvements to the Informatica IMM model, and reviewed data lineage.
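As referenced in the SCD bullet above, a minimal sketch of the two-step Type-2 delta pattern that the Informatica mappings implement, expressed here as plain SQL run from Python for clarity. Table, column, and sequence names are illustrative, not the project's model; NULL-safe comparisons and Type-1 columns are omitted for brevity (uses cx_Oracle).

```python
# Illustrative only: placeholder tables, columns, sequence, and connection.
import cx_Oracle

# Step 1: close out the current version of any key whose tracked attributes changed.
EXPIRE_CHANGED = """
UPDATE dim_customer d
   SET d.eff_end_dt = SYSDATE,
       d.current_flag = 'N'
 WHERE d.current_flag = 'Y'
   AND EXISTS (SELECT 1
                 FROM stg_customer s
                WHERE s.customer_id = d.customer_id
                  AND (s.cust_name <> d.cust_name OR s.segment <> d.segment))
"""

# Step 2: open a new current version for every staged key without a current row
# (covers brand-new keys and the rows just expired in step 1).
INSERT_CURRENT = """
INSERT INTO dim_customer
       (customer_sk, customer_id, cust_name, segment, eff_start_dt, eff_end_dt, current_flag)
SELECT dim_customer_seq.NEXTVAL, s.customer_id, s.cust_name, s.segment,
       SYSDATE, DATE '9999-12-31', 'Y'
  FROM stg_customer s
 WHERE NOT EXISTS (SELECT 1
                     FROM dim_customer d
                    WHERE d.customer_id = s.customer_id
                      AND d.current_flag = 'Y')
"""


def apply_scd2(conn) -> None:
    cur = conn.cursor()
    cur.execute(EXPIRE_CHANGED)
    cur.execute(INSERT_CURRENT)
    conn.commit()


if __name__ == "__main__":
    apply_scd2(cx_Oracle.connect(user="etl", password="...", dsn="dwh"))  # placeholder connection
```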
Confidential, NY, NY
ETL Lead
Responsibilities:
- Delegated work and led, mentored, and coordinated with the offshore team.
- Built the ETL approach and process flow, covering data lift strategy, profiling, staging, exception handling, performance, reusability, incremental strategy, data seed strategy for downstream web services, business profiling and data quality (DQ), and the source-to-target mapping document.
- Extracted, transformed, and integrated organization/debt/rating data from various databases/schemas (including the Golden Source database) and built hierarchies by denormalizing the core data.
- Created ETL Informatica mappings (involving Joiner, Lookup, Expression, Router, Rank, Source Qualifier, Filter, Sorter, and Update Strategy transformations) to transform data to the target, refactoring/retrofitting iteratively to keep the ETL agile.
- Built the delta approach (see the sketch after this list) and unit tested the ETL in the development environment before moving to UAT; followed the test cycle and provided support.
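As referenced in the delta bullet above, a minimal sketch of a watermark-driven delta extract: a control table holds the last successful extract timestamp per source, and each run pulls only rows changed since then. The control table, source table, and audit column are assumptions, not the project's actual design (uses cx_Oracle).

```python
# Illustrative only: placeholder control table, source table, and audit column.
import cx_Oracle


def extract_delta(conn, source_table: str):
    cur = conn.cursor()

    # Read the watermark recorded by the previous successful run.
    cur.execute(
        "SELECT last_extract_ts FROM etl_control WHERE source_table = :t",
        t=source_table,
    )
    (watermark,) = cur.fetchone()

    # Pull only rows changed since the watermark (assumes an audit column on the source).
    cur.execute(
        f"SELECT * FROM {source_table} WHERE last_update_ts > :wm",
        wm=watermark,
    )
    rows = cur.fetchall()

    # Advance the watermark once the delta has been handed off downstream.
    cur.execute(
        "UPDATE etl_control SET last_extract_ts = SYSTIMESTAMP WHERE source_table = :t",
        t=source_table,
    )
    conn.commit()
    return rows
```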
Environment: Informatica PowerCenter 10.x, Informatica PowerExchange, Oracle 11g, MS SQL, Unix shell scripting (PuTTY/WinSCP), web services, XML, Visio, AutoSys scheduler, HP ALM Quality Center.
Confidential, St Louis, MO
ETL Lead
Responsibilities:
- Followed the complete SDLC process, designing the incremental strategy, reusability, error handling, and bottleneck analysis.
- Created complex ETL mappings (SCD, both Type 1 and Type 2) to load data using transformations such as Source Qualifier, Sorter, Aggregator, Expression, Joiner, Dynamic Lookup, Connected and Unconnected Lookup, Filter, Sequence, XML Parser, Router, and Update Strategy; identified the bottlenecks in the sources, targets, mappings, and sessions and resolved them.
- Captured metadata, capacity plans, and performance metrics for all the queries being created; defined the archival strategy and provided guidance on performance tuning.
- Other duties included performance tuning to reduce the total ETL process time, conducting and reviewing unit testing and system integration testing, and coordinating with the business on UAT (across different environments: Dev and Clone).
- Involved in a pilot project to analyze data migrations to and from Hadoop (Hive); see the sketch after this list.
- Used Informatica PowerExchange and Informatica Developer (both methods, for comparison) to establish connectivity through the Hadoop Hive connector.
- Moved data to and from (as source and target) different nodes, starting from file-based data from legacy systems.
- Used Hive to add structure to datasets stored in HDFS (Hadoop) - pilot project.
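The pilot itself used Informatica PowerExchange/Developer for the Hive connectivity; as a companion to the bullets above, this is a minimal PySpark sketch of the equivalent Hive-over-HDFS interaction (defining a table over files already in HDFS and reading it back). Paths, schema, and table names are placeholders.

```python
# Illustrative only: placeholder HDFS path, schema, and table names.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("hive-pilot")
    .enableHiveSupport()   # lets Spark create and query Hive metastore tables
    .getOrCreate()
)

# Add structure to delimited files already sitting in HDFS.
spark.sql("""
CREATE EXTERNAL TABLE IF NOT EXISTS legacy_claims (
    claim_id     STRING,
    claim_amount DOUBLE,
    claim_date   STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
LOCATION 'hdfs:///data/legacy/claims'
""")

# Read the structured data back and move a filtered copy to a managed target table.
df = spark.table("legacy_claims")
df.filter(df.claim_amount > 0).write.mode("overwrite").saveAsTable("stg_claims")
```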
Confidential
Project Lead
Responsibilities:
- Responsible for onshore and offshore coordination, assigning and tracking tasks with the offshore team, resolving issues by coordinating between the client, architects, business, and team, and bridging gaps to ensure delivery as expected per scope.
- Developed the physical data model for ETL source (DB2/flat file/VSAM/XML) to target mapping and the process flow diagrams for all business functions.
- Converted the data mart from logical to physical design; defined data types, constraints, and indexes; generated the schema in the database; created automated scripts; and defined storage parameters for the database objects.
- Set up PowerExchange connections to databases and legacy files.
- Developed mappings (using Unconnected Lookup, Sorter, XML Parser, Aggregator, Dynamic Lookup, and Router transformations), sessions, and workflows to transfer data from multiple sources to staging and then from staging to the data warehouse.
- Worked on the best approach to get incremental/delta data from the source for loading into the staging area.
- Worked with developers, DBAs, and systems support personnel to promote and automate successful code to production.
Environment: Informatica PowerCenter 9.6.1/PowerExchange, Erwin, Oracle 11g, flat files, sequential files, SQL, PL/SQL, SPUFI, Platinum, SQL*Plus, IDMS, DB2, VSAM, XML, UNIX (WinSCP).
Confidential
Project Lead
Responsibilities:
- Extensively used SQL Server Integration Services (SSIS) for extracting, transforming, and loading data from sources including Oracle, DB2, and flat files; collaborated with the EDW team on high-level design documents for the extract, transform, validate, and load process, data dictionaries, metadata descriptions, file layouts, and flow diagrams.
- Created SSIS packages to transfer data from multiple sources to staging and then from staging to the data warehouse.
- Developed complex reports with SSRS using multiple data providers, user-defined objects, aggregate-aware objects, charts, and synchronized queries.
- Created multiple QlikView dashboards for analytics, giving end users drill-down, drill-up, and slice-and-dice options.
Environment: SSIS 2008 R2, SQL Server 2008 R2, QlikView
Confidential (MIS - Transformation, City Century), Pune, India
Project Lead/Scrum Master/ETL Lead
Responsibilities:
- Independently handled a multi-platform, multi-location project (team size: 20).
- Worked with the DWH team to create the warehouse that serves as input to the MST development.
- Worked with data modelers to create the data model/DWH design that kicks off the ETL development process.
- Involved in ETL design (Informatica PowerCenter), implementing SCD1 and SCD2, load plans, and various transformations such as Union, Derived Column, Aggregator, Lookup, Sorter, etc.
- Involved in dashboard design and architecting visualizations using MicroStrategy.
- Project Management and coordination activities:
- Managed and coordinated with teams at different locations, driving toward technical consensus; coordinated with various application teams at Citi on knowledge transfer activities and facilitated calls/meetings as required.
- Prepared the project plan (MPP); responsible for project (sprint) and resource planning.
- Set up the dependencies and critical path for the project.
- Assigned resources to the various tasks and ensured timelines were met according to the plan.
- With the project being agile in nature: rigorous tracking/monitoring of all tasks, risks, and dependencies, and maintaining the impediment and action log.
Environment: Informatica PowerCenter 9.x, DB2, MPP, MicroStrategy, Core Java, Erwin, SQL*Plus, Unix.