
Big Data Architect Resume


San Antonio, TX

SUMMARY

  • Over 10 years of experience in the IT industry, extensively involved in analysis, design, testing, implementation, maintenance, support and knowledge transfer.
  • Expertise in Linux, SQL & PL/SQL, Hadoop, HDFS, Hive, Spark, Storm and Kafka.
  • Experienced in working with Big Data and the Hadoop Distributed File System (HDFS).
  • Strong knowledge of Hadoop and Hive, including Hive's analytical functions.
  • Experience in developing MapReduce programs using Apache Hadoop to process Big Data.
  • Worked on AWS components such as EC2, S3, Redshift, EC2 Container Service, Elastic File System, Lambda and RDS, as well as other storage services.
  • Worked with development teams to integrate their applications into the AWS production environment and support their requirements.
  • Worked on NoSQL databases including Cassandra.
  • Strong DWH & ETL experience with Informatica Power Center 9.1/8.x/7.x/6.x development, design, processes and best practices as an ETL Analyst and Developer.
  • Proven experience with the ETL development and implementation of large-scale enterprise data warehouse and reporting solutions.
  • Experienced in all phases of the software development life cycle (SDLC).
  • Documented business requirements, discussed issues to be resolved and translated user input into ETL design documents.
  • Involved in the ETL/ELT process from development through testing and production environments.
  • Used Power Exchange CDC to capture changes in the source system and load those changes into the customer dimension.
  • Used Informatica Data Quality tool to analyze the data from new customers in the initial phase of the ETL process build.
  • Expertise in tuning the DW environment, including Oracle SQL query tuning, database tuning and Informatica Power Center tuning: identifying and removing bottlenecks, rewriting queries, and tuning mappings and sessions.
  • Designed source-to-Confidential maps and handled code migration, version control, scheduling tools, auditing, shared folders, data movement and naming in accordance with ETL best practices, standards and procedures.
  • Managed deliverables and activities such as UAT, and coordinated with other groups (data center, production/change management, etc.) to implement various releases.
  • Extensive database experience; highly skilled in SQL Server, Oracle, XML files, flat files and MS Access.
  • Strong troubleshooting, problem-solving, analytical and design skills.
  • Experience in dimensional data modeling using Star and Snowflake schemas.
  • Extensively worked on ETL Informatica transformations.
  • Expertise in developing PL/SQL packages, stored procedures/functions and triggers.
  • Strong documentation skills using Microsoft Word and Excel.
  • Effective communication, professional attitude, strong work ethic and interpersonal skills.

TECHNICAL SKILLS

Data Warehousing/ ETL: Informatica Power Center 9.1, Power Exchange, Talend, Oracle Warehouse Builder, Informatica TDM

Big Data Technologies: Storm, Kafka, Hadoop, Sqoop, Hive, Spark, Cassandra, Solr

Reporting Tools: Business Objects, SSRS, OBIEE 11g

Databases & Tools: Oracle 10g/9i/8i, DB2, SQL Server 2010/2005, Teradata, Vertica, Netezza

Scheduling Tools: Tidal

Programming Languages: Unix Shell Scripting, SQL, PL/SQL, Perl, Visual Basic

Web Tools: HTML, ASP, JavaScript, .NET

Environment: UNIX, Win XP/NT 4.0

PROFESSIONAL EXPERIENCE

Confidential, San Antonio, TX

Big Data Architect

Responsibilities:

  • Designed Hive data models to migrate data from DB2 to the Hortonworks platform.
  • Designed Sqoop imports into HDFS (see the sketch below).
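
A minimal sketch of the kind of Sqoop import behind that design, assuming a DB2 source and a Hive staging schema; the connection string, credentials file, table and directory names are all hypothetical.

    # Hypothetical example: land a DB2 table in HDFS and register it as a Hive table.
    sqoop import \
      --connect jdbc:db2://db2host:50000/SALESDB \
      --username etl_user --password-file /user/etl/.db2pwd \
      --table CUSTOMER \
      --split-by CUSTOMER_ID \
      --num-mappers 4 \
      --target-dir /data/staging/customer \
      --hive-import \
      --hive-table staging.customer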

Confidential, Irving, TX

Hadoop/ ETL Architect

Responsibilities:

  • Devised and implemented the next-generation architecture for more efficient data ingestion and processing.
  • Designed Storm topologies to read data from MQ and Kafka and load it into the data warehouse and Cassandra in real time.
  • Designed and developed Spark jobs to preprocess the data for analysis (see the sketch after this list).
  • Experience in importing and exporting data using Sqoop between HDFS and relational database systems.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from Vertica into HDFS using Sqoop.
  • Analyzed the data by running Hive queries and Spark jobs to understand user behavior.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Created ETL processes to load data from multiple sources into the data warehouse and data marts.
  • Built ETL load mechanisms using tools such as Kafka, Storm and other open-source technologies.
  • Developed Hive queries for the analysts.
  • Performed data migration from Netezza to the Vertica database.
  • Rewrote and modified existing shell scripts for ongoing data loads.
  • Created, updated and maintained ETL technical documentation.
  • Replicated and modified the data model in Vertica.
  • Performed data comparisons between legacy and new data marts.
  • Led the team of ETL developers in building the data marts.
  • Worked with a team of report developers to help them understand the data loaded into the data marts.
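
A minimal sketch of how such a Spark preprocessing job and a follow-up Hive analysis might be launched from the shell, assuming a YARN cluster; the script, paths, table and column names are hypothetical.

    # Hypothetical example: submit the Spark preprocessing job on YARN,
    # then run a Hive aggregation over the cleaned data.
    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --num-executors 10 \
      --executor-memory 4g \
      preprocess_clickstream.py /data/raw/clicks /data/clean/clicks

    hive -e "
      SELECT user_id, COUNT(*) AS events, MAX(event_ts) AS last_seen
      FROM clean.clickstream
      GROUP BY user_id
      ORDER BY events DESC
      LIMIT 100;"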

Environment: Hadoop, AWS, HDFS, Hive, Java, SQL, Storm, Kafka, Pig, Sqoop, Oozie, Vertica, Business Objects, UNIX Shell Scripting, Cassandra

Confidential, San Antonio, TX

ETL Architect

Responsibilities:

  • Designed an ETL framework to maintain and monitor enterprise-level ETL processes.
  • Designed and developed test plans for ETL unit testing and integration testing using Informatica TDM.
  • Used PCI and PII policies to achieve data masking while copying data from production to test using TDM.
  • Provisioned datasets based on test case scenarios.
  • Created synthetic data to fill production data gaps identified for testing.
  • Maintained and created ETL standards across different projects.
  • Created a data model to store the metadata related to the ETL framework.
  • Created validation processes to validate the data produced by the ETL processes.
  • Developed reconciliation tools to compare data between sources and targets (see the sketch after this list).
  • Involved in the agile software development model.
  • Led the team of ETL developers in building the data marts.
  • Created, updated and maintained ETL technical documentation.
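
A minimal sketch of the kind of source-to-target reconciliation check such a tool performs, assuming Oracle on both sides and credentials supplied through environment variables; the connection names, schema and table are hypothetical.

    #!/bin/bash
    # Hypothetical example: compare row counts between a source table and its
    # masked copy in test, and flag any mismatch for the ETL support team.
    SRC_COUNT=$(sqlplus -s "etl_user/${SRC_PWD}@PRODDB" <<< "SET HEADING OFF
      SET FEEDBACK OFF
      SELECT COUNT(*) FROM sales.orders;" | tr -d '[:space:]')
    TGT_COUNT=$(sqlplus -s "etl_user/${TGT_PWD}@TESTDB" <<< "SET HEADING OFF
      SET FEEDBACK OFF
      SELECT COUNT(*) FROM sales.orders;" | tr -d '[:space:]')

    if [ "$SRC_COUNT" != "$TGT_COUNT" ]; then
      echo "RECON FAIL: sales.orders source=$SRC_COUNT target=$TGT_COUNT" >&2
      exit 1
    fi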

Environment: Informatica Power Center, Power Exchange, TDM, SQL Server, Oracle 10g, DB2, PL/SQL, UNIX Shell Scripting, Erwin 4.5, Microsoft Visio, SQL Plus, TOAD, HDFS, Hive, Pig.

Confidential, Charlotte, NC

Sr. Informatica Developer

Responsibilities:

  • Designed and developed ETL mappings using Informatica to extract data from DB2, SQL Server and Oracle, and to load the data into the data warehouse.
  • Used Informatica schedulers to automate the process.
  • Converted legacy ETL code from Business Objects to Informatica.
  • Actively involved in SIT/UAT testing, bug fixing and answering business-related queries.
  • Trained users to use the Informatica tool to monitor, troubleshoot and change existing code to accommodate new business rules.
  • Followed the Software Development Life Cycle (SDLC), including analysis, design, implementation, support and maintenance.
  • Extensively worked on Mapping Variables, Mapping Parameters, Workflow Variables and Session Parameters (see the parameter-file sketch after this list).
  • Created complex Informatica mappings to load data and monitored them.
  • Used heterogeneous files from Oracle, flat files, DB2 and SQL Server as sources and imported stored procedures from Oracle for transformations.
  • Created ETL processes to extract data from mainframe DB2 tables for loading into various Oracle staging tables.
  • Created, updated and maintained ETL technical documentation and prepared FSDs.
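
A minimal sketch of how such a session parameter file might be generated by a nightly load script, assuming the standard PowerCenter parameter-file layout; the folder, workflow, session, connection and file names are hypothetical.

    #!/bin/bash
    # Hypothetical example: build the parameter file for tonight's workflow run.
    # The escaped \$ keep the literal dollar signs that PowerCenter expects.
    RUN_DATE=$(date +%Y%m%d)
    {
      echo "[DWH_Folder.WF:wf_daily_load.ST:s_m_load_customer]"
      echo "\$\$LoadDate=$RUN_DATE"
      echo "\$DBConnection_Source=ORA_SRC"
      echo "\$DBConnection_Target=ORA_DWH"
      echo "\$InputFile_Customer=/data/incoming/customer_${RUN_DATE}.dat"
    } > /infa/params/wf_daily_load.prm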

Environment: Informatica Power Center, Power Exchange, Business Objects Integrator, SQL Server, Oracle 10g, Teradata, DB2, PL/SQL, BTEQ, UNIX Shell Scripting, Erwin 4.5, Microsoft Visio, SQL Plus, TOAD

Confidential, Dallas, TX

Sr. Informatica Developer

Responsibilities:

  • Gathered user requirements, performed source system analysis and established mappings between source and Confidential attributes.
  • Coordinated and managed team members for enterprise-level project deliverables.
  • Translated high-level design specs into simple ETL coding and mapping standards.
  • Designed and developed ETL mappings using Informatica to extract data from flat files and Oracle, and to load the data into the Confidential database.
  • Identified sensitive data locations for consistent masking across the databases.
  • Developed programs to provide conditioned data for user testing and QA teams.
  • Loaded historical data effectively for reporting purposes.
  • Coordinated with offshore groups in implementing projects.
  • Created different source definitions to extract data from flat files and relational databases.
  • Analyzed and fine-tuned PL/SQL scripts.
  • Involved in the ODS design using the Power Exchange change data capture (CDC) ETL process.
  • Used hints effectively to improve process performance.
  • Implemented event waits to coordinate ETL loads from different applications.
  • Extensively worked on Mapping Variables, Mapping Parameters, Workflow Variables and Session Parameters.
  • Created complex Informatica mappings to load data and monitored them. The mappings involved extensive use of transformations like Aggregator, Filter, Router, Expression, Joiner, Union, Normalizer and Sequence generator.
  • Used heterogeneous files from Oracle, flat files, XML and SQL Server as sources and imported stored procedures from Oracle for transformations.
  • Created ETL processes to extract data from mainframe DB2 tables for loading into various Oracle staging tables.
  • Configured the mappings to handle the updates to preserve the existing records using Update Strategy Transformation.
  • Defined Confidential Load Order Plan and Constraint based loading for loading data correctly into different Confidential Tables.
  • Used XML parser, generator and HTTP transformations for the web service requests and responses.
  • Used pmcmd/pmrep commands to run Informatica jobs from the back end (see the sketch after this list).
  • Used predefined shell scripts to run jobs and data loads.
  • Created, updated and maintained ETL technical documentation and prepared FSDs.
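
A minimal sketch of the kind of pmcmd call used to start a workflow from the back end, assuming credentials come from environment variables; the domain, service, folder, workflow and parameter-file names are hypothetical.

    #!/bin/bash
    # Hypothetical example: start a PowerCenter workflow and wait for it to finish
    # before the next step of the load runs.
    pmcmd startworkflow \
      -sv INT_SVC_PROD -d Domain_ETL \
      -u "$INFA_USER" -p "$INFA_PWD" \
      -f DWH_Folder \
      -paramfile /infa/params/wf_daily_load.prm \
      -wait wf_daily_load

    if [ $? -ne 0 ]; then
      echo "wf_daily_load failed" >&2
      exit 1
    fi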

Environment: Informatica Power Center, Power Exchange, SQL Server, Oracle 10g, Teradata, DB2, PL/SQL, BTEQ, UNIX Shell Scripting, Erwin 4.5, Windows XP, Microsoft Visio, SQL Plus, TOAD, SQL Developer

Confidential, TX

Informatica Developer/Analyst

Responsibilities:

  • Gathered requirements from business analysts for the design and development of the system.
  • Developed transformation logic and designed various complex mappings in the Designer for data load and data cleansing.
  • Assessed data to understand its quality challenges using a data profiler.
  • Created Mappings using Mapping Designer to load the data from various sources, using different transformations like Source Qualifier, Expression, Lookup (Connected and Unconnected), Aggregator, Update Strategy, Joiner, Filter, and Sorter transformations.
  • Used effective methods to update the data in large tables.
  • Used Informatica PowerCenter Designer to analyze, extract and transform source data from various source systems (Oracle 10g, Teradata and flat files), incorporating business rules with the objects and functions the tool supports.
  • Implemented Slowly Changing Dimension Type 1 and Type 2 logic for inserting and updating Confidential tables to maintain history.
  • Extensively used the capabilities of Power Center such as File List, pmcmd, Confidential Load Order, Constraint Based Loading, Concurrent Lookup Caches etc.
  • Created and Monitored Workflows using Workflow Manager and Workflow Monitor.
  • Worked on multiple RDBMSs such as Teradata, Oracle and SQL Server to import and integrate source data.
  • Extensively worked on Mapping Variables, Mapping Parameters, Workflow Variables and Session Parameters.
  • Used shortcuts (Global/Local) to reuse objects without creating multiple objects in the repository and inherit changes made to the source automatically.
  • Created post-session and pre-session shell scripts and mail-notifications.
  • Used debugger to test the mapping and fixed the bugs.
  • Developed Unit Test Cases to ensure successful execution of the data loading processes.
  • Used shell and Perl scripts to handle incoming source files, such as moving files from one directory to another and extracting information from the log files on the Linux server (see the sketch after this list).
  • Documented Informatica mappings, design and validation rules.
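
A minimal sketch of the kind of file-handling script used for the incoming source files; the directories, file pattern and log locations are hypothetical.

    #!/bin/bash
    # Hypothetical example: move the day's source files into the landing directory,
    # then pull any load errors out of the session logs for review.
    INCOMING=/data/incoming
    LANDING=/data/landing
    LOGDIR=/infa/sesslogs

    for f in "$INCOMING"/CUST_*.dat; do
      [ -e "$f" ] || continue            # nothing arrived today
      mv "$f" "$LANDING"/ && echo "moved $(basename "$f")"
    done

    grep -h "ERROR" "$LOGDIR"/s_m_*.log > "/tmp/etl_errors_$(date +%Y%m%d).txt"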

Environment: Informatica Power Center 8.6.1, Power Exchange, Teradata, Oracle 11/10g, ERWIN 4.1, SQL, PL/SQL, Toad, Cron, data explorer, SQL Server 2005, Perl 5, Unix, Windows 2003 Server, Tidal

Confidential

ETL Developer

Responsibilities:

  • Responsible for requirement gathering by direct interaction with users.
  • Responsible for data staging design, development and deployment.
  • Developed a proper structure to accommodate the data mart.
  • Designed the star schema, fact tables, dimension tables and hierarchies (see the DDL sketch after this list).
  • Created various active and passive transformations to support voluminous loading of data into the Confidential table using Informatica PowerCenter.
  • Created catalogs based on the defined specifications for different users using Cognos Impromptu.
  • Determined the folder structure to reflect the business terms.
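
A minimal sketch of the kind of star-schema DDL behind that design, wrapped in a small deployment script; the schema, table and column names are hypothetical.

    #!/bin/bash
    # Hypothetical example: create one dimension and one fact table of the star schema.
    sqlplus -s "dm_owner/${DM_PWD}@DMARTDB" <<< "
    CREATE TABLE dim_customer (
      customer_key   NUMBER        PRIMARY KEY,
      customer_id    VARCHAR2(20)  NOT NULL,
      customer_name  VARCHAR2(100),
      region         VARCHAR2(50)
    );
    CREATE TABLE fact_sales (
      date_key       NUMBER NOT NULL,
      customer_key   NUMBER NOT NULL REFERENCES dim_customer (customer_key),
      product_key    NUMBER NOT NULL,
      sales_amount   NUMBER(12,2),
      quantity       NUMBER
    );"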

Environment: Oracle 9i, Informatica Power Center, Unix
