
Big Data - Hadoop Developer/ Etl Lead Developer & Data Architect Resume

KS

PROFESSIONAL SUMMARY:

  • Over 10 years of ETL development and data analysis (data warehouse implementation and development) experience in the Retail, Healthcare, and Banking domains.
  • 2 years of work experience as a Hadoop Developer, with good knowledge of the Hadoop framework, the Hadoop Distributed File System, and parallel processing implementations.
  • Experience with Hadoop ecosystem components: HDFS, MapReduce, Hive, Pig, HBase, Sqoop, and AWS.
  • Excellent understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
  • Hands-on experience importing and exporting data using the Hadoop data management tool Sqoop.
  • In depth understanding of Data Structures and Algorithms.
  • Strong experience in writing MapReduce programs for data analysis; hands-on experience in writing custom partitioners for MapReduce.
  • Performed data analysis using Hive and Pig.
  • Having complete SDLC experience and deployed critical processes within timelines.
  • Strong production support experience, handling more than three projects in parallel.
  • Knowledge about Software Development Lifecycle (SDLC), Agile, Application Maintenance Change Process (AMCP).
  • Expertise on Mapping Designer, Workflow Manager, Repository Manager and Workflow Monitor. Experienced in overall Data Warehouse, Database, ETL and performance tuning.
  • Involved in preparing ETL mapping specification documents and Transformation rules for the mapping.
  • Implemented various ETL solutions per the business requirements using Informatica 9.x/8.x.
  • Extensively worked on Informatica Power Center transformations such as Source Qualifier, Lookup, Filter, Expression, Router, Normalizer, Joiner, Update Strategy, Rank, Aggregator, Stored Procedure, Sorter, and Sequence Generator.
  • Experienced in data quality definition, data modeling, data warehousing, and Star/Snowflake schema design, covering requirements analysis and definition, database design, testing, implementation, and quality processes.
  • In-depth knowledge and understanding of dimensional modeling (Star and Snowflake schemas, SCD Types 1, 2, and 3) at the logical and physical levels.
  • Involved in performance tuning of Informatica mappings, sessions, and SQL queries.
  • Expertise in integration of various data source definitions like Netezza, SQL Server, Oracle, Flat Files, Excel, SAP, and XML.
  • Experienced with Metadata Manager for generating lineage and loading Metadata Manager resources.
  • Extensively worked with SQL*Loader for bulk loading into the Netezza database.
  • Experience in the complete life cycle of test case design, test plans, test execution, and defect management.
  • Experience in UNIX Shell and Perl Scripting.
  • Excellent communication and interpersonal skills, ability to learn quickly, good analytical reasoning and high adaptability to new technologies and tools.
  • Team player with good interpersonal and problem solving skills, ability to work in team and work independently.

TECHNICAL SKILLS:

ETL Tools: Informatica Power Center 10.1/9.6.1/9.1/8.6.1, Power Exchange and Power Connect

Databases: Netezza, Oracle 11g/10g/9i, SQL Server 2012/2008, and Teradata.

BI Reporting Tools: Oracle Business Intelligence (OBIEE 11.1.1.5/11.x, OBIEE 10.1.3.4.1/10.x), Dashboards, BI Publisher, BI Administration.

Methodologies: Ralph Kimball dimensional Modeling, Data warehousing

Big Data, Hadoop, AWS, Azure: Hadoop, Hive, Pig, Spark, Flume, Sqoop, HBase, Kafka, Redshift, Azure PDW

Cloud Computing: AWS, S3, EC2, VPC, EBS, Snowball, Load Balancer, Auto Scaling, RDS, Redshift, SQS, SQN, Lambda

Languages: SQL, PL/SQL, HTML, DHTML, XML, UNIX Shell Script, PowerShell.

Data Modeling Tools: MS Office Suite, MS project, MS Visio, Model Right (Data Modeler Tool), Erwin Data Modeler 4.1, UML

Web: Oracle Apex, Microsoft Front-page, HTML, DHTML and XML.

Operating Systems: Linux (RHEL), Windows 8/7/NT, UNIX

Multimedia Tools: Adobe Photoshop and Dreamweaver.

PROFESSIONAL EXPERIENCE:

Confidential, KS

Big Data - Hadoop developer/ ETL Lead Developer & Data Architect

Responsibilities:

  • Served as Technical Lead, Business Analyst, and Hadoop Developer.
  • Evaluate business requirements and prepare detailed specifications that follow project guidelines required to develop written programs.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Analyze large amounts of data sets to determine optimal way to aggregate and report on it.
  • Develop simple to complex MapReduce jobs using Hive to cleanse and load downstream data.
  • Handle importing of data from various data sources, perform transformations using Hive and MapReduce, load data into HDFS, and extract data from MySQL into HDFS using Sqoop (see the Sqoop sketch after this list).
  • Export the analyzed data from Hive tables to SQL databases using Sqoop for visualization and to generate reports for the BI team.
  • Extensively used Hive for data cleansing.
  • Create partitioned tables in Hive; manage and review Hadoop log files.
  • Involved in creating Hive tables, loading with data and writing Hive queries, which will run internally in MapReduce way.
  • Use Hive to analyze the partitioned and bucketed data and compute various metrics for reporting (see the Hive sketch after this list).
  • Used UNIX bash scripts to validate files moved from the UNIX file system into HDFS.
  • Load and transform large sets of structured, semi-structured, and unstructured data, and manage data coming from different sources.
  • Worked on Informatica Power Center tools- Designer, Repository Manager, Workflow Manager and Workflow Monitor.
  • Parsed high-level design specification to simple ETL coding and mapping standards.
  • Designed and customized data models for the data warehouse, supporting real-time data from multiple sources.
  • Involved in building the ETL architecture and Source to Target mapping to load data into Data warehouse.
  • Created mapping documents to outline data flow from sources to targets.
  • Involved in Dimensional modeling (Star Schema) of the Data warehouse and used Erwin to design the business process, dimensions and measured facts.
  • Extracted the data from the flat files and other RDBMS databases into staging area and populated onto Data warehouse.
  • Maintained stored definitions, transformation rules, and target definitions using Informatica Repository Manager.
  • Used various transformations like Filter, Expression, Sequence Generator, Update Strategy, Joiner, Stored Procedure, and Union to develop robust mappings in the Informatica Designer.
  • Developed mapping parameters and variables to support SQL override.
  • Created mapplets to use them in different mappings.
  • Developed mappings to load into staging tables and then to Dimensions and Facts.
  • Used existing ETL standards to develop these mappings.
  • Worked on different Workflow tasks such as Session, Event-Raise, Event-Wait, Decision, E-mail, Command, Worklet, Assignment, and Timer, as well as workflow scheduling.
  • Created sessions and configured workflows to extract data from various sources, transform the data, and load it into the data warehouse.
  • Used SCD Type 1 and Type 2 mappings to update Slowly Changing Dimension tables.
  • Extensively used SQL*Loader to load data from flat files into database tables in Oracle.
  • Modified existing mappings for enhancements of new business requirements.
  • Used Debugger to test the mappings and fixed the bugs.
  • Wrote UNIX shell scripts and pmcmd commands for FTP of files from remote servers and for backup of repositories and folders (see the wrapper-script sketch after this list).
  • Involved in Performance tuning at source, target, mappings, sessions, and system levels.
  • Prepared migration document to move the mappings from development to testing and then to production repositories.
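
As a rough illustration of the Sqoop import/export steps above, the flow reduces to a pair of shell commands. This is a minimal sketch only; the connection strings, credentials file, table names, and HDFS paths are illustrative placeholders, not actual project values.

    # Sqoop sketch -- connection details, tables, and paths are illustrative.
    # Import a MySQL source table into HDFS with four parallel mappers.
    sqoop import \
      --connect jdbc:mysql://dbhost:3306/sales_db \
      --username etl_user --password-file /user/etl/.mysql_pass \
      --table orders \
      --split-by order_id \
      --num-mappers 4 \
      --target-dir /data/staging/orders

    # Export the analyzed Hive output back to a SQL database for the BI team.
    sqoop export \
      --connect jdbc:mysql://dbhost:3306/reporting_db \
      --username etl_user --password-file /user/etl/.mysql_pass \
      --table order_metrics \
      --export-dir /data/warehouse/order_metrics \
      --input-fields-terminated-by '\001'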
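
The partitioned and bucketed Hive tables and the reporting queries mentioned above could look roughly like the following; the database, table, and column names are assumptions for illustration only, not the project's actual DDL.

    # Hive sketch -- schema and names are assumed for illustration.
    hive -e "
    CREATE TABLE IF NOT EXISTS sales.orders_clean (
      order_id    BIGINT,
      customer_id BIGINT,
      amount      DECIMAL(10,2)
    )
    PARTITIONED BY (order_date STRING)
    CLUSTERED BY (customer_id) INTO 32 BUCKETS
    STORED AS ORC;

    -- Cleanse and load one day of data from the raw staging table into a partition
    INSERT OVERWRITE TABLE sales.orders_clean PARTITION (order_date='2017-06-01')
    SELECT order_id, customer_id, amount
    FROM sales.orders_raw
    WHERE order_date = '2017-06-01';

    -- Daily metrics computed for the BI extract
    SELECT order_date, COUNT(*) AS orders, SUM(amount) AS revenue
    FROM sales.orders_clean
    WHERE order_date = '2017-06-01'
    GROUP BY order_date;
    "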
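
The wrapper script described in the bullet above (file transfer, workflow kick-off, repository backup) might be sketched as follows. The host, integration service, domain, folder, and workflow names are placeholders, and scp stands in here for the FTP step.

    #!/bin/bash
    # Wrapper-script sketch -- hosts, service/domain/folder/workflow names are placeholders.

    # Fetch the daily source file from the remote server (scp standing in for FTP)
    scp etl_user@remote-host:/outbound/daily_orders.csv /data/inbound/

    # Start the Informatica workflow with pmcmd and wait for it to finish
    pmcmd startworkflow \
      -sv INT_SVC -d DOMAIN_DEV -u "$INFA_USER" -p "$INFA_PASS" \
      -f SALES_DW -wait wf_load_orders

    # Back up the repository after a successful run
    pmrep connect -r REP_DEV -d DOMAIN_DEV -n "$INFA_USER" -x "$INFA_PASS"
    pmrep backup -o /backups/REP_DEV_$(date +%Y%m%d).rep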

Environment: Informatica Power Center 10.1, Netezza, Oracle 11gR1, AIX 6.1/Linux RHEL 7.x, SQL Server 2016/2014, SQL Navigator, SQL, PL/SQL, SAP, Autosys.

Confidential, KS

Informatica ETL Lead Developer/ Data analyst & Architect

Responsibilities:

  • Involved in the Technical business analysis of the user requirements and identifying the sources.
  • Created technical specification documents based on the requirements using source-to-target (S2T) documents.
  • Involved in the preparation of high-level design and low-level design documents.
  • Involved in Design, analysis, Implementation, Testing and support of ETL processes for Stage, ODS and Mart.
  • Prepared ETL standards, naming conventions and wrote ETL flow documentation for Stage, ODS and Mart.
  • Followed the Ralph Kimball (bottom-up) data warehouse methodology, in which individual data marts such as the Shipment, Job Order Cost, Net Contribution, and Detention & Demurrage marts provide views into organizational data and are later combined into the Management Information System (MIS).
  • Prepared a Level 2 update plan to assign work to team members and track the status of each task.
  • Designed and developed Informatica Mappings and Sessions based on business user requirements and business rules to load data from source flat files and oracle tables to target tables.
  • Worked on various kinds of transformations like Expression, Aggregator, Stored Procedure, Lookup, Filter, Joiner, Rank, Router and Update Strategy.
  • Developed reusable Mapplets and Transformations.
  • Used debugger to debug mappings to gain troubleshooting information about data and error conditions.
  • Involved in monitoring the workflows and in optimizing the load times.
  • Used Change Data Capture (CDC) to simplify ETL in data warehouse applications.
  • Involved in writing procedures, functions in PL/SQL.
  • Worked with the SQL*Loader tool to bulk load data into the database (see the load-script sketch after this list).
  • Prepared UNIX shell scripts that are scheduled in Autosys for automatic execution at specific times.
  • Used Rational ClearCase to control versions of all files and folders (check-out, check-in).
  • Prepared test scenarios and test cases in HP Quality Center and was involved in unit testing of mappings, system testing, and user acceptance testing.
  • Tracked defects and reported on them using Rational ClearQuest.
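
The SQL*Loader bulk load and its Autosys-scheduled shell wrapper could be sketched as below. The control-file layout, staging table, file paths, and Oracle connect string are assumptions for illustration only.

    #!/bin/bash
    # SQL*Loader sketch -- control-file layout, table, and connect string are illustrative.
    cat > /tmp/stg_orders.ctl <<'EOF'
    LOAD DATA
    INFILE '/data/inbound/daily_orders.csv'
    APPEND INTO TABLE stg_orders
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
    (order_id, customer_id, order_date DATE "YYYY-MM-DD", amount)
    EOF

    sqlldr userid=etl_user/"$ORA_PASS"@DWHDEV \
           control=/tmp/stg_orders.ctl \
           log=/logs/stg_orders.log \
           bad=/logs/stg_orders.bad \
           errors=100

    # A non-zero exit code is passed back to the Autosys job so failures surface there.
    exit $?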

Environment: Informatica Power Center 9.6, Flat files, Oracle 11g, SQL Server 2014/2012, Windows 7/NT, UNIX/Linux, Autosys and Netezza.

Confidential

Senior ETL Developer & Data Analyst

Responsibilities:

  • Data analysis for the complete project life cycle and development.
  • Interacted with product owners & DBA teams to design the project for ETL process.
  • Responsible for development, support, and maintenance of the ETL (Extract, Transform and Load) processes using Informatica Power Center.
  • Integrated heterogeneous data sources such as Oracle, DB2, SQL Server, and flat files (fixed-width and delimited) into the staging area.
  • Wrote SQL-Overrides and used filter conditions in source qualifier thereby improving the performance of the mapping.
  • Designed and developed mappings using Source Qualifier, Expression, Lookup, Router, Aggregator, Filter, Sequence Generator, Stored Procedure, Update Strategy, joiner and Rank transformations.
  • Managed the Metadata associated with the ETL processes used to populate the Data Warehouse.
  • Used debugger to validate the mappings and gain troubleshooting information about data and error conditions.
  • Extensively used UNIX scripting and scheduled pmcmd and pmrep commands to interact with the Informatica server from command mode.
  • Implemented performance tuning techniques by identifying and resolving the bottlenecks in source, target, transformations, mappings and sessions to improve performance.
  • Troubleshot production failures and provided root cause analysis; worked on emergency code fixes to production.

Environment: Informatica Power Center 9.6, Netezza, Oracle 11gR1, AIX 6.1/Linux RHEL 7.x, SQL Server 2016/2014, SQL Navigator, SQL, PL/SQL, SAP, Autosys.

Confidential

ETL/Informatica Developer

Responsibilities:

  • Document all technical and system specifications for ETL processes, perform unit tests on all processes, and prepare the required programs and scripts.
  • Provide technical knowledge of Extract/Transform/Load (ETL) solutions for Business Intelligence projects
  • Work closely with project Business Analyst, Data Modeler and BI Lead to ensure that the end to end designs meet the business and data requirements
  • Develop the task plan for ETL developers for a specific project
  • Ensure the ETL code delivered is running, conforms to specifications and design guidelines
  • Proficient with SQL for user-defined database extract and update statements; SQL and database skills are the two great building blocks for ETL work.
  • Understands the range of options and best practices for common ETL design techniques such as change data capture, key generation, and optimization.
  • Develops mappings, sessions, and workflows considering all dependencies and maintaining all project standards.
  • Performed Data Profiling, Data Quality and Used Erwin for data modeling.
  • Extensively involved in Data Extraction, Transformation and Loading (ETL process) from Source to target systems using Informatica Power Center
  • Owned the assigned reports, worked on them and updated the Report Development Scheduler for status on each report.
  • Responsible for determining the bottlenecks and fixing the bottlenecks with performance tuning.
  • Analyzed business process workflows and developed ETL procedures to move data from various source systems to target systems

Environment: Informatica Power Center 9.0.1/8.6.1, Netezza, Oracle 9i, Toad, Control-M, UNIX, Flat Files.

Confidential

ETL/ Informatica Developer

Responsibilities:

  • Analyzed the requirements and prepared the initial level TSD.
  • Designing and Building the structure of the maps and workflows.
  • Developed the ETL processes using Informatica tool to load data from text file into the Staging area.
  • Developed mappings/sessions using Informatica 8.6.1 for data loading.
  • Developed BTEQ scripts to load data from staging to the landing zone (see the BTEQ sketch after this list).
  • Created various transformations such as Source Qualifier, Aggregator, Joiner, Expression, Lookup, and Sorter.
  • Loaded data through BTEQ scripts from the CSA to the Edward area.
  • Extensively used Toad utility for executing SQL scripts and worked on SQL for enhancing the performance of the conversion mappings
  • Created test cases for the developed mappings and then created the integration testing document.
  • Prepared the error handling document to maintain the error handling process.
  • Automated the Informatica jobs using UNIX shell scripting
  • Involved in analysis and data validations
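
A BTEQ load script of the kind referenced above (staging to landing zone) might look roughly like this; the Teradata logon and the staging/landing table names are placeholders, not project values.

    #!/bin/bash
    # BTEQ sketch -- logon details and table names are placeholders.
    bteq <<'EOF'
    .LOGON tdhost/etl_user,etl_password;

    INSERT INTO landing.orders (order_id, customer_id, order_date, amount)
    SELECT order_id, customer_id, order_date, amount
    FROM   staging.orders_stg
    WHERE  load_date = CURRENT_DATE;

    .IF ERRORCODE <> 0 THEN .QUIT 8;
    .LOGOFF;
    .QUIT 0;
    EOF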

Environment: Informatica Power Center 8.6.1, Teradata, VA Secure FX SSB, UNIX, Flat Files.

Confidential

ETL/ Informatica Developer

Responsibilities:

  • Providing solutions after understanding the requirements
  • Designing and Building the structure of the maps and workflows.
  • Developed the ETL processes using Informatica tool to load data from Teradata into the target flat file.
  • Extensively used SQL scripts/queries for data verification at the backend.
  • Executed SQL queries, stored procedures and performed data validation as a part of backend testing.
  • Used SQL to test various reports and ETL Jobs load in development, testing and production
  • Developed mappings/sessions using Informatica 8.6.1 for data loading. Developed mapping to load the data in Flat File.
  • Created various transformations such as Source Qualifier, Aggregator, Joiner, Expression, Lookup, and Sorter.
  • Generated the extracts from the SSB DW; the reports were FTP'd to the business location as pipe-separated flat files (see the extract sketch after this list).
  • Involved in mapping-level testing and prepared unit test cases (UTC) for every module.
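
The pipe-separated extract and its delivery to the business location can be sketched as below. The warehouse table, report columns, file paths, and hosts are placeholders, and scp stands in for the FTP step.

    #!/bin/bash
    # Extract-and-deliver sketch -- names and hosts are illustrative placeholders.
    bteq <<'EOF'
    .LOGON tdhost/etl_user,etl_password;
    .SET SEPARATOR '|';
    .EXPORT REPORT FILE = /data/outbound/shipment_report.txt;
    SELECT shipment_id, origin, destination, ship_date
    FROM   ssb_dw.shipments
    WHERE  ship_date = CURRENT_DATE;
    .EXPORT RESET;
    .LOGOFF;
    .QUIT 0;
    EOF

    # Deliver the pipe-separated extract to the business location (scp in place of FTP)
    scp /data/outbound/shipment_report.txt business_user@business-host:/inbound/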

Environment: Informatica Power Center 8.6.1, Teradata, VA Secure FX SSB, UNIX, Flat Files.

Confidential

ETL/ Informatica Developer

Responsibilities:

  • Responsible for designing and preparing LLDs and technical design documents for the client and offshore teams, including all transformation-level detail.
  • Developed the mappings, sessions, and workflows considering all dependencies and maintaining all project standards.
  • Performed Data Profiling, Data Quality and Used Erwin for data modeling.
  • Extensively involved in Data Extraction, Transformation and Loading (ETL process) from Source to target systems using Informatica Power Center
  • Owned the assigned reports, worked on them and updated the Report Development Scheduler for status on each report.
  • Responsible for determining the bottlenecks and fixing the bottlenecks with performance tuning.
  • Analyzed business process workflows and developed ETL procedures to move data from various source systems to target systems

Environment: Informatica Power Center 9.0.1/8.6.1, Netezza, Oracle 9i, Toad, Control-M, UNIX, Flat Files.

Confidential

Software Engineer

Responsibilities:

  • Extensively involved in requirements gathering, writing ETL Specs and preparing design documents
  • Designed and developed Informatica mappings for data sharing between interfaces utilizing SCD Type 2 and CDC methodologies (see the SCD Type 2 sketch after this list).
  • Fixed various performance bottlenecks involving huge data sets by utilizing Informatica's partitioning, pushdown optimization, and SQL overrides.
  • Worked on parameters, variables, procedures, scheduling and pre/post session shell scripts
  • Built sample Microstrategy reports to validate BI requirements and loaded data
  • Designed migration plan and cutover documents; created and monitored Informatica batches
  • Worked on requirement traceability matrix, provided support for integration and user acceptance testing
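
Purely as an illustration of the SCD Type 2 logic referenced above, the expire-and-insert pattern looks roughly like the following. In the project this logic was built as Informatica mappings rather than hand-written SQL, and the table, column, and sequence names here are assumptions.

    #!/bin/bash
    # Illustrative SCD Type 2 logic only -- names are placeholders; the real logic
    # lived in Informatica mappings, not in this SQL.
    sqlplus -s etl_user/"$ORA_PASS"@DWHDEV <<'EOF'
    -- Step 1: expire the current row when a tracked attribute has changed
    UPDATE dim_customer d
    SET    d.current_flag  = 'N',
           d.effective_end = SYSDATE
    WHERE  d.current_flag = 'Y'
    AND    EXISTS (SELECT 1 FROM stg_customer s
                   WHERE  s.customer_id = d.customer_id
                   AND    s.address <> d.address);

    -- Step 2: insert a new current version for changed or brand-new customers
    INSERT INTO dim_customer
      (customer_key, customer_id, address, effective_start, effective_end, current_flag)
    SELECT dim_customer_seq.NEXTVAL, s.customer_id, s.address, SYSDATE, NULL, 'Y'
    FROM   stg_customer s
    WHERE  NOT EXISTS (SELECT 1 FROM dim_customer d
                       WHERE  d.customer_id = s.customer_id
                       AND    d.current_flag = 'Y'
                       AND    d.address = s.address);

    COMMIT;
    EOF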

Environment: Informatica Power Center 8.1, Oracle 8, SQL Developer, UNIX, HP Quality Center.
