Lead Talend Big Data Developer Resume
Farmington, CT
PROFESSIONAL SUMMARY:
- 10+ years of professional experience in data warehouse projects with the Talend ETL suite (DI, DQ, MDM, ESB, Data Mapper, and Big Data).
- Hands-on experience in Talend Big Data with MDM, creating data models, data containers, views, and workflows. Used Talend MDM components such as tMDMInput, tMDMOutput, tMDMBulkLoad, tMDMConnection, tMDMReceive, and tMDMRollback.
- Created complex mappings in Talend 6.4.1 using tMap, tJoin, tReplicate, tParallelize, tJava, tJavaFlex, tAggregateRow, tDie, tWarn, tLogCatcher, etc.
- Talend administrator with hands-on Big Data (Hadoop) experience on the Cloudera platform (Hue).
- Good experience in Talend administration (users, user groups, projects, project authorization, project references, locks, licenses, backups, notifications, software updates, Job Conductor, Big Data streaming, execution plans, servers, monitoring, logging, Activity Monitoring Console, audit, Drools, and migration checks).
- Scheduled jobs in TAC based on business requirements (simple, cron, and file triggers), adjusting context parameters and JVM parameters.
- Created AMC tables and files for LOG, STAT, and FLOW to capture job execution information.
- Expert knowledge in creating test plans, test cases, test scenarios, test strategies, and defect management to ensure quality assurance and verify all business requirements.
- Experience preparing test reports from Quality Center and daily test status reports to communicate test status to the team.
- Experienced in architecting Hadoop clusters and translating functional and technical requirements into detailed architecture and design.
- Worked on analyzing Hadoop clusters and various big data analytical and processing tools including Pig, Hive, Spark, and Spark Streaming.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Migrated various Hive UDFs and queries to Spark SQL for faster execution (see the UDF sketch at the end of this summary).
- Configured Spark Streaming to receive real-time data from Apache Kafka and store the streamed data to Confidential using Scala (see the streaming sketch at the end of this summary).
- Hands-on experience with Spark and Spark Streaming: creating RDDs and applying transformations and actions.
- Developed and implemented custom Hive UDFs involving date functions.
- Used Sqoop to import data from Oracle into Hadoop.
- Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs such as Java MapReduce, Hive, Pig, and Sqoop.
- Experienced in developing transformation scripts in Scala.
- Involved in developing shell scripts to orchestrate execution of all other scripts and move data files within and outside of Confidential.
- Installed and configured Hive, Pig, Sqoop and Oozie on the Hadoop cluster.
- Used Kafka for publish-subscribe messaging as a distributed commit log; experienced with its speed, scalability, and durability.
- Used Oozie to orchestrate the MapReduce jobs that extract data in a timely manner.
- Used tStatsCatcher, tDie, tLogRow to create a generic joblet to store processing stats.
- Created Talend Mappings to populate the data into dimensions and fact tables.
- Broad design, development, and testing experience with Talend Integration Suite and Talend MDM, and knowledge of performance tuning of mappings.
- Proficient in supporting data warehouse ETL activities using queries and functionality of SQL, PL/SQL, SQL*Loader, AWS, and SQL*Plus. Solid experience implementing complex business rules by creating reusable transformations and robust mappings/mapplets using transformations such as Unconnected and Connected Lookups, Source Qualifier, Router, Filter, Expression, Aggregator, Joiner, and Update Strategy.
- Talend Platform Setup on Windows and Linux.
- Experience working on Talend Administration activities and the Talend Data Integration ETL tool.
- Highly experienced with row- and column-oriented databases, SQL performance tuning, and debugging of existing ETL processes.
- Familiar with design and implementation of the Data Warehouse life cycle and excellent knowledge on entity-relationship/multidimensional modeling (star schema, snowflake schema).
- System design and architecture, data warehouse design, and ingestion.
- Analyzed Hadoop clusters and various Big Data analytic tools including Pig, Hive, HBase, and Sqoop. Implemented the Kerberos security authentication protocol for an existing cluster. Technologies: Spark, Spark Streaming, Kafka, Flume, Hive, HBase, Scala, Java, Pig, MapReduce, ZooKeeper, Oozie.
- Development and deployment of staging and data warehouse scripts.
- Experience on MySQL Database and Teradata Administration on Linux and Windows.
- Experience in analysis, design, and development using Amazon Redshift and AWS.
- Extensive expertise in data warehousing and the ETL design and development process using Talend and Informatica PowerCenter.
- Data mart development with PowerCenter and Informatica IDQ (Source Analyzer, Data Warehousing Designer, Mapping Designer, Mapplets, Transformations).
- Experience in Teradata development and performance tuning.
- Perform validation check and deployment of reports to customer’s staging environment.
- Experience in design, installation, creation, and maintenance of databases, performance tuning, backup and recovery, optimization, and database security.
- Experience working with Healthcare EDI 834 transactions.
- Hands-on experience with the Talend Data Integration ETL tool and MapReduce.
- Strong background in row- and column-oriented databases; highly experienced in PL/SQL, performance tuning, and debugging of existing ETL processes.
- Experience in troubleshooting and handling production duties.
- Experience in data quality profiling.
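The Kafka and Spark Streaming bullets above describe a receive-transform-store pattern. As an illustration only, here is a minimal Scala sketch of that pattern using the spark-streaming-kafka-0-10 API; the broker address, topic name, consumer group, and output path are hypothetical placeholders, and the actual storage target is redacted as Confidential in this resume.

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

object KafkaStreamSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("KafkaStreamSketch")
    val ssc  = new StreamingContext(conf, Seconds(10)) // 10-second micro-batches

    // Hypothetical broker, group, and topic; replace with real connection details.
    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "broker1:9092",
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "etl-consumer",
      "auto.offset.reset"  -> "latest"
    )
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("events"), kafkaParams))

    // A transformation (map/filter) followed by an action (saveAsTextFile) per batch RDD.
    stream.map(_.value).filter(_.nonEmpty).foreachRDD { rdd =>
      if (!rdd.isEmpty()) rdd.saveAsTextFile(s"/data/landing/events/batch_${System.currentTimeMillis()}")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```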
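The summary also mentions migrating Hive UDFs (including date functions) into Spark SQL. The following minimal Scala sketch shows that migration pattern; the fiscal_quarter function and the sales.orders table are hypothetical stand-ins, not the actual UDFs from these projects.

```scala
import java.time.LocalDate
import org.apache.spark.sql.SparkSession

object DateUdfSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("DateUdfSketch")
      .enableHiveSupport() // lets the query run against existing Hive tables
      .getOrCreate()

    // Hypothetical date helper standing in for a migrated Hive UDF:
    // maps an ISO yyyy-MM-dd string to a fiscal-quarter label.
    spark.udf.register("fiscal_quarter", (s: String) => {
      val d = LocalDate.parse(s)
      s"FY${d.getYear}-Q${(d.getMonthValue - 1) / 3 + 1}"
    })

    // The registered function replaces the Hive UDF call site one-for-one in SQL.
    spark.sql(
      """SELECT fiscal_quarter(order_date) AS fq, COUNT(*) AS orders
        |FROM sales.orders
        |GROUP BY fiscal_quarter(order_date)""".stripMargin
    ).show()
  }
}
```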
SKILLS:
Languages: C#, Java, C/C++, HTML, SQL, PL/SQL
Databases: Oracle 11g, SQL Server 2008/2012, MongoDB, Amazon Redshift
Tools: Talend 6.4.1/5.6, Informatica 9.1/8.6/8.1/7.1/6.1, Hadoop, Big Data, Spark, Confidential, MapReduce, Pig, Hive
Operating Systems: Windows, Linux, Solaris
Web Technologies: HTML, CSS, JavaScript, Macromedia Dreamweaver
Industry Verticals: Retail, Manufacturing, Healthcare, Telecom, and Finance.
PROFESSIONAL EXPERIENCE:
Confidential, Farmington, CT
Lead Talend Big Data Developer
Responsibilities:
- Acquire and interpret business requirements, create technical artifacts, and determine the most efficient/appropriate solution design, thinking from an enterprise-wide view.
- Worked in the Data Integration Team to perform data and application integration, moving data effectively, efficiently, and with high performance to support business-critical projects involving large-scale data extraction.
- Perform technical analysis, ETL design, development, testing, and deployment of IT solutions as needed by business or IT.
- Worked on analyzing the Hadoop cluster and different Big Data components including Pig, Hive, Spark, HBase, Kafka, Elasticsearch, databases, and Sqoop. Installed Hadoop, MapReduce, and Confidential, and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing (a Spark-based cleaning sketch follows this list).
- Imported and exported data into Confidential and Hive using Sqoop.
- Participate in designing the overall logical & physical Data warehouse/Data-mart data model and data architectures to support business requirements.
- Explored prebuilt ETL metadata, mappings, and Confidential metadata; developed and maintained SQL code as needed for the SQL Server database.
- Performed data manipulations using various Talend components such as tMap, tJavaRow, tJava, tOracleRow, tOracleInput, tOracleOutput, tMSSqlInput, and many more.
- Analyzed source data to assess data quality using Talend Data Quality.
- Troubleshoot data integration issues and bugs, analyze reasons for failure, implement optimal solutions, and revise procedures and documentation as needed.
- Worked on migration projects to move data from Oracle/DB2 data warehouses to Netezza.
- Used SQL queries and other data analysis methods, as well as the Talend Enterprise Data Quality Platform, for profiling and comparison of data used to decide how to measure business rules and data quality.
- Worked on the Talend RTX ETL tool; developed and scheduled jobs in Talend Integration Suite.
- Wrote Netezza SQL queries for joins and table modifications.
- Used Talend reusable components such as routines, context variables, and globalMap variables.
- Responsible for tuning ETL mappings, workflows, and the underlying data model to optimize load and query performance.
- Developed Talend ESB services and deployed them on ESB servers on different instances.
- Implementing fast and efficient data acquisition using Big Data processing techniques and tools.
- Monitored and supported the Talend jobs scheduled through Talend Administration Center (TAC).
- Developed Oracle PL/SQL DDLs and stored procedures, and worked on performance tuning of SQL.
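The cleaning and pre-processing described in this list was done in Pig and Hive; purely as an illustration of the same cleaning step in the Spark/Scala stack also named above, here is a minimal sketch over Hive tables. The staging.customer_raw and curated.customer table names and columns are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object CleanStagingSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("CleanStagingSketch").enableHiveSupport().getOrCreate()

    // Hypothetical staging table landed by Sqoop; names and columns are placeholders.
    val raw = spark.table("staging.customer_raw")

    val cleaned = raw
      .na.drop(Seq("customer_id"))                    // drop rows missing the business key
      .withColumn("email", lower(trim(col("email")))) // normalize casing and whitespace
      .dropDuplicates("customer_id")                  // keep one row per key

    // Write the pre-processed set to a curated Hive table for downstream jobs.
    cleaned.write.mode("overwrite").saveAsTable("curated.customer")
  }
}
```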
Environment: Talend 6.4.1/5.6, Netezza, Oracle 12c, IBM DB2, TOAD, Aginity, BusinessObjects 4.1, MLOAD, SQL Server 2012, XML, SQL, PL/SQL, Hive, Pig, HP ALM, JIRA, Amazon EC2, Apache Hadoop 1.0.1, MapReduce, Confidential, CentOS 6.4, HBase, Kafka, Scala, Elasticsearch, Oozie, Flume, Java (JDK 1.6), Eclipse, Sqoop, Ganglia.
Confidential, Atlanta, GA
Sr. Talend Big Data Developer
Responsibilities:
- Created Talend Mappings to populate the data into dimensions and fact tables.
- Broad design, development, and testing experience with the Talend Big Data Integration Suite, and knowledge of performance tuning of mappings.
- Development and deployment of staging and data warehouse scripts.
- Writing specifications for ETL processes.
- Developed optimal strategies for distributing web log data over the cluster; imported and exported the stored web log data into Confidential and Hive using Sqoop.
- Collected and aggregated large amounts of web log data from different sources such as web servers, mobile and network devices using Apache Flume, and stored the data into Confidential for analysis.
- Validated customer requirements and performed analysis to fit the Jasper Reports framework.
- Designed embedded Jasper report components for integration into the customer's application.
- Developed Jaspersoft reports and dashboard UI components, writing complex queries to support the interactive reporting logic.
- Implemented Change Data Capture (CDC) in Talend to load deltas into the data warehouse (see the delta-detection sketch after this list).
- Experienced in using Talend database, file, and processing components based on requirements.
- Responsible for development, support, and maintenance of ETL (Extract, Transform, Load) processes using Talend Integration Suite.
- Developed reports using various chart types.
- Coordinated with the offshore team, providing guidance and clarifications on reports and underlying queries.
- Performed validation checks and deployed reports to the customer's staging environment (Business Objects client).
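The CDC bullet above was implemented with Talend's CDC facilities; as a technology-neutral illustration of the underlying delta logic, here is a minimal Spark/Scala sketch. The table names and the precomputed row_hash change-detection column are assumptions, not the project's actual schema.

```scala
import org.apache.spark.sql.SparkSession

object DeltaDetectSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("DeltaDetectSketch").enableHiveSupport().getOrCreate()

    // Placeholder tables: today's source snapshot vs. what the warehouse already holds.
    val source = spark.table("staging.orders_snapshot")
    val target = spark.table("dw.orders")

    // Inserts: keys present in the source but absent from the target.
    val inserts = source.join(target.select("order_id"), Seq("order_id"), "left_anti")

    // Updates: same key, different change-hash (row_hash assumed precomputed upstream).
    val updates = source.as("s")
      .join(target.as("t"), "order_id")
      .where("s.row_hash <> t.row_hash")
      .select("s.*")

    // Downstream steps would apply these two sets to the warehouse table.
    inserts.createOrReplaceTempView("delta_inserts")
    updates.createOrReplaceTempView("delta_updates")
  }
}
```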
Environment: Talend Studio 6.0.1, XML files, flat files, Talend Administration Center, IMS, Agile methodology, Amazon EC2, Apache Hadoop 1.0.1, MapReduce, Confidential, CentOS 6.4, HBase, Kafka, Scala, Elasticsearch, Hive, Pig, Oozie, Flume, Java (JDK 1.6), Eclipse, Sqoop, Ganglia.
Confidential, Boston, MA
Sr. Talend ETL/MDM Developer
Responsibilities:
- Designed and developed a new ETL process to extract and load vendors from a legacy system into MDM using Talend jobs.
- Designed and developed the business rules and workflow system in Talend MDM 5.1.1.
- Developed Talend ETL jobs to push data into Talend MDM and jobs to extract data from MDM.
- Developed data validation rules in Talend MDM to confirm the golden record.
- Developed data matching/linking rules to standardize records in Talend MDM.
- Writing specifications for ETL processes.
- Installation and configuration of MySQL database servers and Amazon Redshift.
- Experience in Talend MDM and Big Data for functionality integration and creating data models, data containers, views, and workflows.
- Developed mappings to load fact and dimension tables, including SCD Type 1 and SCD Type 2 dimensions and incremental loading, and unit tested the mappings (see the SCD Type 2 sketch after this list).
- Rolled out documentation for the ETL process, early data inventory, and data profiling.
- Implemented data integration processes with Talend Integration Suite 3.2/4.2/5.1.2/5.2.2.
- Designed, developed, and deployed end-to-end data integration solutions.
- Used different Talend components such as tMap, tMSSqlInput, tMSSqlOutput, tFileInputDelimited, tFileOutputDelimited, tMSSqlOutputBulkExec, tUniqRow, tFlowToIterate, tIntervalMatch, tLogCatcher, tFlowMeterCatcher, tFileList, tAggregateRow, tSortRow, tMDMInput, tMDMOutput, and tFilterRow.
- System design and architecture, data warehouse design, and ingestion.
- Development and deployment of staging and data warehouse scripts.
- Led a team of five developers working on BI reports, plus one UI specialist.
- Coordinated with onsite and offshore teams located in India.
- Extensively worked on Data Mart Schema Design.
- Developed ETL mappings for XML, .csv, and .txt sources and loaded data from these sources into relational tables with Talend ETL.
- Worked with Healthcare EDI 834 transactions.
- Backup/restore of databases and writing complex queries.
- Experience in Teradata development and performance tuning.
- Analysis, design, and development using Amazon Redshift.
- Used efficient optimization techniques to design ETL Scripts.
- Loaded data into Infobright using Talend, FastLoad, MultiLoad, and shell scripts.
- Validated customer requirements and performed analysis to fit the Jasper Reports framework.
- Designed embedded Jasper report components for integration into the customer's application.
- Developed Jaspersoft reports and dashboard UI components, writing complex queries to support the interactive reporting logic.
- Developed reports using various chart types.
- Coordinated with the offshore team, providing guidance and clarifications on reports and underlying queries.
- Performed validation checks and deployed reports to the customer's staging environment.
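The SCD bullet above was built as Talend mappings; to make the Type 2 bookkeeping concrete, here is a minimal Spark/Scala sketch of the expire-and-open logic. The vendor tables, the vendor_id business key, and the row_hash change indicator are hypothetical assumptions.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object Scd2Sketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("Scd2Sketch").enableHiveSupport().getOrCreate()

    // Hypothetical tables: incoming vendor rows and the Type 2 dimension.
    val incoming = spark.table("staging.vendor_delta")
    val current  = spark.table("dw.vendor_dim").where(col("is_current"))

    // 1. Expire current versions whose attributes changed (row_hash assumed precomputed).
    val expired = current
      .join(incoming.select(col("vendor_id"), col("row_hash").as("new_hash")), "vendor_id")
      .where(col("row_hash") =!= col("new_hash"))
      .drop("new_hash")
      .withColumn("effective_to", current_date())
      .withColumn("is_current", lit(false))

    // 2. Open a new version for incoming rows that are new or changed
    //    (left_anti drops rows whose hash already matches a current version).
    val opened = incoming
      .join(current.select("vendor_id", "row_hash"), Seq("vendor_id", "row_hash"), "left_anti")
      .withColumn("effective_from", current_date())
      .withColumn("effective_to", lit(null).cast("date"))
      .withColumn("is_current", lit(true))

    // A full job would union these with the untouched history and rewrite dw.vendor_dim.
    expired.show(); opened.show()
  }
}
```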
Environment: Talend Integration Suite 5.4.1, Talend MDM, Microsoft SQL Server 2008, Oracle 9i, Windows XP, Flat Files.
Confidential, Charlotte, NC
Sr. ETL Developer
Responsibilities:
- Analyzed business requirements to design, develop, and implement highly efficient, highly scalable Informatica ETL processes.
- Worked closely with architects and data analysts to ensure the ETL solution meets business requirements.
- Interacted with key users and assisted them with various data issues, understood data needs and assisted them with data analysis.
- Involved in Documentation, including source-to-target mappings and business-driven transformation rules.
- Designed mappings that loaded data from flat-files to the staging tables.
- Involved in designing the end-to-end data flow in a mapping.
- Designed, developed, and implemented scalable processes for capturing incremental loads (see the watermark sketch after this list).
- Used a wide range of transformations such as Source Qualifier, Aggregator, Expression, Lookup, Router, Filter, Sequence Generator, Update Strategy, and Union transformations.
- Used FTP connection to store, stage and archive Flat Files.
- Developed structures to support the front end Business Objects reports.
- Extensively worked with Repository Manager, Designer, Workflow Manager and Workflow Monitor.
- Developed Informatica mappings, sessions and workflows.
- Developed and executed test plans to ensure that the ETL process fulfills data requirements.
- Worked with required Support Teams on High Critical Bridge for any Prod issues.
- Involved in tuning Informatica Mappings and Sessions as well as tuning at the database level.
- Participated in peer-to-peer code review meetings.
- Data warehouse design and integration; development and deployment of staging and data warehouse scripts.
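The incremental-capture work in this role was done in Informatica PowerCenter; the sketch below restates the watermark pattern in Spark/Scala purely for illustration. The etl.load_control control table, the job name, and the updated_at column are assumptions.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object IncrementalPullSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("IncrementalPullSketch").enableHiveSupport().getOrCreate()

    // Watermark = the newest source timestamp already loaded, read from a control table.
    val watermark = spark.table("etl.load_control")
      .where(col("job_name") === "orders_stage")
      .agg(max("last_loaded_at").as("wm"))
      .first()
      .getTimestamp(0)

    // Pull only rows the source changed since the previous run.
    val delta = spark.table("src.orders").where(col("updated_at") > lit(watermark))
    delta.write.mode("append").saveAsTable("staging.orders")

    // A real job would advance last_loaded_at in etl.load_control within the same run,
    // so a failure cannot skip or double-load a window.
  }
}
```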
Environment: Informatica PowerCenter 9.5.1, Talend Open Studio, Netezza, Teradata, UNIX, Big Data.
Confidential, Philadelphia, PA
ETL Developer
Responsibilities:
- Developed Talend ETL jobs to push data into Talend MDM and jobs to extract data from MDM.
- Designed and developed a new ETL process to extract and load vendors from a legacy system into MDM using Talend jobs.
- Designed and developed the business rules and workflow system in Talend MDM 5.1.1.
- Developed data validation rules in Talend MDM to confirm the golden record.
- Developed data matching/linking rules to standardize records in Talend MDM.
- Writing specifications for ETL processes.
- Installation and configuration of MySQL database servers and Amazon Redshift.
- Experience in Talend MDM for functionality integration and creating data models, data containers, views, and workflows.
- Developed mappings to load fact and dimension tables, including SCD Type 1 and SCD Type 2 dimensions and incremental loading, and unit tested the mappings.
- Rolled out documentation for the ETL Process, early Data Inventory, and Data Profiling
- Implemented data integration processes with Talend Integration Suite 3.2/4.2/5.1.2/5.2.2.
- Designed, developed, and deployed end-to-end data integration solutions.
- Used different Talend components such as tMap, tMSSqlInput, tMSSqlOutput, tFileInputDelimited, tFileOutputDelimited, tMSSqlOutputBulkExec, tUniqRow, tFlowToIterate, tIntervalMatch, tLogCatcher, tFlowMeterCatcher, tFileList, tAggregateRow, tSortRow, tMDMInput, tMDMOutput, and tFilterRow.
- System design and architecture, data warehouse design, and ingestion.
- Development and deployment of staging and data warehouse scripts.
- Validated customer requirements and performed analysis to fit the Jasper Reports framework.
- Designed embedded Jasper report components for integration into the customer's application.
- Performed validation checks and deployed reports to the customer's staging environment.
- Good experience with T-SQL queries.
- Loaded data into Infobright using Talend, FastLoad, and MultiLoad.
- Experience in Teradata development and performance tuning.
- Worked on star schema and snowflake schema modeling.
- Worked closely with the source team and users to validate the accuracy of the mapped attributes.
- Monitored system performance.
- Wrote SQL code and performed code reviews to evaluate data quality (a profiling sketch follows this list).
- Good experience in jQuery and Java development with iWay.
- Unit tested the developed ETL scripts, created test SQL, and handled UAT issues.
- Backup/restore of databases.
- Good working experience as an MDM developer; rolled out documentation for the ETL process, early data inventory, and data profiling.
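The data-quality bullet above covers SQL-based profiling; here is a minimal Spark/Scala sketch of the same idea (null and distinct counts per column). The staging.member table is a placeholder, not the project's actual source.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object ProfileSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("ProfileSketch").enableHiveSupport().getOrCreate()

    val df    = spark.table("staging.member") // placeholder table
    val total = df.count()

    // Null counts and distinct counts per column: the basic shape of a profiling pass.
    df.columns.foreach { c =>
      val nulls    = df.where(col(c).isNull).count()
      val distinct = df.select(c).distinct().count()
      println(f"$c%-24s nulls=$nulls%8d (${100.0 * nulls / total.max(1)}%5.1f%%) distinct=$distinct%8d")
    }
  }
}
```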
Environment: Talend MDM, Talend Integration Suite 4.1, Informatica PowerCenter 8.6.1, Oracle 11g, Teradata, UNIX shell, PL/SQL, SQL*Plus, SQL*Loader, TOAD, Windows, UNIX, XML, flat files.
Confidential, Dublin, OH
ETL Engineer
Responsibilities:
- Implemented data integration processes with Talend Integration Suite 3.2/4.2/5.1.2/5.2.2.
- Designed, developed, and deployed end-to-end data integration solutions.
- Used different Talend components such as tMap, tMSSqlInput, tMSSqlOutput, tFileInputDelimited, tFileOutputDelimited, tMSSqlOutputBulkExec, tUniqRow, tFlowToIterate, tIntervalMatch, tLogCatcher, tFlowMeterCatcher, tFileList, tAggregateRow, tSortRow, tMDMInput, tMDMOutput, and tFilterRow.
- System design and architecture, data warehouse design, and ingestion.
- Development and deployment of staging and data warehouse scripts.
- Validated customer requirements and performed analysis to fit the Jasper Reports framework.
- Designed embedded Jasper report components for integration into the customer's application.
- Performed validation checks and deployed reports to the customer's staging environment.
- Created Talend and Informatica mappings for initial loads and daily updates; involved in migrating ETL jobs from Informatica to Talend.
- Developed several mappings (Source Qualifier, Lookup, Filter, Joiner, Aggregator, Sequence Generator, Expression, Router, Normalizer, and Update Strategy) to load data from multiple sources into the data warehouse.
Environment: Talend Integration Suite 4.0, Informatica 8.6.1, Oracle 11g, SQL Developer Client, SQL Server 2008/2005, HP Quality Center
Confidential, Georgia
Data Warehouse Developer
Responsibilities:
- System design and architecture, data warehouse design, and ingestion.
- Development and deployment of staging and data warehouse scripts.
- Writing specifications for ETL processes
- Validated customer requirements and performed analysis to fit the Jasper Reports framework.
- Designed embedded Jasper report components for integration into the customer's application.
- Developed Jaspersoft reports and dashboard UI components, writing complex queries to support the interactive reporting logic.
- Developed reports using various chart types.
- Coordinated with the offshore team, providing guidance and clarifications on reports and underlying queries.
- Performed validation checks and deployed reports to the customer's staging environment (Business Objects client).
Environment: Informatica PowerCenter 8.6.1, Oracle 10g, PL/SQL, TOAD, XML files, UNIX, F-Secure, SVN, SSH.