Talend Big Data Developer Resume
SUMMARY
- 10 years of IT experience with client-server architecture and Business Intelligence applications across all phases of IT projects (analysis, design, coding, testing, and deployment) using Talend, SSIS, and DataStage in OLAP and OLTP environments for banking, financial, insurance, and healthcare clients.
- 5+ years of experience using Talend Integration Suite (5.0/5.1.1/5.6/6.0/6.2/6.4), Talend Open Studio for Big Data 6.2, and Talend Big Data Enterprise Studio 7.1/7.0/6.4/6.2.
- Extracted data from multiple operational sources, including flat files, RDBMS tables, and legacy systems, to load the staging area, data warehouse, and data marts using SCD (Type 1/Type 2/Type 3) loads.
- Experienced in creating complex mappings, applying transformations, and loading data into various targets such as Snowflake, Cassandra, MongoDB, SQL Server, Oracle, DB2, and flat files.
- Extensively used ETL methodology for data extraction, transformation, and loading with Talend, and designed data conversions from a wide variety of source systems.
- Expertise in interacting with end users and functional analysts to identify and develop Business Requirement Documents (BRD) and Functional Specification Documents (FSD).
- Highly proficient in Agile, Test-Driven, Iterative, Scrum, and Waterfall software development life cycles.
- Expertise in creating and executing user stories.
- In-depth understanding of gap analysis (As-Is vs. To-Be business processes) and experience converting such requirements into technical specifications and preparing test plans.
- Expertise using Talend components tHL7Input, tFixedFlowInput, tContextLoad, tSleep, tFileTouch, tFileCopy, tFileInputFullRow, tFileExist, tFileList, tDenormalize, tWriteJSONField, tMongoDBInput, tMongoDBOutput, tPrejob, tPostjob, tReplicate, tParallelize, tSendMail, tDie, tUnique, tFlowToIterate, tIterateToFlow, tFileInputJSON, tWriteXMLField, tFileInputDelimited, tFileOutputDelimited, and tLibraryLoad.
- Utilized tStatsCatcher, tDie, tLogRow, and tWarn to build a generic joblet that stores processing statistics.
- Experienced in creating generic schemas and in creating context groups and variables to run jobs against different environments such as DEV, UAT, and PROD.
- Experienced in ingesting data from multiple source systems into the Murky (raw) Lake in different formats (delimited, positional, and flat files).
- Experienced in applying source CDC techniques in the staging area to identify only the changed records before the ETL phases, applying business rules during ETL processing, and running a target CDC process before loading into the Clean Data Lake.
- Experienced in importing data from any kind of RDBMS source using Sqoop commands in Talend.
- Experienced in designing Talend ingestion so that each entity imports independently and a single failure does not break the main ingestion flow.
- Experience in configuring Spark batch jobs (number of executors, executor cores, and driver and executor memory) using Talend Big Data Spark; see the Spark configuration sketch after this summary.
- Expertise in identifying HL7 2.5.1 segments and validating message structure with the HAPI validator from within Talend; see the validation sketch after this summary.
- Experienced in creating MFT (Managed File Transfer) mock jobs for HL7 2.5.1 LAB and ADT transactions in Talend Big Data 6.2.
- Experienced in converting pipe-delimited data (e.g., HL7 v2.5.1) into JSON structures using Talend components.
- Thorough knowledge of addressing performance issues, including query tuning, index tuning, data profiling, and other database-related activities.
- Created subjobs that run in parallel to maximize performance and reduce overall job execution time, using the tParallelize component in TIS and multithreaded executions in TOS.
- Created execution tasks in Talend Administration Center for jobs saved in SVN or exported from Studio as pre-generated ZIP files.
- Experienced in versioning, importing, and exporting Talend jobs; set up triggers for Talend jobs in Job Conductor.
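The Spark batch bullet above refers to the resource settings a Talend Big Data batch job exposes. A minimal sketch of the equivalent settings in Java using Spark's SparkConf API; the application name and values are illustrative assumptions, not taken from any project:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class SparkBatchConfigSketch {
        public static void main(String[] args) {
            // Resource settings comparable to the Spark configuration of a Talend batch job;
            // the master is supplied by spark-submit / the cluster.
            SparkConf conf = new SparkConf()
                    .setAppName("talend-style-batch")        // hypothetical app name
                    .set("spark.executor.instances", "4")    // number of executors
                    .set("spark.executor.cores", "2")        // cores per executor
                    .set("spark.executor.memory", "4g")      // memory per executor
                    .set("spark.driver.memory", "2g");       // driver memory
            try (JavaSparkContext sc = new JavaSparkContext(conf)) {
                // job logic would run here
            }
        }
    }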
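The HL7 validation bullets above describe parsing HL7 2.5.1 messages with the HAPI library from Talend (typically inside a tJavaFlex). A minimal sketch, assuming HAPI's ca.uhn.hl7v2 API; class and variable names are illustrative:

    import ca.uhn.hl7v2.DefaultHapiContext;
    import ca.uhn.hl7v2.HL7Exception;
    import ca.uhn.hl7v2.model.Message;
    import ca.uhn.hl7v2.parser.Parser;

    public class Hl7ValidationSketch {
        public static String ackFor(String rawHl7) throws Exception {
            DefaultHapiContext ctx = new DefaultHapiContext();
            Parser parser = ctx.getPipeParser();  // pipe parser applies HAPI's default validation
            try {
                Message msg = parser.parse(rawHl7);        // throws HL7Exception on structural errors
                return parser.encode(msg.generateACK());   // positive ACK for a valid message
            } catch (HL7Exception e) {
                // a real job would build a NAK here and write the audit record before rethrowing
                throw e;
            }
        }
    }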
TECHNICAL SKILLS
ETL/BI Tools: Talend Big Data Enterprise Studio 7.0/6.4/6.2, Talend Open Studio for Big Data 6.2, AnalytixDS Mapping Manager, SSIS, DTS (Data Transformation Services).
Databases: Oracle 11g/10g/9i, MS SQL Server, DB2, Cassandra, MongoDB, Snowflake.
Programming Languages: SQL, T-SQL, PL/SQL, CSS, HTML, XML, JavaScript.
Development Tools: PuTTY, FileZilla, SQL Developer, Toad, DBeaver 6.0, Teradata SQL Assistant, SQL Server Management Studio.
Operating Platforms: MS Windows 10/2000/XP/NT, Mac OS, Unix, Linux.
Big Data/Other Utilities: Hadoop, HDFS, on-prem Hue browser, AWS Hue browser, Amazon S3 buckets, Talend Admin Console (TAC), GitHub, Jenkins, Autosys scheduler, Postman.
PROFESSIONAL EXPERIENCE
Confidential
Talend Big Data Developer
Responsibilities:
- Participated in CM daily huddles to update the Scrum Master on story progress and to flag any blockers and dependencies.
- Developed an ETL job that extracts all the data from a deeply nested XML structure and loads it into tables using Talend Real-Time Big Data 6.4.
- Developed ETL mappings for various sources (.TXT, .CSV, .TSV, XML, JSON, and positional files) and loaded the data from these sources into relational tables.
- Designed ETL jobs that enter with tRouteInput and exit with tRouteOutput, using tLogCatcher to catch exceptions/errors and tRouteFault to pass the error/exception message.
- Extensively involved in the Presales and Communication Manager projects, building ETL jobs with Talend Real-Time Big Data web service components.
- Developed one ETL exposed as a web service to populate the ASH database with AIS data, and another to populate the pending-business feed from ALIP (the admin system).
- Experienced in bringing operational source data into a data lake and using internal systems to standardize and store the data in the Clean Lake so it can be provisioned to multiple consumers.
- Implemented a CDC framework using Talend Big Data standard jobs to provision data to Cassandra and Snowflake databases; see the CDC sketch at the end of this section.
- Provisioned data from Cassandra to Snowflake using Talend Big Data standard jobs.
- Expertise in reading and writing data from multiple source systems such as Oracle, Microsoft SQL Server, MySQL, Access, and DB2, and from source files such as delimited, Excel, positional, and CSV files.
- Used 20+ Talend Big Data components, such as tHDFSConnection, tHBaseConnection, tHiveConnection, tHDFSInput, tHDFSOutput, tHDFSList, tHDFSPut, tHDFSProperties, tHDFSExist, tHDFSDelete, tHBaseInput, tHBaseOutput, tHDFSCopy, tHDFSGet, and tHiveCreateTable.
- Used the debugger and breakpoints to view transformation output and debug mappings.
- Created and tracked defects using the ALM client to manage deliverable releases to QA and production checkout.
- Created a Sqoop ingestion framework to move data from RDBMS sources into HDFS Murky Lake folders.
- Extensively involved in creating layouts on on-premise HDFS Murky Lake infrastructure and in provisioning Clean Lake data to Cassandra and Snowflake using the CDC process.
- Wrote PySpark and Scala SQL queries to read Parquet- and Avro-format data from the AWS S3 Hue browser; see the Spark read sketch at the end of this section.
- Developed Spark Big Data batch jobs to create the layouts in HDFS folders and provision the data to NoSQL databases.
- As part of the internal ETL framework, developed ETL transformation maps using the AnalytixDS Mapping Manager tool.
- Worked extensively in TAC (Talend Admin Console), scheduling jobs in Job Conductor and setting execution plans so dependent jobs run in parallel or on "SubJob OK".
- Migrated code and release documents from DEV to QA (UAT) and on to production.
- Updated assigned issues on the Kanban board in the agile project-management tool to visualize and track effort, moving work in progress through the phases (analysis, ready for dev, development, ready for QA, and prod release) to maximize efficiency and flow.
- Actively participated in story walkthroughs for the existing pod, identifying any KT needs before the start of the next sprint and preparing to move on to the Presales planning sprint.
- Actively participated in monthly prod release deployments and supported production issues.
Environment: Talend Real Time Big Data 6.2/6.4/7.0/7.1, SQL Server Management Studio 2008, flat files, XML, JSON, GitHub source version control, JIRA, ALM defect tracking, Windows 7, Oracle 11g/10g/9i, Cassandra DB, MySQL, DBeaver 6.0, Postman, PCF, PuTTY, WinSCP, Snowflake DB, HBase, HDFS, Hive, on-premise Hue file browser, AWS S3 Hue browser, Hadoop cluster applications, AnalytixDS Mapping Manager.
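The CDC bullets in this role determine which records changed before data moves between lake zones. One common implementation, shown here only as a sketch (the actual framework may differ), hashes each record's business columns and compares the result with the previously stored hash:

    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;

    public class CdcHashSketch {
        // Hash of the concatenated business columns of one record
        static String rowHash(String... columns) throws Exception {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(String.join("|", columns).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) hex.append(String.format("%02x", b));
            return hex.toString();
        }

        // A record is new or changed when its current hash differs from the stored one
        static boolean isChanged(String currentHash, String storedHash) {
            return storedHash == null || !storedHash.equals(currentHash);
        }
    }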
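The PySpark/Scala bullet in this role describes reading Parquet and Avro data from S3. A minimal Java equivalent, assuming Spark SQL 2.4+ with the spark-avro module on the classpath; the bucket and paths are placeholders:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class S3ReadSketch {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder().appName("s3-read-sketch").getOrCreate();

            // Parquet layout in S3 (credentials/endpoint assumed configured on the cluster)
            Dataset<Row> parquet = spark.read().parquet("s3a://example-bucket/clean-lake/entity/");
            parquet.createOrReplaceTempView("entity");
            spark.sql("SELECT COUNT(*) AS cnt FROM entity").show();

            // Avro layout; requires the external spark-avro package
            Dataset<Row> avro = spark.read().format("avro").load("s3a://example-bucket/murky-lake/entity/");
            avro.printSchema();

            spark.stop();
        }
    }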
Confidential
Talend Developer
Responsibilities:
- Attended daily Scrum meetings to update the Scrum Master on daily activities and to flag any blockers and dependencies.
- Interacted with Solution Architects and Business Analysts to gather requirements and update the Solution Architect Document.
- Created MFT (Managed File Transfer) mock jobs for HL7 2.5.1 LAB and ADT transactions in Talend Big Data 6.2.
- Involved in creating the mapping document and in updating the BRD according to client requirements whenever changes arose.
- Extensively worked on jobs that read a property file and iterate over the files in the directory specified in the config file.
- Created HL7 batch files with the appropriate FHS, BHS, BTS, and FTS segments; see the batch-file skeleton at the end of this section.
- Sized each HL7 batch file to the number of messages specified by BatchFileChunkSize in the config file.
- Decrypted files from the secure file storage area (SFSA), started the Talend ORU and ADT jobs, and archived the input files in the archive directory.
- Validated HL7 2.5.1 messages with the HAPI validator in tJavaFlex, generating ACK/NAK responses for the input source files and producing audit information.
- Created Talend jobs that test the kickoff setup and generate files representative of those the NFT (Network File Transfer) system needs to process.
- Implemented standard properties files (e.g., java.util.Properties): when the job starts, a parameter supplies the path and name of the config file, which provides the data needed to build the output file during the run; see the config sketch at the end of this section.
- Worked with HL7 2.5.1 batch files and real-time HL7 files.
- Imported and exported Talend jobs from GitHub source version control.
- Extensively worked on complex mappings for HL7 2.5.1 LAB data and ADT transactions such as A01, A03, and A04.
- Involved in performance tuning of the existing Talend jobs in Confidential's aerial hub.
- In-depth knowledge of identifying HL7 2.5.1 segments for ADT and LAB data.
- Extensively worked with tHL7Input, tFixedFlowInput, tContextLoad, tSleep, tFileTouch, tFileCopy, tFileInputFullRow, tFileExist, tFileList, tDenormalize, tWriteJSONField, tMongoDBInput, tMongoDBOutput, tPrejob, tPostjob, tReplicate, tParallelize, tSendMail, tDie, tUnique, tFlowToIterate, tIterateToFlow, tFileInputJSON, tWriteXMLField, tFileInputDelimited, tFileOutputDelimited, and tLibraryLoad.
- Worked on generic schemas and created context groups and variables to run jobs against different environments such as Dev, Test, and Prod.
- Worked on setting up Talend Data Integration and Talend Platform on Windows.
- Created complex mappings in Talend 6.2 using tMap, tJava, tJavaRow, tJavaFlex, tAggregateRow, tDie, tWarn, tLogCatcher, tStatsCatcher, etc.
- Created complex mappings to pull data from the source, apply transformations, and load the data into the MongoDB database.
Environment: Talend Open Studio for Big Data 6.1/6.2, Talend Big Data Platform 6.1, MongoDB 3.2, HL7 v2.5.1, flat files, JSON, GitHub source version control, HAPI, Windows 7, and Jenkins.
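For reference, a minimal HL7 v2 batch file skeleton showing the FHS/BHS/BTS/FTS segments named above (field contents elided; real files carry full header fields):

    FHS|^~\&|...              file header segment, one per file
    BHS|^~\&|...              batch header segment, one per batch
    MSH|^~\&|...              first message
    ...                       up to BatchFileChunkSize messages
    MSH|^~\&|...              last message
    BTS|<message count>       batch trailer segment
    FTS|<batch count>         file trailer segment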
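The properties-file bullets above describe a config-driven kickoff. A minimal sketch in Java, assuming java.util.Properties as stated; BatchFileChunkSize comes from the description above, while the other key and the directory layout are hypothetical:

    import java.io.FileInputStream;
    import java.util.Properties;

    public class ConfigDrivenJobSketch {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            // The path and name of the config file are passed as a parameter at job start
            try (FileInputStream in = new FileInputStream(args[0])) {
                props.load(in);
            }
            String inputDir = props.getProperty("input.dir");                          // hypothetical key
            int chunkSize = Integer.parseInt(props.getProperty("BatchFileChunkSize")); // messages per batch file
            System.out.printf("Iterating over %s, batching %d messages per file%n", inputDir, chunkSize);
        }
    }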
Confidential
Talend Developer
Responsibilities:
- Interacted with Solution Architects and Business Analysts to gather requirements and update the Solution Architect Document.
- Analyzed requirements and created the low-level design document (LLD) and the mapping document.
- Performed analysis, design, development, testing, and deployment for ingestion, integration, and provisioning using Agile methodology.
- Attended daily Scrum meetings to update the Scrum Master on the progress of user stories in Rally and to flag any blockers and dependencies.
- Created generic schemas as well as context groups and variables to run jobs against different environments such as Dev, Test, and Prod.
- Created Talend mappings to populate the data into dimension and fact tables.
- Broad design, development, and testing experience with Talend Integration Suite, with knowledge of performance tuning of mappings.
- Experienced in setting up Talend Data Integration and Talend Platform on Windows and UNIX systems.
- Created complex mappings in Talend 6.0.1/5.5 using tMap, tJoin, tReplicate, tParallelize, tJava, tJavaRow, tJavaFlex, tAggregateRow, tDie, tWarn, tLogCatcher, etc.
- Created joblets in Talend for processes reused across most jobs in a project, such as the start-job and commit-job steps.
- Developed jobs to move inbound files to the vendor server location on monthly, weekly, and daily frequencies.
- Implemented Change Data Capture in Talend to load deltas into a data warehouse.
- Created jobs to perform record-count validation and schema validation; see the validation sketch at the end of this section.
- Created contexts so values could be used throughout the process and passed from parent to child jobs and from child to parent jobs.
- Developed joblets that are reused in different processes in the flow.
- Developed an error-logging module that captures both system errors and logical errors, sends email notifications, and moves files to error directories.
- Provided production support by running jobs and fixing bugs.
- Used Talend database, file, and processing components as required.
- Responsible for developing, supporting, and maintaining ETL (Extract, Transform, and Load) processes using Talend Integration Suite.
- Performed unit testing and integration testing after development and had the code reviewed.
- Involved in migrating objects from DEV to QA and then promoting to Production.
Environment: Talend Studio 6.0.1/5.5, Oracle 11g, XML files, flat files, HL7 files, JSON, TWS, Hadoop 2.4.1, HDFS, Hive 0.13, HBase 0.94.21, Talend Administration Center, IMS, Agile methodology, HPSM
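The record-count validation bullet in this role compares source and target row counts after a load. A minimal sketch, assuming plain JDBC; the URLs, credentials, and table names are placeholders, not the project's actual configuration:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class RecordCountValidationSketch {
        // Row count of one table over an open connection
        static long count(Connection conn, String table) throws Exception {
            try (Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery("SELECT COUNT(*) FROM " + table)) {
                rs.next();
                return rs.getLong(1);
            }
        }

        public static void main(String[] args) throws Exception {
            try (Connection src = DriverManager.getConnection("jdbc:oracle:thin:@//src-host:1521/SRCDB", "user", "pw");
                 Connection tgt = DriverManager.getConnection("jdbc:oracle:thin:@//tgt-host:1521/TGTDB", "user", "pw")) {
                long srcCount = count(src, "SRC_TABLE");
                long tgtCount = count(tgt, "TGT_TABLE");
                if (srcCount != tgtCount) {
                    throw new IllegalStateException("Record count mismatch: " + srcCount + " vs " + tgtCount);
                }
            }
        }
    }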