Tech Architect (ETL Datastage Architect/Hadoop Developer) Resume
Charlotte
SUMMARY:
- Good experience in the testing domain, handling both manual and automated testing projects on mainframe, middleware and UNIX platforms, with specialized knowledge of automated testing tools such as Quick Test Professional (QTP) and Quality Center (QC).
- Good exposure to IBM MQ Series in establishing batch/real-time MQ interfaces across operational platforms such as Z/OS Mainframes, Linux, IIS Datastage and middleware applications.
- Good knowledge of data warehousing concepts - Star/Snowflake schemas & data modeling.
- Part-time developer, on a need basis, in supporting technologies while building solutions/frameworks in the real-time data integration area.
- Theoretical knowledge of Informatica PowerCenter tools.
- Good understanding of the execution of Waterfall and Agile methodology projects.
- Effective in cross-functional and global environments, managing multiple tasks and assignments concurrently.
- Proficient in managing large teams, covering resource allocation, technical mentoring, performance appraisals and other regular team activities.
- Interested in all aspects of the software development life-cycle, from design through development, deployment, Production Support and Data Quality & Data Management.
- Proficient and diverse experience in IT consulting, with 12 years of practice in Business Intelligence - ETL, Hadoop and Z/OS Mainframe technologies establishing critical OLTP/OLAP applications. Demonstrated success in consulting, requirement analysis, data modeling, designing, developing, testing, integrating and implementing novel business solutions aligned with critical requirements, under stringent deadlines, for leading business corporations based out of the US and Canada.
- Specialized in implementing trending technologies with a focus on re-engineering legacy systems to enrich the operating model and yield better efficiency, optimization and cost savings.
- 12 years of working experience, with major contributions in the banking & finance domain.
- Associated with Confidential for the past 1.8 years, with Confidential for over 9 years, and with Confidential for 1 year across various development assignments.
- 6+ years of Business Intelligence solutions (IBM InfoSphere Information Server - Datastage), around 3 years of Hadoop ecosystem and 3+ years of Z/OS Mainframes expertise in application design/development.
- 1 Year of Quality Assurance Experience with specialized knowledge in Automated Testing tools.
- A track record of executing and implementing various platform/technology migration and bank acquisition projects in both ETL and Z/OS Mainframes technologies.
- Strong Data Modelling/SQL skills with RDBMS databases such as DB2, HP Vertica, Teradata and Oracle for OLTP and OLAP systems.
- Extensive work experience on the IBM IIS tools as listed below.
- Infosphere Business Glossary
- Infosphere FastTrack
- Infosphere Metadata workbench
- Infosphere Datastage 8.5 / 9.1 / 11.5
- Infosphere Quality Stage
- Infosphere Services Director
- Infosphere Information Analyzer
- Working experience on the Hadoop ecosystem - HDFS, Hive, Apache Spark (Scala/Python), Sqoop, Oozie and Apache Kafka.
- Have acquired specialized skills in real-time data integration in IIS Datastage, such as web services API integration, MQ Series and Java integration.
- Well versed with version control and build tools such as ClearCase, Subversion (SVN), Bitbucket, Bamboo and Artifactory.
- Extensively worked in Mainframe technology for OLTP applications handling customer, account and services related data, with in-depth hands-on expertise in tools and technologies such as COBOL, CICS, JCL, DB2 and Eazytrieve.
TECHNICAL SKILLS:
Operating Systems: Unix, Linux, Win95/NT/2000/XP, IBM Mainframe Z/OS
Languages/Scripting: Unix Shell Scripting, SQL, PL/SQL, COBOL, REXX, CICS, JCL, Eazytrieve, Core Java, Groovy
Databases: IBM DB2, Oracle 10g, HP Vertica, Teradata 14, Netezza
NoSQL DB concepts: Cassandra
Big Data: Hadoop Ecosystem (HDFS, Hive, Apache Spark (Scala/Python), Sqoop, Oozie and Apache Kafka); conceptual knowledge of Amazon EMR (AWS)
ETL/other Supporting Tools: IIS DataStage 8.5/9.1/11.5, IIS Information Services Director, IIS Information Analyzer, IIS FastTrack, IIS Business Glossary, IIS Metadata Workbench, Tectia Terminal, WinSCP, Toad for Oracle/DB2, SQL Developer for Oracle, ClearCase version control, Subversion version control, Bitbucket, Bamboo, Artifactory
Mainframe Supporting Tools: FM, FMDB2, FILD, Fault Analyser, FILA, Expeditor, Debug tool, BMS Maps, QMF, Changeman, Endevor
Automated Testing Tools: Quick Test Professional (QTP), Quality Center (QC)
Scheduling Tools: Autosys 4.5/11, CA7, ESP
Other Technologies: IBM MQ Series
Messaging Formats: XML, JSON, OAG
PROFESSIONAL EXPERIENCE:
Confidential
Tech Architect (ETL Datastage Architect/Hadoop Developer)
Technologies: ETL - IIS tools (Datastage), Unix, Core Java, MDM, DB2, Netezza, Hadoop - HDFS, Hive, Apache Spark (Scala, Python), Sqoop, Oozie and Apache Kafka
Responsibilities:
- Perform the Tech Architect role, providing recommendations on BI solutions, identifying opportunities, driving ETL migration programmes and providing architectural direction to other regular development projects.
- Execute server/platform migration projects to upgrade technologies from older versions to the latest version, yielding benefits in new capabilities, performance, efficiency, scalability, resiliency and maintainability.
- Build controls to improve the application in areas such as data governance, data quality and system architecture.
- Re-architect the application to include high-availability capabilities with a minimal/zero outage window during maintenance.
- Tune the application configuration to handle the growing online traffic and data storage.
- Co-ordinate building server stacks to install MDM, IIS and DB2 applications.
- Migrate the application code components seamlessly into the new platform and test the functionality of the application to ensure due-diligence in the version upgrade.
- Recommend solutions for production parallel and build strategies for cutover from old to new platform.
- Migrated some of the Netezza batch analytical processes into Hadoop data lake solutions to address growing data volume concerns. Also performed near-real-time sync-up by leveraging Kafka consumer APIs.
- Develop batch processes using Sqoop utilities to pull data from various RDBMS sources across the enterprise, or by receiving full files, and load it into the Hadoop data lake. Perform transformations by creating functions/UDFs in Apache Spark (Scala).
- Develop outbound processes to provision data to consumer applications leveraging Apache Spark SQL (see the Spark sketch after this list).
- Leverage code migration and version control tools such as Bitbucket, Bamboo, Artifactory.
- Complete team co-ordination involving task allocation among team members, client interaction and the onshore-offshore working model.
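Illustrative sketch for the Sqoop-landed transformation and Spark SQL provisioning bullets above: a minimal Spark (Scala) job, not the actual client code. The application, table, column and UDF names (lake-transform, lake_stg.customer_txns, status_cd, lake_out.customer_txn_extract, normalizeStatus) are hypothetical placeholders.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf

object LakeTransformJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("lake-transform")        // hypothetical application name
      .enableHiveSupport()
      .getOrCreate()
    import spark.implicits._

    // Hypothetical UDF: standardise a free-form status code coming from the landed extract.
    val normalizeStatus = udf((s: String) => if (s == null) "UNKNOWN" else s.trim.toUpperCase)

    // Source table assumed to have been landed in the data lake by the nightly Sqoop import.
    spark.table("lake_stg.customer_txns")
      .withColumn("status_cd", normalizeStatus($"status_cd"))
      .createOrReplaceTempView("customer_txns_clean")

    // Outbound provisioning via Spark SQL: a simple per-customer extract for a consumer application.
    val extract = spark.sql(
      """SELECT customer_id,
        |       COUNT(*)        AS txn_count,
        |       SUM(txn_amount) AS total_amount
        |FROM customer_txns_clean
        |GROUP BY customer_id""".stripMargin)

    extract.write.mode("overwrite").saveAsTable("lake_out.customer_txn_extract")

    spark.stop()
  }
}
```

In practice such a job would be submitted via spark-submit and sequenced after the nightly Sqoop imports by the workflow scheduler.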
Confidential, Charlotte
Technical Lead (ETL - Datastage Lead / Hadoop Developer)
Technologies: ETL - IIS tools (Datastage), Unix, Core Java, Groovy, HP Vertica, Teradata, Hadoop - HDFS, Hive, Apache Spark (Scala, Python), Kafka, Sqoop and Oozie
Responsibilities:
- Performing the ETL Tech Architect role, providing recommendations on BI solutions, identifying opportunities, driving ETL migration programmes and providing architectural direction to other regular development projects.
- Have been leading the CRH ADS (Authorized Data Source) provisioning effort from the front, to migrate the provisioning functionality from the legacy system to the ETL Datastage platform. As part of this effort, more than 250 batch outbound flows were migrated, impacting ~40 downstream applications.
- Re-architecting and re-engineering the legacy Unix/Vertica-based batch analytical flows onto the IIS Datastage platform to improve the modularity, maintainability, cost, stability and scalability of the CRH system has been the key focus of my engagement in this project.
- Data modeling in HP Vertica DB and Teradata to satisfy OLAP needs.
- Worked on building an analytics process using Hadoop ecosystem (HDFS, Hive, Java Spark, Sqoop, Oozie and Kafka) to rewrite core analytical batch processing.
- Develop Apache Spark applications as a computational framework to derive key business elements from terabytes of customer-centric transactions/events (see the derivation sketch after this list).
- Establish an Apache Kafka interface to receive real-time messages from producers. Develop an application to extract, parse, validate, transform and load the data into the target Hive table, and publish the derived data elements to consumer applications via Kafka topics (see the Kafka sketch after this list).
- Utilize Sqoop utilities to pull and push data between RDBMS and the HDFS/Hive platform.
- Create Oozie workflows to sequence the daily/weekly/monthly jobs.
- Worked on various solutions, proposals and proofs of concept in IIS Datastage and Java/Groovy, with the focus of simplifying, re-engineering, accelerating, improving and transforming the legacy systems into a more flexible and opportunistic environment.
- This analytics system (CRH) handles batch flows of both an operational and analytical nature, involving enormous amounts of customer profile, account, relationship and indicator information, by storing, maintaining history, enriching, deriving and publishing data via CORE and VAP (Value Added Processes) modules. Gathering business needs and providing tech solutions has been one of my major responsibilities in this application.
- Complete team co-ordination involving task allocation among team members, client interaction and the onshore-offshore working model.
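Illustrative sketch for the Spark derivation bullet above: a minimal Spark (Scala) batch job that derives simple per-customer business elements from a large event table. This is not the actual CRH code; the table and column names (crh_stg.customer_events, event_dt, crh_out.customer_derived_elements) and the derived elements themselves are hypothetical placeholders.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object DeriveCustomerElements {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("derive-customer-elements")   // hypothetical application name
      .enableHiveSupport()
      .getOrCreate()

    // Derive simple business elements (last activity date, 90-day activity flag) per customer
    // from a large event table; partitioning and tuning details are omitted in this sketch.
    val derived = spark.table("crh_stg.customer_events")   // hypothetical source table
      .groupBy("customer_id")
      .agg(
        max("event_dt").as("last_activity_dt"),
        sum(when(col("event_dt") >= date_sub(current_date(), 90), 1).otherwise(0)).as("events_90d"))
      .withColumn("active_90d_ind", when(col("events_90d") > 0, "Y").otherwise("N"))

    derived.write.mode("overwrite").saveAsTable("crh_out.customer_derived_elements")  // hypothetical target

    spark.stop()
  }
}
```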
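Illustrative sketch for the Kafka bullet above: a minimal Scala consume/validate/republish loop using the standard Kafka consumer and producer APIs. The broker address, group id and topic names are hypothetical, and the Hive load step is only indicated in a comment.

```scala
import java.time.Duration
import java.util.{Collections, Properties}
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import scala.collection.JavaConverters._

object EventBridge {
  def main(args: Array[String]): Unit = {
    val consumerProps = new Properties()
    consumerProps.put("bootstrap.servers", "broker1:9092")            // hypothetical broker
    consumerProps.put("group.id", "customer-events-consumer")         // hypothetical group id
    consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
    consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
    consumerProps.put("enable.auto.commit", "false")

    val producerProps = new Properties()
    producerProps.put("bootstrap.servers", "broker1:9092")
    producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

    val consumer = new KafkaConsumer[String, String](consumerProps)
    val producer = new KafkaProducer[String, String](producerProps)
    consumer.subscribe(Collections.singletonList("customer.events.in"))  // hypothetical inbound topic

    while (true) {
      val records = consumer.poll(Duration.ofSeconds(5))
      for (record <- records.asScala) {
        // Parse/validate here; only well-formed events flow on. In the real flow the validated
        // record would also be loaded into the target Hive staging table.
        val payload = record.value()
        if (payload != null && payload.nonEmpty) {
          producer.send(new ProducerRecord[String, String]("customer.derived.out", record.key(), payload))
        }
      }
      consumer.commitSync()
    }
  }
}
```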
Confidential, Charlotte
Technical Lead (ETL Datastage - Senior Developer)
Technologies: ETL - IIS tools (Datastage), Z/OS Mainframes, Unix shell programming, MQ, DB2, Oracle 10g
Responsibilities:
- Analyse the mainframe process end to end and prepare the mainframe specification document.
- Review mainframe functional specifications and prepare the ETL technical design document.
- Review high-level & low-level ETL design documents against the requirements specification documents and present them to clients for approval.
- Create new business categories and add new business terms related to the ETL process in IIS Business Glossary.
- Create table definitions for the source and target files, share the metadata to the metadata repository, and perform FastTrack mapping to map source to target using IIS FastTrack.
- Perform business term linking for the shared file structures in IIS Metadata Workbench and perform data lineage to track the flow end to end.
- Migration of an expensive platform meant for analytical work (Teradata) to a cheaper/more scalable OLTP system (Oracle) by migrating the data and application processes onto the Oracle platform.
- Make sure Data Quality (DQ) checks are in place for the designed ETL process.
- Performance tuning the SQL queries.
- Schedule the batch flows using Autosys R4.5/R11 in all the non-prod and prod environments.
- Utilize version control tools such as Rational ClearCase/SVN, integrated with IIS Information Server Manager, to migrate job components from the Dev through Prod environments.
- Have used real-time data integration stages in this project to leverage the middleware web services from the ETL platform:
- Web services transformer stage to send the request from ETL to middleware application and get back the response.
- MQ request-response stages to send the request from ETL to middleware application and get back the response.
- Have used the Bulk load option to load high volume data into the DB2 table.
- Have used Quality stages to detect duplicate customer profiles present in the customer DB and combine them into a single profile using survivorship rules.
- Have worked extensively in establishing interactions from the Linux platform to MQ for GET and PUT operations using UNIX scripts.
- Worked with Quality stages for data profiling and investigating bad data.
- Coordinate with the ETL platform support team for install-related activities and any issues.
Confidential, Charlotte
Technical lead (Mainframes / ETL Datastage)
Technologies: ETL - IIS tools, Z/OS Mainframes, Unix shell programming, DB2/Oracle 10g
Responsibilities:
- The existing DB2 database containing financial/non-financial information was normalized into a new data model to improve the performance and efficiency of CRUD operations, expand the account holding/servicing capabilities and better handle the application's account data.
- Designing and developing the application's ETL batch and mainframe real-time services to adapt to the new data model.
- Establishing real-time interfaces such as mainframe-to-middleware interactions via MQ request-response and DB2 stored procedures, and mainframe-to-mainframe interactions using COBOL/DB2/CICS technologies.
- Performed multiple roles: DB2 data modeller designing the new data model, system architect designing the batch/real-time flows, lead for the mainframe real-time services development, and developer of the ETL batch flows.
- Developed generic ETL load frameworks to carry out table-wise loads effectively every night across roughly 300 million records. Created a generic framework using shared containers for header/trailer validation, record count determination, removal of unprintable characters, application-specific rule engines, etc. (see the validation sketch after this list).
- All the standards, best practices and processes, such as adding business terms via Business Glossary, FastTrack mapping, business term linking and performing data lineage in Metadata Workbench, were followed with no exceptions.
- Have handled substantial volumes of data to generate statement information and “account maintenance fee calculation” information. This process had critical functionality involving direct DB2 lookups, complex validations, extensive reformatting, interaction with the middleware application via MQ, and XSLT reporting.
- Have worked extensively with XML Input and XML Output stages to generate nested XML messages for backend batch maintenance by leveraging middleware update APIs via MQ.
- Have created complex/high-volume CARI (Cross-Application Referential Integrity) processes using CDC/DIFF stages to sync various upstream application data.
- Worked on a couple of PMRs (Problem Management Records) with the IBM product team to fix:
- An idle timeout issue in DB2 direct lookup when multiple DB2 direct lookups are designed in a single parallel job.
- An MQ request-response issue due to missing MQM libraries on the compute-node Linux servers.
- Have established NDM setup for various interacting platforms.
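The generic load framework referenced above was built as Datastage shared containers; purely as an illustration of the header/trailer record-count check it performs, here is a minimal Scala sketch. The file layout (an "H|" header record, a "T|<count>" trailer record and detail records in between) is a hypothetical placeholder, not the actual file format.

```scala
import scala.io.Source

object HeaderTrailerCheck {
  // Hypothetical layout: first line "H|<business date>", last line "T|<detail record count>".
  // Returns true only when the trailer's declared count matches the number of detail records.
  def validate(path: String): Boolean = {
    val lines = Source.fromFile(path).getLines().toVector
    if (lines.size < 2 || !lines.head.startsWith("H|") || !lines.last.startsWith("T|")) return false

    val detailCount  = lines.size - 2                       // exclude header and trailer records
    val trailerCount = lines.last.split('|')(1).trim.toInt  // count declared in the trailer record
    detailCount == trailerCount
  }

  def main(args: Array[String]): Unit =
    println(s"File valid: ${validate(args(0))}")
}
```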
Confidential
Developer (Mainframes)
Technologies: Z/OS Mainframes- COBOL, CICS, JCL, Eazytrieve, DB2
Responsibilities:
- Participating in the JAD sessions/dispatch meetings to identify the application impact.
- Understanding the project charter and business requirements document, and preparing the high-level and low-level design documents for the application.
- Coordinating with the offshore team to discuss design strategies, risks and mitigation.
- Designing/Creating new data model based on project need.
- Developing the real-time processes in COBOL, CICS, DB2 and MQ driver routines.
- Developing the batch processes in COBOL, DB2, JCL, Eazytrieve and SORT utilities.
- Used CA7 Scheduler to schedule and execute the jobs based on the calendar schedule.
- Used SDSF/SAR to monitor status of the jobs.
- Documented test plans, test cases and test scripts, and was involved in all phases of test execution such as unit testing, component integration testing, system integration testing, dress rehearsal, performance testing and user acceptance testing.
- Pre-implementation activities to ensure readiness for code deployment, risk & mitigation planning and the production support handoff presentation.
- Deployment of code into production and post-implementation activities covering warranty support.
- Handled multiple bank acquisition projects, converting/loading the huge volume of the acquiring bank's account data to adapt to Confidential systems.
- Worked on various data transition projects within the bank's internal systems. Consumer data, credit cards, mortgages and loans were the types of accounts converted during these transitions.
- Worked on multiple BAU strategic initiatives establishing various real-time interfaces for associate/consumer-facing channels (online banking) and batch services between the FFD application and other account SORs.
- Built various tools and reusable components using CICS programming with BMS maps and REXX programming to automate repetitive tasks and reduce manual effort.
Confidential
Test Automation Developer
Technologies: Z/OS Mainframes, Quick Test Professional (QTP), Quality Center (QC)
Responsibilities:
- Analysing/understanding the business requirements of the project through various initial meetings with business partners and other stakeholders.
- Preparing the Test plan for the various cycles of testing execution.
- Preparing the System Integration Test scripts containing test scenarios and test cases to be tested for each cycle based on priorities of business requirements.
- Participating in design reviews and providing suggestions in improving the design of the process based on business needs.
- Preparing test data based on the test script to be executed.
- Covering all possible positive and negative test cases end to end (E2E).
- Communicating with Test Lead/Manager, development team and project team on the progress of testing.
- As the major deliverable in this project, an automated regression testing framework was built to reduce the manual effort of repetitive test executions when regression testing the non-impacted functionalities of the application. Used QTP and QC tools with VBA scripting to build out the framework.
- Followed all the bank's testing standards and best practices to improve the quality of testing.
- Defects were logged and tracked via the HP Quality Center tool.
- Coordinating with the development team to determine resolutions for defects faster.