Hadoop Architect Resume
Charlotte, NC
SUMMARY:
- Around 9 years of IT experience as an Architect and Senior Developer
- Currently working as a Hadoop Architect
- Over 4 years of experience working on Hadoop Development
- Over 5 years of combined experience working as a Senior Java/ETL developer & Lead with clients such as Confidential (BOA).
- As a Hadoop developer, possess very good knowledge of Hadoop Architecture and Big Data Systems.
- Mastery of Syncsort DMX-h ETL implementations.
- Hands-on experience with cutting-edge Big Data ecosystem tools such as Apache Hive, Pig, Sqoop, and Oozie
- Performance tuning of Hadoop MapReduce jobs
- Hands-on experience writing Hive queries in HiveQL to generate reports.
- Good knowledge of NoSQL databases such as MongoDB
- Good understanding of MapReduce paradigm and experience writing MapReduce programs in Java
- Excellent knowledge of data modeling; good understanding of Sybase PowerDesigner
- Excellent knowledge of MPP systems such as HP Vertica and Teradata
- Extensive experience leading onsite and offshore teams in ETL development, system testing, and support of an enterprise data warehouse using ETL tools such as Syncsort DMX-h.
- Shell scripting in UNIX, Autosys scheduling.
- Evaluate and propose new tools and technologies to meet the needs of the organization.
- Well versed in Agile and Waterfall software development methodologies.
- Good client-interfacing skills, working in tandem with various teams on requirement gathering.
- Good knowledge of Big Data, NoSQL, DW-BI Concepts, Banking and Telecom Domains.
TECHNICAL SKILLS:
Big Data Tools: Hadoop (HDFS), MapReduce, Hive, Pig, Sqoop, Oozie
ETL Tools: Syncsort DMX-h, DTL
NoSQL: MongoDB
RDBMS: HP Vertica, Teradata, Oracle
Data Modeling Tools: Sybase PowerDesigner
Schedulers: Autosys, Crontab
Programming Languages: Java (Threads, Files, Exceptions, Collections, JDBC), Shell Scripting in UNIX
PROFESSIONAL EXPERIENCE:
Confidential, Charlotte, NC
Hadoop Architect
Responsibilities:
- As Technology Architect - Hadoop, led end-to-end development of the warehouse.
- Requirement gathering and interaction with client-side BAs for mapping requirements.
- Creating HLDs, LLDs and Test Plans
- Supported code/design analysis, strategy development and project planning.
- Performance optimization using distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins
- Custom implementation for generating login credentials dynamically
- Implemented Change Data Capture (CDC) on the source master history using hash codes
- Custom implementation to gather all MapReduce attempt logs and generate metrics.
- Logging and error-handling mechanism for DMX-h; incorporated a custom logging mechanism for tracing errors using unique user-defined error codes.
- Gathering, managing and reviewing Hadoop log files.
- Exported the analyzed data to Teradata using Sqoop/TPT for visualization and BI reporting (see the export sketch after this list).
- Creating Hive tables and partitions for Hadoop Source Master
- Transformed, cleaned, and filtered the imported data using DMX-h MapReduce and Hive
- Developed a generic tool for generating dynamic HQL scripts to load data into Hive history tables.
- Wrote Hive scripts to convert files to Parquet format.
- Performance tuning by implementing dynamic partitions and buckets in Hive (a minimal sketch follows this list)
- Dynamically generated DMX MapReduce DTL for Data Quality and Data Controls (DQ/DC) on Key Business Elements (KBEs)
- Ad hoc query execution and report generation based on customer needs.
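A minimal sketch of the Hive dynamic partitioning and Parquet conversion described above; table and column names (src_master_stg, src_master_hist, load_dt) are hypothetical, and STORED AS PARQUET is native in Hive 0.13+ while earlier releases need the explicit Parquet SerDe:

    #!/bin/sh
    # Hedged sketch: load a partitioned Hive history table using dynamic partitions,
    # then rewrite the data in Parquet format. All names are illustrative only.
    hive -e "
    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;

    -- Dynamic partitions: the partition column (load_dt) comes last in the SELECT list
    INSERT OVERWRITE TABLE src_master_hist PARTITION (load_dt)
    SELECT cust_id, cust_nm, src_cd, load_dt
    FROM src_master_stg;

    -- Parquet conversion (STORED AS PARQUET requires Hive 0.13+)
    CREATE TABLE src_master_parquet STORED AS PARQUET AS
    SELECT * FROM src_master_hist;
    "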
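A hedged sketch of the Sqoop export to Teradata mentioned above; the JDBC URL, target table, and HDFS path are hypothetical, and a Teradata JDBC driver/connector must be available to Sqoop on the cluster:

    #!/bin/sh
    # Illustrative only: push an analyzed HDFS dataset to a Teradata reporting table.
    sqoop export \
      --connect jdbc:teradata://td-host/DATABASE=BI_RPT \
      --username "$TD_USER" --password "$TD_PASS" \
      --table KBE_METRICS \
      --export-dir /warehouse/kbe_metrics \
      --input-fields-terminated-by '\001' \
      -m 8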
Environment: Cloudera CDH5, Hadoop 2.3.0, Hive 0.12, Syncsort DMX-h 7.14.03, Syncsort DMX DTL, Sqoop, Autosys, Teradata and Linux.
Confidential, Charlotte, NC
Technology Lead
Responsibilities:
- As a Technology Lead (ETL), led end-to-end development, testing, and implementation of the warehouse.
- Interacted with business partners in gathering the requirements and helping them prioritize their requirements.
- Participated in joint application development sessions with business to analyze the requirements.
- Worked with cross-functional team members to recommend technical solutions to business issues and system issues.
- Worked as offshore lead, managing the team and coordinating with the onshore team to understand requirements.
- Assisted in creating the project plan, estimates and resource plan for enhancements.
- Creating HLDs, LLDs and Test Plans
- Established best practices for HP Vertica and shell scripting.
- Developed a generic table-driven shell script utility to load data from SOR files into Vertica RAW tables (see the load sketch after this list)
- Developed a shell script tool to compare parallel-run results between Teradata and Vertica tables.
- Backup and recovery of SOR files and final extracts onto Hadoop HDFS
- Creating and maintaining Hive tables and partitions for Hadoop extract files
- Participated in all project-level audits and helped push the project to Level 5.
- Trained the entire team, based across different locations, on HP Vertica
- Performed unit testing and integration testing, and provided bug fixes.
- Performed data validations using validation controls
- Responsible for all release activities across environments (Dev, UAT, Prod); mentored team members and clients on Vertica.
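A minimal sketch of the table-driven load utility described above, assuming a hypothetical control file of "<raw_table>|<sor_file>" pairs and the standard vsql client; COPY options (delimiter, rejection handling) would vary by feed:

    #!/bin/sh
    # Illustrative only: read the control file and bulk-load each SOR file
    # into its Vertica RAW table with COPY.
    CTL_FILE=load_control.dat

    while IFS='|' read -r RAW_TABLE SOR_FILE; do
        vsql -h "$VERTICA_HOST" -U "$VERTICA_USER" -w "$VERTICA_PWD" -c \
          "COPY ${RAW_TABLE} FROM LOCAL '${SOR_FILE}' DELIMITER '|' DIRECT ABORT ON ERROR;"
    done < "$CTL_FILE"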
Environment: HP-Vertica, Teradata, Cloudera CDH5, Hadoop 2.3.0, Hive, Shell script, Autosys and Linux.
Confidential
Team Lead
Responsibilities:
- Gathered requirements from business partners.
- Worked on design, source to target mappings and detailed level design documents.
- Developed message flows, Java APIs, web services, and message sets.
- Strived to adhere to the quality standards.
- Prepared unit test and system test plans.
- Worked with Quality Management team to validate the test plans and scripts.
- Performed component traceability and governance reviews.
- Assisted the business team with end-to-end integration testing and user acceptance testing
- Supported Production deployment and verification in PROD.
Environment: Java 5, IBM WebSphere MQ v6.0, Hadoop, Hive, Tomcat Web Server, Oracle, Sybase, Crontab, Windows XP and Linux.
Confidential
Team Lead
Responsibilities:
- Project planning and process control for MCA&NM deliverables.
- Project Requirement gathering and analysis.
- Design analysis with the clients.
- Involved in providing Time and Cost estimation for the projects.
- Involved in capturing function points for the projects.
- Involved in preparing the Software Design documents (HLD) and reviewing it with the clients, developers, testers and coordinating the teams.
- Involved in Software Design document and Test case reviews.
- Involved in developing code for Service (signaling) and Web modules.
- Code tracing and bug fixing.
- Supporting the project teams to solve the production and UAT tickets.
- Involved in discussions with clients on project and production issues; mentored new team members and led the team.
Environment: Java 5, Service Creation Environment (SCE), Tomcat Web Server, Oracle, Sybase, Crontab, Windows XP and Linux.
Confidential
Team Lead
Responsibilities:
- Analyzing system requirements.
- Participated in the requirement analysis
- Converting business requirements into design specifications
- Involved in writing Detail Design documents based on Functional Requirement documents.
- Used JSP and JavaScript to develop the front-end screens and core Java for the application backend.
- Responsible for developing User Interfaces and Coding (IVR and WEB).
- Debugging the Source Code
- Performed Unit Testing and Integration Testing.
Environment: Java, Service Creation Environment (SCE), Tomcat Web Server, Oracle, Sybase, Crontab, Windows XP and Linux.