Head of Data Engineering - Big Data Platform and Analytics Resume
SUMMARY:
- 18+ years of hands-on experience in Big Data, Cloud, Data Warehousing, Architecture, Technology Leadership, Data Governance, Project Management, and Software Development.
- Successfully implemented business-driven technology strategies, architectures, and roadmaps.
- Established Big Data platforms on cloud and on-premises, and built a Data Integration COE and Shared Services from the ground up.
- Implemented strategic Data Lake, Data Warehouse, Business Intelligence, and development projects.
- Experienced in solutions practice using Kimball, Inmon, and hybrid architectures, and in transactional and dimensional data modeling - OLTP, ETL, BI, star schema, and snowflake schema.
- Makes project-related decisions and provides input into decisions impacting the broader team.
- Subject matter expert across multiple technologies, architectures, and business applications, with special emphasis on application/system interdependencies.
- Built engineering teams from the ground up, successfully delivered solutions across the enterprise, and mentored and coached team members.
- Partnered with cross-functional teams and clients, and provided leadership in a matrixed environment.
- Expert in vendor relationship management.
AREAS OF EXPERTISE:
- Cloud, Big Data, Enterprise Data Warehouse, Data Hub, Technology Roadmap and Strategy
- Data Integration, Data Quality, Metadata Management, Data Virtualization
- Enterprise Application Integration, Business Intelligence
- Databases - MPP, Relational, NoSQL, Legacy DBs
- Agile Methodology, Project/Program Management
PROFESSIONAL EXPERIENCE:
Confidential
Head of Data Engineering - Big Data Platform and Analytics
Responsibilities:
- Architected and implemented a multitenant platform from scratch.
- Built the Data Engineering team from the ground up, attracting and retaining talent.
- Architected and built the Enterprise Data Lake, which combines the Enterprise Data Warehouse with near-real-time IoT sensor data from various manufacturing plants.
- Architected and built the Enterprise Data Warehouse from scratch, comprising multiple data marts.
- Reduced raw material costs by over $5M per year, improved operational efficiency, and enabled predictive maintenance to reduce downtime by applying predictive models to IoT data in the Data Lake.
- Established the roadmap and strategy to integrate IoT data from manufacturing plants across multiple lines of business.
- Positioned the Data Lake as a data hub providing data feeds to multiple downstream applications, avoiding point-to-point integration.
- Led the architecture and design of the end-to-end solution using both batch and real-time components.
- Implemented a microservices-based architecture.
- Implemented real-time streaming of IoT data using MQTT and Kafka (see the sketch after this list).
- Worked with multiple business partners, senior management, infrastructure teams, and other stakeholders to deliver scalable and robust solutions.
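A minimal sketch of the MQTT-to-Kafka bridge pattern referenced above, for illustration only. It assumes the paho-mqtt (1.x callback style) and kafka-python client libraries; the broker addresses and topic names (mqtt-broker, kafka-broker:9092, plant/+/sensors/#, iot.sensor.readings) are hypothetical stand-ins, not the actual platform configuration.

```python
# Minimal MQTT-to-Kafka bridge sketch (hypothetical hosts and topic names).
# Requires: pip install paho-mqtt kafka-python
import paho.mqtt.client as mqtt
from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers="kafka-broker:9092")

def on_message(client, userdata, msg):
    # Forward each sensor reading to Kafka, keyed by the MQTT topic so
    # downstream consumers can partition by plant/sensor.
    producer.send("iot.sensor.readings",
                  key=msg.topic.encode("utf-8"),
                  value=msg.payload)

client = mqtt.Client()  # paho-mqtt 1.x callback style
client.on_message = on_message
client.connect("mqtt-broker", 1883)
client.subscribe("plant/+/sensors/#")  # all sensors across all plants
client.loop_forever()
```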
Environment: Cloudera EDH, HDFS, Hive, Impala, Talend Cloud, Talend ESB, Kafka, Spark, Microservices, Python, REST, R, QlikView, QlikSense
Confidential
Director - Big Data Platform Engineering and Analytics
Responsibilities:
- Successfully implemented Customer Profile Services and Compliance Analytics platforms from scratch.
- Built the Enterprise Data Platform on the Big Data stack to enable a near-real-time Customer 360 view.
- Established the roadmap and strategy to integrate customer profiles into the Big Data technology stack with Data Quality, MDM, and Search capabilities.
- Led the architecture and design of the end-to-end solution using both batch and real-time components.
- Implemented a microservices-based architecture for data ingestion and extraction.
- Implemented real-time streaming of customer interaction data using Kafka.
- Led the solution architecture and data architecture to model and define data structures in NoSQL - Cassandra.
- Architected the migration strategy for customer profiles from DB2 on the mainframe to Cassandra, yielding 30% savings in mainframe licensing costs.
- Fed predictive model results to campaign systems for targeted marketing, resulting in a significant revenue increase.
- Worked with business partners, senior management, infrastructure teams, and other stakeholders to deliver scalable and robust solutions.
- Built and led high-performance engineering teams and collaborated with various cross-functional partners.
- Implemented the entire data management lifecycle.
- Evangelized the capabilities of the new platform and technology stack across the enterprise.
- Responsible for providing reports to federal regulatory agencies and internal business partners, along with predictive analytics on risk and customer behavior.
- Demonstrated leadership of a high-performance, cross-functional team with onshore and offshore staff.
- Implemented NoSQL DB - Cassandra for capturing near-real-time streaming data, leveraging Kafka and Spark Streaming (see the sketch after this list).
- Successfully built an operational analytics platform for various log data using Splunk and R.
- Implemented Data Quality and Data Governance.
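A minimal sketch of the Kafka-to-Cassandra capture path described above. The production pipeline used Spark Streaming; for brevity this sketch substitutes a plain kafka-python consumer writing through the DataStax cassandra-driver, and the keyspace, table, topic, and column names (customer360, interactions, customer.interactions) are hypothetical, not the actual schema.

```python
# Sketch: consume customer-interaction events from Kafka and upsert them
# into a Cassandra table (hypothetical keyspace/table/topic names).
# Requires: pip install kafka-python cassandra-driver
import json
from kafka import KafkaConsumer
from cassandra.cluster import Cluster

session = Cluster(["cassandra-node"]).connect("customer360")
insert = session.prepare(
    "INSERT INTO interactions (customer_id, event_time, channel, detail) "
    "VALUES (?, ?, ?, ?)")

consumer = KafkaConsumer(
    "customer.interactions",
    bootstrap_servers="kafka-broker:9092",
    value_deserializer=lambda v: json.loads(v))

for msg in consumer:
    event = msg.value
    # Cassandra writes are idempotent upserts, which suits replayed streams.
    session.execute(insert, (event["customer_id"], event["event_time"],
                             event["channel"], event["detail"]))
```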
Environment: AWS, Cloudera EDH, Hadoop, YARN, Sqoop, Hive, Impala, Pig, Cassandra, Talend 5.6, Tableau, Kerberos, Kafka, Spark Streaming, Python, Microservices, REST API, R, Shiny
Confidential
Managing Director - Big Data Platform
Responsibilities:
- Built the Enterprise Data Lake with a multitenant model on Hadoop from the ground up.
- Evangelized the shared data platform and open source technology capabilities across the enterprise.
- In-depth knowledge of Big Data solutions and the Hadoop ecosystem.
- Implemented the Data Analytics Platform and the Data Services layer.
- Implemented agile methodology for development efforts.
- Worked with multiple vendor feeds in the FX domain.
- Documented various architectural design patterns.
- Evaluated various technologies and conducted POCs in the Big Data ecosystem.
- Established a strategy for data archival leveraging the Big Data ecosystem.
- Designed the Data Lake architecture as a centralized data hub to deliver data on demand to downstream applications, which proved highly effective for both storage cost and data access.
- Implemented a Data Governance strategy for the Data Lake.
- Managed a global team of developers, both onsite and offshore.
- Built a team of developers, team leads, and business analysts from the ground up, established a sustainable COE model, and successfully implemented various strategic initiatives.
- Architected and executed the FX Execution Quality Analysis to identify various trade patterns, and performed comparative analysis against tick-level data.
- Led the Global Markets - FX Valuation Rate Analysis, delivering a self-serve ad-hoc tool to slice and dice FX data for insights such as net profit/loss, total volume, and rate comparisons (see the sketch after this list).
- Executed an Information Security Analytics initiative to detect anomalies and gain insight into various logs such as proxy, Active Directory, and email.
- Partnered with business leaders to successfully execute strategic initiatives, and worked with the Chief Data Officer to align with enterprise data strategies.
- Mentored members of my team, provided cross-team mentoring, and mentored college interns.
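A minimal sketch of the kind of slice-and-dice aggregation the FX valuation rate analysis supported. PySpark stands in here purely for illustration (the environment below lists Hive and Tableau as the actual query and BI layers), and the input path and column names (currency_pair, notional, pnl, executed_rate) are hypothetical.

```python
# Illustrative PySpark aggregation over FX trade data: net P/L, total
# volume, and average executed rate per currency pair and trade date.
# Path and column names are hypothetical, not the actual trade schema.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("fx-rate-analysis").getOrCreate()

trades = spark.read.parquet("/datalake/fx/trades")  # hypothetical location

summary = (trades
           .groupBy("currency_pair", "trade_date")
           .agg(F.sum("notional").alias("total_volume"),
                F.sum("pnl").alias("net_pnl"),
                F.avg("executed_rate").alias("avg_rate")))

summary.orderBy("trade_date").show()
```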
Environment: Hortonworks Data Platform 1.2/2.1.2, Hadoop, Hive, NoSQL - HBase, Flume, Talend 5.4, Teradata, Vertica, Infobright, Tableau, Pentaho, Platfora, Splunk, Kafka, Storm, Python, Java, R
Confidential
VP ETL Architecture
Responsibilities:
- Architected and built the first near-real-time Custody Information Warehouse in the finance industry; received best-implementation recognition.
- Built an enterprise shared platform for Data Integration, saving millions in licensing costs.
- Provided architecture guidance and built the body of knowledge around PowerCenter, Change Data Capture (CDC), PowerExchange (Oracle, DB2 z/OS), and Data Transformation (B2B DT).
- Designed and implemented Enterprise Shared Services and a Center of Excellence for Data Integration from the ground up.
- Consolidated and evaluated critical information from multiple sources (ODS), reconciled conflicts, and decomposed information into star schema structures for the Informatica ETL solutions practice.
- Built an ETL governance framework with prescriptive guidance and best practices for standardized, repeatable enterprise data integration patterns.
- Promoted the ETL shared service for various projects within the enterprise.
- Researched emerging technologies in the Data Integration domain to address various business needs within the enterprise.
- Evaluated and implemented a Pentaho Data Integration shared service and COE as an alternative ETL solution to reduce costs for Tier 2 applications.
- Managed a high-performing team, attracted top talent, and actively supported innovative, faster, and more cost-effective solutions.
- Worked as the lead liaison between business and IT.
- Provided architecture guidance, technology leadership, best practices, high-level design, detailed design, development leadership, SWAT support, and COE services, and successfully implemented the projects below.
Environment: Unix, ORACLE 11g, Exadata, Sybase, Informatica, Shell scripts, Data Transformation 8.6, PWX for DB2 z/OS, PWX for Oracle, JMS, MQ, Pentaho, Business Objects
Confidential, NY
ETL/Data Architect
Responsibilities:
- Designed the data architecture and built the Data Warehouse for CAI using a star schema.
- Led a team of developers; designed, implemented, and managed the centralized ETL architecture for the enterprise ETL shared service (Informatica).
- Developed a migration strategy and successfully implemented the Java-to-Informatica conversion project, achieving excellent performance for loading the Data Warehouse.
- Evangelized the ETL Shared Service across CAI to reduce costs by leveraging it for multiple applications.
- Set standards and policies for the division in consultation with the overall organization, and maintained and supported the data warehouse. Responsibilities also included creating and implementing disaster recovery strategies and testing, operational support strategies, etc.
Environment: UNIX Solaris 10, ORACLE 9i, Sybase, Informatica, Shell scripts, Business Objects 11
Confidential, NJ
ETL/Data Architect
Responsibilities:
- Successfully designed and implemented a centralized ETL architecture as a shared service for the entire firm globally using Informatica, saving around $5 million in licensing costs.
- Designed the data architecture and data models for the Data Warehouse.
- Set standards and best practices for ETL, EAI, and EII developers.
- Designed and implemented a cost-effective centralized EAI (Tibco BusinessWorks) / EII (Composite) and Metadata Management shared-service architecture for the entire firm globally, saving $3 million in licensing costs.
- Implemented the Metadata Manager and put together a governance model.
Environment: Linux, Windows NT, ORACLE 9i, SQL*Plus, PL/SQL, DB2 8.2, Sybase, Informatica - PowerCenter 7.1.3, Shell scripts, Hummingbird, SQL*Loader, Tibco Business Works 5.2 (EAI), JMS/Tibco Messaging, Composite 3.2 (EII), MetaIntegration for Metadata Management
Confidential, NY
ETL Developer
Responsibilities:
- Worked as an ETL developer and implemented the Customer Campaign and Risk Management warehouse.
- Designed the data warehouse using Kimball's star schema methodology.
- Analyzed sources and developed simple, reusable transformation mappings in Informatica Designer.
- Worked extensively with Trillium to standardize and match names and addresses and to cleanse data.
- Worked with SAS 9 to generate reports and perform data analysis.
Environment: Unix, ORACLE 9i, DB2 8.1, Informatica - PowerCenter 6.2, TOAD, Unix Shell scripts, Hummingbird, SQL*Loader, ERWIN, Trillium 6 for Data Quality - Name and Address Standardization, SAS