Senior Big Data Architect Resume

New York

SUMMARY:

  • Seeking a position as a Sr. Cloud Big Data Architect with the opportunity to architect and design Big Data solutions on Confidential AWS.

TECHNICAL SKILLS:

Operating Systems: iOS, Linux, Windows.

Tools: Apache Kafka, Confidential EMR, CDH4.2, Hadoop, Oozie, Flume, Sqoop, Hue, IntelliJ, Eclipse, Git, Bitbucket, SourceTree, JIRA, RAD 7, WAS 6, IBM Heap Analyzer, JMeter, Visio, Rational Rose, ClearCase, ClearQuest, Synergy, SVN.

Languages: Kafka Connect, Kafka Streams, Spark Streaming, Spark ETL, Spark SQL, Java MapReduce, Pig, Hive, Impala, Hue, UML Modeling, Use Case Analysis, Design Patterns (Core & J2EE), OOAD, Java 8, J2EE (JSP, Servlets, EJB), Web Services, Ant, Python.

Frameworks: Apache Confluent, Stratio BigData, Apache Spark, Confidential AWS, DataStax Enterprise, Hadoop (MapReduce/HDFS), Spring 3.0, Hibernate, Struts, Kerberos.

Data Formats: Avro, Parquet, JSON, CSV.

AWS Services: EMR, Kinesis, IAM, EC2, S3, EBS, Data Pipeline, VPC, Glacier & Redshift.

Databases: Cassandra, MongoDB, SQL Server, MySQL, Oracle 9i/10g.

Data Analytics: Tableau, Mahout, Revolution R, Talend.

Domain: Medical, Lifestyle, Banking, Brokerage and Hospitality.

PROFESSIONAL EXPERIENCE:

Confidential

Senior Big Data Architect

Responsibilities:

  • Architecting, Designing and Implementing (100% hands-on) the Big Data platform for DMD. Responsible for building the data lake on Confidential AWS, ingesting structured master data from various SQL databases and TBs of transactional data via Confidential Kinesis Firehose/Streams. All data resides in Confidential S3 buckets within the data lake in Avro/Parquet formats. As the data is cleansed and processed, it flows through a series of staging stages that prepare it for interactive queries, operational analytics, data mining and reporting.
  • The data lake is currently being built to ingest historical data (PageView, TagDetected & EmailsOpens) and Healthcare Professional profile data using Sqoop. Future use cases are being designed to ingest transactional data from the AIM application via Kinesis Firehose.
  • The data moves through the various stages using Hive, and all ETL steps (cleansing, transformations & aggregations) are done using Spark; a minimal sketch of one such staging step appears after this list.
  • Performance tuning to build scalable solutions on the EMR clusters, Hive ETLs and Spark jobs, processing more than 1 billion records (TBs of data) using techniques such as memory management, auto scaling, Hive-on-Spark, partitioning and Oozie sub-workflows.
  • Since the EDW is the source for all Health Care Professional data, the transactional applications will get feeds of HCP data via the HCP Sync process.
  • Technologies used: Sqoop, Hive, Tez, Oozie, Spark SQL, Spark Streaming, Hue, Confidential S3, Data Pipeline, VPC, EMR, Kinesis Firehose/Streams, Jenkins, Git, Bitbucket, SourceTree, JIRA.
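
The staging step described above can be sketched in a few lines of PySpark. This is a minimal illustration only, assuming a hypothetical bucket layout and column names (example-datalake, hcp_id, event_ts); it is not the actual DMD code.

```python
# Minimal PySpark sketch of one staging step: raw events landed in S3 by
# Kinesis Firehose are cleansed and rewritten as date-partitioned Parquet.
# Bucket, prefix and column names below are illustrative placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("pageview-staging").getOrCreate()

# Raw PageView events (hypothetical landing prefix).
raw = spark.read.json("s3://example-datalake/raw/pageview/")

# Cleanse: drop rows without a key, normalize the timestamp, derive a date.
staged = (raw
          .filter(F.col("hcp_id").isNotNull())
          .withColumn("event_ts", F.to_timestamp("event_ts"))
          .withColumn("event_date", F.to_date("event_ts")))

# Date-partitioned Parquet lets downstream Hive/Spark SQL queries prune
# partitions, one of the tuning techniques mentioned above.
(staged.write
       .mode("overwrite")
       .partitionBy("event_date")
       .parquet("s3://example-datalake/staged/pageview/"))
```
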
Confidential

Senior Big Data Architect

Responsibilities:

  • Architecting, Designing and Implementing the Big Data platform for BBVA Compass Bank, built around the use case of analyzing marketing campaign spend within the bank. The aim is to build a Big Data lake ingesting TBs of data from Omniture, Google/Bing, affiliates, bank DBs, mainframe, etc. The platform engages frameworks such as Confluent Kafka, Schema Registry and Stratio CrossData to stream data into Hadoop HDFS for analysis with Tableau (a minimal producer sketch follows). Languages/technologies used: Java 8, Python, Kafka Streams API, Spark APIs, Kafka Connect, Stratio CrossData & HDFS.
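
A minimal sketch of the Schema Registry-backed ingestion described above, using the confluent-kafka-python AvroProducer. The topic name, schema and endpoints are illustrative assumptions, not the bank's actual configuration.

```python
# Hypothetical Avro producer: records are serialized against a schema held
# in Confluent Schema Registry before being published to Kafka.
from confluent_kafka import avro
from confluent_kafka.avro import AvroProducer

# Illustrative schema for a campaign-spend record (not the real one).
value_schema = avro.loads("""
{
  "type": "record",
  "name": "CampaignSpend",
  "fields": [
    {"name": "campaign_id", "type": "string"},
    {"name": "channel",     "type": "string"},
    {"name": "spend_usd",   "type": "double"}
  ]
}
""")

producer = AvroProducer(
    {
        "bootstrap.servers": "localhost:9092",         # placeholder endpoints
        "schema.registry.url": "http://localhost:8081",
    },
    default_value_schema=value_schema,
)

producer.produce(
    topic="campaign-spend",
    value={"campaign_id": "cmp-001", "channel": "google", "spend_usd": 1234.56},
)
producer.flush()
```
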
Confidential

Senior Big Data Architect

Responsibilities:

  • Designing and Implementing the core-data-pipeline using Spark Streaming on EMR. This is a near-real-time streaming application that enables multiple teams to ingest journal data via consumers registered with the data pipeline. Each stream of data flows through RabbitMQ, which is integrated with the pipeline and fans out the data to the multiple heterogeneous systems registered via their consumers (see the fan-out sketch after this list). Currently the activity data is aggregated to 20 steps/day/user, which translates into 400 records/sec of journal data (tens of TB/yr); next year the intent is to ingest the raw un-aggregated data (10,000 steps/day/user), which translates into 150K msgs/sec (hundreds of TB/yr).
  • Worked on Architecting, Designing and Implementing GADS (Global Analytic Datastore) on Confidential AWS. This Big Data platform follows a hybrid Big Data architecture to hold 5-10 years of historical data from various sources (hundreds of terabytes), providing a customer- and product-centric view of the data. It enables analysts to visualize customer behavior and the customer journey across products, customer retention, customer interactions, etc., and provides the business intelligence to perform predictive and statistical analysis using Big Data technologies.
  • The GADS platform was designed following the hybrid architecture on Confidential cloud technologies: structured data hosted on Redshift with the ETL pipeline built in Talend, and semi-/unstructured data hosted in S3 buckets using transient EMR clusters, Data Pipeline, Oozie, Hive and Impala, later migrated to Spark & Spark SQL. The idea was to centralize the data in the cloud for further analytics using Tableau and Revolution R.
  • Implemented various Tableau visualizations to identify customer interactions, retention, bookings, engagements, etc. across various dimensions.
  • Architected and Implemented the Tableau Server architecture to distribute the Tableau dashboards across the organization.
  • Implemented the VPC architecture on Confidential AWS with the infrastructure teams to deploy instances in Dev and Prod environments, ensuring the security controls were in place to meet HIPAA requirements for protecting PII- and PHI-sensitive member information.
  • Worked on integrating various datasets into both the data warehouse and the Big Data platform, bringing in data from DOTCOM (online data), CHAMP (meeting data), SMV (MDM data), WELLO (coaching data), Teletech (chat & call data), Exact Target (mail data), Reflexis (workforce data), ClickTools (satisfaction data), etc.
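
The RabbitMQ fan-out in the first bullet can be sketched with pika: journal messages published to a fanout exchange are copied to every consumer queue bound to it. The exchange, queue and message contents below are illustrative assumptions.

```python
# Fan-out sketch: one fanout exchange, one durable queue per registered
# consumer; a single publish reaches every bound queue.
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

channel.exchange_declare(exchange="journal", exchange_type="fanout")

# Hypothetical downstream systems, each registered via its own queue.
for consumer in ("gads-loader", "activity-aggregator"):
    channel.queue_declare(queue=consumer, durable=True)
    channel.queue_bind(exchange="journal", queue=consumer)

# Fanout exchanges ignore the routing key; the message goes to all queues.
channel.basic_publish(
    exchange="journal",
    routing_key="",
    body='{"user_id": "u1", "steps": 20, "date": "2016-01-01"}',
)
connection.close()
```
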
Confidential, New York

Senior Big Data Architect

Responsibilities:

  • Working for Fraud Technologies, architecting, designing and implementing a big data solution for fraud detection and analytics. This product, called HULC, is intended to hold 13 months of historical data from various sources (hundreds of terabytes), providing a consolidated view of a customer's products across the bank and giving business analysts the intelligence to perform analytics using big data technologies.
  • The other aspect of this product, called ELECTRO, performs ETL transformations on the raw data before it is processed for scoring and alert detection in the bank.
  • Responsible for designing the Cassandra data models for the Venom, DFP & Flash projects and integrating them into the application design. The Venom model holds monetary & non-monetary transactions, DFP holds online login transactions and Flash holds the alerts generated by HULC (a hypothetical table sketch appears after this list).
  • Responsible for Architecting the solution, defining the integration points with the fraud-scoring engine, capacity planning, deciding key technologies and designing and implementing the solutions.
  • Responsible for introducing Big Data tools and technologies into the bank, presenting and implementing POCs for Tableau, Mahout, Revolution R, Impala, Pentaho, etc.
  • The above-mentioned projects were implemented using big data technologies such as Cloudera Hadoop CDH4.2, Java MapReduce, Pig, Hive, Oozie, Flume, Cassandra, Sqoop and Solr.
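
As a rough illustration of the Cassandra modeling described for Venom, the sketch below creates a transactions table partitioned by customer and clustered by time, so a customer's recent activity is a single partition scan. The keyspace, table and columns are hypothetical, not the bank's schema.

```python
# Hypothetical CQL data model, issued through the DataStax Python driver.
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])  # placeholder contact point
session = cluster.connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS fraud
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")

# Partition by customer, cluster newest-first by transaction time so
# "recent transactions for customer X" reads one partition in order.
session.execute("""
    CREATE TABLE IF NOT EXISTS fraud.monetary_txn (
        customer_id text,
        txn_ts      timestamp,
        txn_id      uuid,
        amount      decimal,
        channel     text,
        PRIMARY KEY ((customer_id), txn_ts, txn_id)
    ) WITH CLUSTERING ORDER BY (txn_ts DESC, txn_id ASC)
""")
```
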
Confidential, New York

Senior Solutions Architect

Responsibilities:

  • SME on MongoDB for the Big Data initiatives, responsible for architecting and designing big data scalable solutions for teams across the organization.
  • Part of the architecture group responsible for setting up standards and best practices, building POCs, reviewing program-level initiatives, vendor management, etc.
Confidential, New York

Technical Architect

Responsibilities:

  • Participating in the Distributed Computing Initiative using Hadoop, Hive & PIG implementations. The initiative was to build a Data Fabric platform within the organization to enable parallel computation and analysis of large files emerging from the trading desks.
  • As part of the emerging-technology initiative, working on setting up the Hadoop distributed cluster on the Confidential AWS cloud and building a POC implementing MapReduce jobs in Java, monitored through the web UI in fully distributed mode. Also implementing Pig Latin data-processing scripts for parallel processing (a streaming-style sketch appears after this list).
  • Responsible for driving the Cloud Computing practices at Morgan Stanley. Executed a comparative study of Amazon, Azure & MS Private Cloud, building and deploying a Java & .NET application and exploring various cloud features like Elastic Computing, Cloud Storage Services, Identity & Access Management, Load Balancing & Auto Scaling, etc.
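
The MapReduce POC above was written in Java; the Hadoop Streaming pair below sketches the same map/reduce shape in Python, counting records per trading desk from a hypothetical CSV whose first field is the desk name. It would be submitted with the standard Hadoop Streaming jar via its -files, -mapper and -reducer options.

```python
#!/usr/bin/env python
# mapper.py - emits (desk, 1) for each input record. The CSV layout (desk
# name in the first field) is an assumption for illustration.
import sys

for line in sys.stdin:
    desk = line.split(",")[0].strip()
    if desk:
        print(f"{desk}\t1")
```

```python
#!/usr/bin/env python
# reducer.py - sums counts per desk. Hadoop Streaming delivers lines with
# the same key contiguously, so a single pass with a running total works.
import sys

current_key, total = None, 0
for line in sys.stdin:
    key, count = line.rstrip("\n").split("\t")
    if key != current_key:
        if current_key is not None:
            print(f"{current_key}\t{total}")
        current_key, total = key, 0
    total += int(count)

if current_key is not None:
    print(f"{current_key}\t{total}")
```
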
Confidential, New Jersey

Project Lead

Responsibilities:

  • Responsible for Design/Development of the Customer Activation System for onboarding clients/users, enabling them to use the Citi group of products and services. This Admin product goes beyond client & user onboarding with services such as Admin Agent Management, Service Management, Contacts & Reports.
  • The application is developed with the UI in .NET, interacting with application web services developed in Java, and is integrated with the end systems/applications using the Provisioning product.
  • The application has been developed with technologies such as .NET, Web Services, Spring, Hibernate, XML and JMS, using tools and products such as RAD 7, WebSphere, TIBCO, TFS and Oracle 10g.
Confidential

Project Lead

Responsibilities:

  • Responsible for the Design/Development of the Service Request Management (SRM) project, whose main goal was to decommission Remedy and migrate all its functionality onto the GOW platform.
  • The application was developed on the Struts framework, interacting with remote services developed in EJB 2.1.
  • Technologies involved: Struts, EJB 2.1, Oracle 9i, WLI server, Eclipse.
Confidential, San Francisco

Project Lead

Responsibilities:

  • Responsible for the Design/Development of the Quarterly Portfolio Profile, a reporting tool providing clients with a quarterly performance snapshot of the assets in their accounts. Also worked with the architects on performance tuning and load testing of the tool; prepared HLD/LLD documents; and responsible for providing impact analysis, effort estimation and sizing.

Confidential, Florida

Senior Software Engineer

Responsibilities:

  • A web-based system to administer the hospitality services of Confidential, USA. This system is essentially a sales and booking client for booking hotels/resorts, air tickets, theme-park tickets, etc.
  • Technology: UML, JSP, EJB, Design Patterns (Core and J2EE), SQL, WebSphere, Oracle, Rational Rose, IntelliJ.

Confidential

Senior Software Engineer

Responsibilities:

  • This application was the GEPS (Confidential) intranet site, written using the J2EE architecture. It was developed on CASPER, a J2EE framework developed by GE based on the MVC design model.
  • Technology: Java, JDBC, JSP, XML, HTML, JavaScript, UML, Rational Rose, Oracle 8i, WebLogic 6.0.

Confidential

Senior Executive Projects

Responsibilities:

  • The application was developed to customize database objects in the database layer and the front-end layer through EJBs in the middle tier. Clients in the US and India are provided their own customized versions of the database objects and front-end look and feel in the form of JSPs that create dynamic forms, making it easy to deliver customized media management applications to clients.
  • Technology: Java, J2EE, EJB, JBuilder, Oracle 8, WebLogic 5.1.
