Cloud Solution Architect/Consultant Resume
SUMMARY:
- 10+ years of professional experience in IT, with 6 years as a Cloud, Analytics and Big Data Architect/Engineer
- GCP Certified Professional Cloud Engineer
- Proficient in all cloud service models: SaaS, PaaS and IaaS
- 5+ years of experience in architecting and engineering data pipelines and data lakes
- 4+ years of experience creating ETL mappings and scripts and building BI/reporting dashboards
- 3+ years of experience with Hadoop ecosystem technologies such as BigSQL, Spark, Zeppelin, HDFS, YARN, MapReduce, Hive, Pig, Ambari, Ambari Infra, ZooKeeper, Sqoop, NiFi, Flume, Kafka, Ranger, Knox and Kerberos
- 4+ years of experience architecting, deploying and developing log search/aggregation/analytics solutions using the Elastic Stack, Splunk and Apache Solr
- 1.5+ years of experience architecting analytics and developing solutions on Microsoft Azure and Google Cloud; Coursera certified in GCP Fundamentals: Core Infrastructure
- 5+ years of experience in conceptual, logical and physical data modeling and database development using both SQL and NoSQL database solutions
- Experienced in handling structured, semi-structured and unstructured data
- Experience in creating UNIX bash/shell scripts
- Architecting enterprise applications using BDAT (Business, Data, Application and Technology) principles based on the TOGAF framework
- Experience leveraging Analytics and Data Science in Insurance, Banking, Entertainment, Digital Transformation and Ecommerce initiatives
- Experience in Requirement Analysis, test execution, Change Management, Defect and Incident Management
- Leading and mentoring teams of various sizes; working with cross-BU/LOB teams on various project initiatives
- Effective planning and organizational skills with the ability to adapt to change and perform effectively under pressure
- Excellent analytical, problem-solving, and decision-making skills, verbal and written communication skills, interpersonal and negotiation skills.
TECHNICAL SKILLS:
BI/Reporting: Tableau, Cognos, Kibana, Google Analytics, Google Data Studio
ETL: Hive, Spark, Logstash, BigIntegrate/DataStage, Talend, MS Excel, Dataproc, Dataflow, Cloud Functions, BigQuery
Hadoop Ecosystem: Hortonworks HDP, IBM BigInsights, HDFS, Hive, Pig, Kafka, Flume, Sqoop, Elasticsearch, ZooKeeper, Spark, Spark SQL, Spark Streaming, Zeppelin, Ambari, Ranger, Knox, Kerberos, NiFi, HDF, BigSQL
Databases: Cassandra, Redis, Oracle 11g/10g, MySQL 5, MariaDB, MSSQL Server 2012/2008, SQL, PL/SQL, Oracle SQL Developer, SQL*Loader, Toad
Solution Design/Modelling: MS Visio, MySQL Workbench, Oracle SQL Developer
Languages: Scala, Java, Groovy/Grails, UNIX Shell Scripting, Ruby on Rails, C#, Node.js
DevOps Software: Gitlab, Github, Eclipse, Redmine, Basecamp, Toad, JIRA, Confluence, BitBucket, Sharepoint, Ansible, Jenkins, Bamboo, Elastic Stack, Splunk, Docker, Sourcetree, TortoiseGit, SonarQube
Operating Systems: UNIX/Linux (Redhat, CentOS, Ubuntu, Debian), Windows 10/7
Cloud Technologies: Google Cloud Platform (GCP), Microsoft Azure
PROFESSIONAL EXPERIENCE:
Confidential
Cloud Solution Architect/Consultant
Responsibilities:
- Architected the Scotia Rewards project to migrate EDL data to GCP, reusing the existing HiveQL data-integration work on a Google Dataproc cluster for historical and recurring/delta loads and following a lift-and-shift architecture pattern as far as possible (illustrative sketch at the end of this role)
- Presented and defended the solution/design architecture to the various stakeholders, e.g. the LOB, ISO, Enterprise Architecture and Compliance teams
- Designed and developed an event-triggered data pipeline based on Cloud Pub/Sub for ingestion of PII and non-PII data into the landing area on Google Cloud Storage (GCS) buckets
- Extracted the current EDL data using HiveQL and staged it on the edge node
- Helped configure Diyotta (ETL tool) to ingest data from the edge node into the GCP staging area using Cloud KMS envelope encryption (DEK/KEK)
- Built a proof of concept with the Google Cloud DLP API to detect sensitive infoTypes against simulated bank data
- Developed Data Studio dashboards for the Operations and Executive groups on top of the BigQuery events dataset to monitor the data pipeline and daily loads
- Developed and deployed Dataflow jobs to write event data from Pub/Sub to BigQuery and from Pub/Sub to Pub/Sub
- Promoted code through the bank's CI/CD pipelines (Accelerator) using Jenkins and Bitbucket
- Deployed Artifactory builds using the bank's Impeller pipeline templates and used Maestro for GCP project-level configuration
- Deployed Pub/Sub topics, publishers and subscribers, Cloud Functions, Dataflow jobs, IAM policies, BigQuery datasets and storage buckets as Infrastructure as Code through the bank's Impeller pipeline (Google Deployment Manager)
- Defined and provisioned Cloud IAM policies and roles for the service accounts used by the different GCP components
- Developed system- and application-level operational alerts and dashboards using Stackdriver Logging, Monitoring and Error Reporting
- Worked in an active Agile/Scrum environment for project execution
Technical Environment: Lucidchart, GCP (Cloud Storage, Pub/Sub, BigQuery, Data Studio, Cloud Functions, DLP API, Deployment Manager, Stackdriver Logging and Monitoring, Dataproc, Dataflow), HiveQL, Node.js, Jenkins, SonarQube, Fortify, Bitbucket, JIRA, Confluence, Slack, Artifactory
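Illustrative sketch (assumed names, not project code): one way a lift-and-shift HiveQL delta load could run as a Spark SQL job on Dataproc and land curated output on Cloud Storage; the schema, table and bucket names are hypothetical placeholders.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical sketch of a lift-and-shift delta load: the original HiveQL runs
// largely unchanged under Spark SQL on Dataproc, and the result lands on GCS.
object RewardsDeltaLoad {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("rewards-delta-load")
      .enableHiveSupport() // Dataproc provides a Hive metastore out of the box
      .getOrCreate()

    // Placeholder schema and table names; the existing HiveQL is reused as-is.
    val delta = spark.sql(
      """SELECT customer_id, points_earned, points_redeemed, load_dt
        |FROM edl_staging.rewards_raw
        |WHERE load_dt = current_date()
        |""".stripMargin)

    // Dataproc ships with the GCS connector, so gs:// paths work directly.
    delta.write.mode("overwrite").parquet("gs://example-rewards-curated/daily/")

    spark.stop()
  }
}
```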
Confidential
Big Data Architect/Engineer
Responsibilities:
- Provided overall architecture leadership, including roadmaps, planning, technical innovation, security and data governance
- Presented and defended the solution/design architecture to the various stakeholders, e.g. the Data, ISO, Enterprise Architecture and Compliance teams
- Provided technical and process leadership for projects, defining and documenting information integrations between systems and aligning project goals with the reference architecture
- Architected the zoning model for the Enterprise Data Lake (EDL) solution, leveraging the enterprise's information classification and identity and access management (IAM) standards
- Extracted source data from multiple legacy applications and heterogeneous technologies, such as z/OS mainframes, mainframe- and DB2 LUW-based databases, and SQL Server files, into the HDFS ingestion zone using Sqoop, DataStage and Spark jobs (illustrative sketch at the end of this role)
- Provided technical leadership and governance to the big data team and for the implementation of the solution architecture across the Hadoop ecosystem: BigInsights, MapReduce, BigSQL, Pig, Hive, HCatalog, Spark, HBase, Storm, Kafka, Flume, HDFS, Oozie, Ambari, Ranger, Knox and Kerberos
- Compared Google Cloud Platform, Amazon AWS and Microsoft Azure for the data lake and analytics/machine learning toolset migration
- Worked on planning and capacity requirements for migrating the on-premises IBM BigInsights solution to a cloud-native GCP solution built on Dataproc, Dataflow, Cloud Functions, Google Cloud Storage and Pub/Sub
- Defined Cloud IAM policies and roles for the different GCP components
- Architected the data integration patterns (acquisition, ingestion and publication/consumption), security and naming standards for the solution in cooperation with the ETL and BI architects
- Led the security hardening of the data lake solution, using Ranger KMS encryption for data at rest and encrypted communication for data in transit
- Led access provisioning for Ranger policies covering Knox, HDFS, YARN, Hive and BigSQL, based on the zoning architecture
- Integrated the data lake solution with enterprise offerings for security information and event management (SIEM), logging (Splunk), privileged access management (CyberArk) and identity management (ITIM)
- Facilitated and participated in ISO activities for the data lake, including audits, penetration and vulnerability scans, firewall audits, and AD group/service/user account scans
- Guided the team on these tools and big data best practices and helped with deployments
Technical Environment: IBM BigInsights v4.2.5, IBM IIS BigIntegrate/DataStage 11.5, Information Governance Catalog (IGC), Spark 2.1.0, Hadoop 2.7, Ambari, Solr, HDFS, Ranger, Knox, Kerberos, Splunk, YARN, Spark Thrift Server, BigSQL, Cognos, ERWin, Tableau, MS Visio, GCP Dataproc, Dataflow, Pub/Sub
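Illustrative sketch (assumed names, not project code): a Spark JDBC extraction of a single relational source table into the HDFS ingestion zone; the DB2 connection details, table and target path are hypothetical, and equivalent flows also ran through Sqoop and DataStage jobs.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical sketch: pull a source table from DB2 over JDBC and land it in the
// HDFS ingestion zone as Parquet. Connection details, table and paths are placeholders.
object IngestCustomerTable {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("edl-ingest-customer").getOrCreate()

    val customers = spark.read
      .format("jdbc")
      .option("url", "jdbc:db2://db2host.example.com:50000/EDWDB")
      .option("driver", "com.ibm.db2.jcc.DB2Driver")
      .option("dbtable", "CORE.CUSTOMER")
      .option("user", sys.env("DB2_USER"))
      .option("password", sys.env("DB2_PASSWORD"))
      .load()

    // Land the raw extract in the ingestion zone, one folder per load date.
    customers.write
      .mode("overwrite")
      .parquet("/data/edl/ingestion/core/customer/load_dt=2019-01-01")

    spark.stop()
  }
}
```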
Confidential
Big Data Architect/Engineer/Developer
Responsibilities:
- Architected the whole solution and data pipeline for the project
- Deployed the HDFS, YARN, Spark and Zeppelin clustered environment on the TD Cloud
- Deployed the Cassandra cluster environment on the TD Cloud
- Developed on-the-fly Value at Risk (VaR) calculations for different risk models and risk factors, calculating risk across portfolios and hierarchy levels
- Extracted source (.csv) files into the Cassandra cluster using batch Spark jobs
- Developed the Spark code using the Data Source, Dataset/DataFrame and RDD APIs
- Developed Spark ETL batch jobs to load data from Cassandra, apply transformations and aggregations, and load partially pre-aggregated views back into Cassandra (illustrative sketch at the end of this role)
- Connected Spark to the Cassandra cluster using the DataStax Spark Cassandra Connector
- Deployed the Spark Thrift Server so that JDBC/ODBC-based drivers could connect for visualization and analytics against in-memory Spark context data structures
- Used Angular 2 for front-end development of the visualizations and analytics, calling the REST-based web service
- Deployed and tested the Livy REST API for submitting batch and interactive Spark jobs
- Developed proof-of-concept work with the Spark Structured Streaming API
- Maintained and monitored the clustered environment for the Spark standalone cluster and Cassandra ring
- Used Scala and SBT to build the uber JARs
- Guided the team on these tools and helped with deployments
Technical Environment: Spark 2.1.0, Scala 2.11, Hadoop 2.7, HDFS, YARN, Zeppelin Notebook 0.7.1, Spark Thrift Server, Cassandra 3.0.13, Angular 2, MSSQL Server, SQL Server Management Studio, MS Visio, IntelliJ IDEA, SBT, Spark Job Server, Livy REST Spark API, Rackspace Cloud
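Illustrative sketch (assumed names, not project code): the Cassandra-to-Cassandra Spark batch pattern described above, using the DataStax connector's DataFrame API; the keyspace, tables and columns are hypothetical placeholders.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.sum

// Hypothetical sketch: read positions from Cassandra, pre-aggregate exposure per
// portfolio and risk factor, and write the partially aggregated view back to Cassandra.
object PortfolioPreAggregation {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("var-pre-aggregation")
      .config("spark.cassandra.connection.host", "cassandra-node1.example.com")
      .getOrCreate()

    val positions = spark.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "risk", "table" -> "positions"))
      .load()

    val byPortfolio = positions
      .groupBy("portfolio_id", "risk_factor")
      .agg(sum("exposure").as("total_exposure"))

    // The target table must already exist in Cassandra with a matching schema.
    byPortfolio.write
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "risk", "table" -> "portfolio_exposure"))
      .mode("append")
      .save()

    spark.stop()
  }
}
```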
Confidential
Data Architect/Modeller
Responsibilities:
- Gathering requirements from the client on the current state of the models, i.e. the database schemas for the L1, L2 and L3 stages of the database
- Analysing the current state of the schema and processes in place for the data warehousing project
- Designing the data warehouse models and relationships in MS Visio
- Documenting the best practices of the data warehouse models and processes involved
- Analysing and comparing the fact and dimension tables against best practices and performing gap analysis
- Analysing the raw source file formats and naming conventions against best practices
- Documenting the recommendations for the schema and file naming standards
- Analysing and suggesting the change management and release lifecycle of the dimension tables
Technical Environment: MSSQL Server, SQL Server Management Studio, MS Visio, MS Excel, MS PowerPoint
Confidential
Big Data Architect/Engineer
Responsibilities:
- As an SME on this project, designed the business intelligence requirements for application logging, monitoring, troubleshooting and, later, data analytics
- Architected the whole solution for the project with Hadoop and Elastic Stack
- Deployed the Development and Performance/UAT environments for Elastic Stack and Hortonworks Data Platform (HDP)
- Developed, tested and debugged components to support the Data Lake
- Designed the detailed Data Lake Models for Data Ingestion in Microsoft Visio.
- Implemented the data collection, loading, QA, cleansing, enrichment and transformation pipeline
- Developed ETL scripts in Logstash and loaded data into highly distributed Kafka topics, from which data was consumed into both Elasticsearch (short-term) and HDFS (long-term) data stores
- Deployed a clustered Kafka environment and created topics
- Developed mappings to extract data from different sources, such as OS, database, middleware and custom application logs, and load it into the data warehouse
- Developed a Flume job to pull data from the Kafka queue and dump it into HDFS storage
- Developed HiveQL to load data in real time from HDFS into Hive (external and internal) tables
- Developed HQL queries for transformations, joins and aggregations on the log data to make it a better fit for the BI tool
- Used Spark to load log data from HDFS into RDDs for further data analytics
- Used various Spark interpreters to write Scala-based transformations and actions on the JSON log data (illustrative sketch at the end of this role)
- Used Sqoop to pull data from Oracle into HDFS storage
- Extensively worked on Hive tables, partitions and buckets for analyzing large volumes of data
- Developed business intelligence dashboards and reports for the key performance indicators
- Interacted regularly with end users to resolve issues pertaining to the reports, data enrichment and cleansing
- Involved in the continuous enhancements and fixing of production issues.
- Developed the process to onboard different groups across the bank onto this project
- Led a team of 4-5 people on the project, mentoring and educating them on best practices
- Planned the RBAC model for the project for correct authentication and authorization
- Created System Build Guides for the Production deployments
- Worked in Agile and Scrum environment
Technical Environment: HDP 2.4, HDFS, Spark, Scala, Hive, Pig, Kafka, Java, Logstash 2.2, Elasticsearch 2.2.1, Kibana 4.4, Shield, Marvel, Watcher, Beats, Filebeat, Topbeat, Docker, CentOS, RedHat 6, BitBucket, Jira, Confluence, Microsoft Sharepoint, ZooKeeper, JDBC
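Illustrative sketch (assumed names, not project code): the kind of Scala transformation written against the JSON log data in Spark; the log fields (service, level, @timestamp) and paths are hypothetical placeholders.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, to_date}

// Hypothetical sketch: load JSON application logs from HDFS and aggregate daily
// error counts per service for the BI dashboards. Fields and paths are placeholders.
object LogErrorSummary {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("log-error-summary").getOrCreate()

    val logs = spark.read.json("hdfs:///data/logs/applications/*.json")

    val errorsPerDay = logs
      .filter(col("level") === "ERROR")
      .groupBy(col("service"), to_date(col("@timestamp")).as("log_date"))
      .count()

    // Persist the summary so Hive and the BI layer can pick it up downstream.
    errorsPerDay.write.mode("overwrite").parquet("hdfs:///data/analytics/log_error_summary")

    spark.stop()
  }
}
```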
Confidential
Big Data Architect/Engineer
Responsibilities:
- Architected the whole solution and data pipeline for the project
- Deployed the BI and ETL system with the ELK Stack for decision making
- Implemented the data collection, cleansing, enrichment, transformation, QA and loading pipeline
- Developed mappings to extract data from different sources, such as Twitter, Oracle, MySQL, CSV/flat files and logs, and load it into the Elasticsearch and Hadoop backends
- Developed Kafka topics and fetched the event logs into Logstash (illustrative sketch at the end of this role)
- Developed HiveQL to load data in real time from Elasticsearch into Hive (external and internal) tables
- Developed HQL queries for transformations, joins and aggregations on the data
- Developed histograms, line charts, drill-through, master-detail and complex reports involving multiple filters, as well as multi-page, multi-query reports against multiple databases
- Developed BI dashboards and reports for the key performance indicators in Kibana
- Developed REST-based APIs for different museum exhibits using the Groovy/Grails framework
- Interacted regularly with end users to resolve issues pertaining to the reports
- Modified existing reports based on user change requests
- Designed conceptual, logical and physical data models and developed database schemas supporting API development for the exhibits and the mobile program
- Implemented various performance-tuning techniques on mappings and database queries
- Developed and tested database backup and restore plans in coordination with IT
- Ensured that backup and recovery procedures functioned correctly
- Created UNIX shell scripts to automate jobs for local and remote deployments, database sanitization, and production database backups and restores
- Involved in the continuous enhancements and fixing of production problems
- Wrote database and analytics documentation
- Controlled access permissions and privileges for different applications across the organization
- Documented database design and usage and educated other team members on both
- Provided support for application development teams, including mentoring on best practices for database usage and participating in code walkthroughs
Technical Environment: Java, Groovy on Grails, HDP 2.3, Kafka, Hive, ZooKeeper, Logstash 2.1, Elasticsearch 2.1, Kibana 4.2, Beats, Filebeat, MySQL Server 5.6, MySQL Designer, TOAD, SQL*Loader, Tomcat 6, CentOS, RedHat 6, Gitlab, Redmine, Basecamp
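Illustrative sketch (assumed names, not project code): publishing an exhibit event to a Kafka topic with the Java producer API from Scala, the topic from which Logstash consumed the event logs; the broker, topic and payload are hypothetical.

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

// Hypothetical sketch: publish an exhibit interaction event (JSON string) to a Kafka
// topic, from which Logstash consumes into Elasticsearch and Hive. Names are placeholders.
object ExhibitEventProducer {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "kafka1.example.com:9092")
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

    val producer = new KafkaProducer[String, String](props)
    val event = """{"exhibit_id":"dinosaur-hall","action":"button_press","ts":"2015-10-01T14:03:22Z"}"""

    // Key by exhibit so events for one exhibit stay ordered within a partition.
    producer.send(new ProducerRecord[String, String]("exhibit-events", "dinosaur-hall", event))
    producer.flush()
    producer.close()
  }
}
```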
Confidential
Analytics/BI Developer
Responsibilities:
- Architected the whole Business Intelligence/ETL solution for mobile video games, tracking gameplay metrics for all genres of games and visualizing them in real-time dashboards
- Implemented data collection and loading, data QA, cleansing, enrichment and transformation
- Designed and developed the REST-based data collection API, delivered as a plugin that embeds easily into a game and records both online and offline game metrics
- Developed ETL mappings and loaded data into the data warehouse
- Developed mappings to extract data from different sources, such as MySQL and logs, and load it into the target
- Designed and managed the backend that stores the processed data in Redis for collecting and aggregating stats (illustrative sketch at the end of this role)
- Developed visualizations and reports for the real-time gameplay metrics and aggregations
- Developed BI dashboards for the key performance indicators
- Developed histograms, line charts, pie charts, drill-through, master-detail and complex reports involving multiple filters, as well as multi-page, multi-query reports against multiple databases; used filters for efficient data retrieval
- Created UNIX shell scripts to automate jobs for local and remote deployments and production database backups and restores
- Assisted in designing Logical and Physical Data Models for the game
- Involved in the continuous enhancements and fixing of production problems
- Documented database design and usage and educated other team members on both
- Wrote database and analytics documentation
Technical Environment: MySQL 5.1, ERWin, Redis DB, C# .NET, Ruby on Rails, Visual Studio 2010, Basecamp, Git
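Illustrative sketch (assumed names, not the project's C#/Ruby code): the Redis counter-aggregation pattern used for gameplay stats, shown with the Jedis client; the key and field names are hypothetical placeholders.

```scala
import redis.clients.jedis.Jedis

// Hypothetical sketch: aggregate per-day gameplay counters in Redis hashes as events
// arrive. Key and field names are placeholders, not the game's actual schema.
object GameplayStats {
  def recordSession(redis: Jedis, gameId: String, day: String, levelReached: Int): Unit = {
    val key = s"stats:$gameId:$day"
    redis.hincrBy(key, "sessions", 1)             // total sessions for the day
    redis.hincrBy(key, s"level:$levelReached", 1) // sessions that reached this level
    redis.expire(key, 90 * 24 * 3600)             // keep roughly 90 days of daily stats
  }

  def main(args: Array[String]): Unit = {
    val redis = new Jedis("localhost", 6379)
    recordSession(redis, gameId = "puzzle-quest", day = "2014-06-01", levelReached = 12)
    println(redis.hgetAll("stats:puzzle-quest:2014-06-01"))
    redis.close()
  }
}
```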
Confidential
Database Developer
Responsibilities:
- Heavily involved in conceptual and physical database modeling, data restoration, maintenance, backups, security and troubleshooting of performance issues
- Worked with application developers and other database administration staff to research, define and correct application data-related issues
- Heavily involved in gathering user requirements, analysis, design, coding, testing and implementation
- Created database objects such as tables, indexes, views, materialized views, procedures and packages
- Involved in the continuous enhancements and fixing of production problems
- Bug fixing for existing web applications
Technical Environment: Oracle 10g, MSSQL Server 2012, MySQL 5.1, MySQL Workbench, MySQL Designer, TOAD, SQL*Loader, SQL Developer, SQL*Plus, CentOS
Confidential
Database Developer
Responsibilities:
- Involved in database modelling, development, query optimization and administration
- Automated different weekly and monthly reports using UNIX shell and PHP scripting
- Designed logical and physical data models using Erwin
- Wrote sequences for automatic generation of unique keys to support primary and foreign key constraints in data conversions
- Used SQL*Loader to load data from flat files received daily from various facilities
- Created database objects such as tables, views, materialized views, procedures and packages using Oracle tools such as Toad and SQL*Plus
- Created indexes on tables to improve performance by eliminating full table scans, and created views to hide the underlying tables and reduce the complexity of large queries
- Involved in the continuous enhancements and fixing of production problems
Technical Environment: Oracle 10g, MSSQL Server 2008, TOAD, SQL*Loader, SQL Developer, SQL*Plus, ERWin, VB .NET, PHP, CentOS