
Sr. Big Data Architect Resume


Mountain View, CA

SUMMARY

  • Experience: Over 18 years of IT experience delivering industry business solutions for clients' EDW/BI requirements in roles such as Sr. EA/Data/DW/Big Data Architect, Lead, and Manager, covering data architecture, ETL, data quality, Master Data Management (MDM), big data, machine learning, and data science.
  • Core Competencies: Managing technical resources, delivering projects on time and within budget, people management and mentoring of direct reports, strong organizational skills, and identifying opportunities for more innovative solutions across large-scale enterprise data warehouse design, development, implementation, and support; strong technical problem-solving skills that drive continuous productivity in fast-paced, hyper-growth environments to accomplish project deliverables.
  • Clients: Confidential, Confidential / Confidential, NetApp, Adobe Systems, ZillionTV, Confidential / Confidential, Cendant Hotels, Ross, Cisco, Franklin Templeton, Santa Clara County
  • Industries: Health Care, Insurance, Finance, Hospitality, Retail, Auction, Entertainment, Software, and Internet
  • Hands-on Experience:
  • Project and resource planning, managing resources, scheduling, executing, and implementing analytical reporting platforms in the cloud, along with risk management
  • Providing cloud reporting platform solutions, EDW architectural solutions, snowflake and star (dimensional) schema design (see the sketch after this list), ETL, data quality, MDM, building 360-degree business views for IT, and handling very large data volumes from ODS to the EDW
  • Tuning architecture (system, logical, and physical models) and ETL code for better performance
  • Working closely with business users, DBAs, developers, and managers to accomplish cross-functional tasks
  • Using Confidential technologies such as cloud platform storage, SQL, BigQuery, Dremel, Pantheon, Clarinet, and Tableau, plus the Hadoop ecosystem (HDFS, HBase, Cassandra 2.0, MapReduce, Pig, Hive, Avro, Sqoop, ZooKeeper, Oozie), to build the unstructured big data warehouse for machine logs and clickstream data
  • Knowledge of Hibernate, Spring, and Scala on the Java stack, as well as Tomcat and Apache servers
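
A minimal sketch of the star (dimensional) schema design mentioned above, expressed as SQLite DDL driven from Python purely for illustration; the fact and dimension table names and columns are assumptions, not any client's actual model.

# Hypothetical star-schema sketch: one additive fact table keyed to conformed dimensions.
# Table and column names are illustrative only.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date     (date_key INTEGER PRIMARY KEY, calendar_date TEXT, fiscal_period TEXT);
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, customer_id TEXT, segment TEXT);
CREATE TABLE dim_product  (product_key INTEGER PRIMARY KEY, sku TEXT, category TEXT);

-- Fact grain: one row per order line; measures are additive across all dimensions.
CREATE TABLE fact_sales (
    date_key     INTEGER REFERENCES dim_date(date_key),
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    product_key  INTEGER REFERENCES dim_product(product_key),
    quantity     INTEGER,
    net_amount   REAL
);
""")
conn.close()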

TECHNICAL SKILLS

Management Tools: MS Project, PowerPoint, Excel, Word, mind-mapping tools

Data Science: Statistical inference (distributions, confidence intervals, hypothesis testing, group comparisons)

Machine Learning Algorithms: Regression models (linear, gradient descent), pattern matching, neural networks

Distributed Computing: Apache Hadoop 0.20/1.1.1/2.0 (HDFS, MapReduce), Pig 0.10.0, Hive 0.9.0, Cloudera CDH3 & CDH4, Hortonworks 0.20.x/1.1.1

Streaming: Flume (log ingestion), Sqoop (relational DB ingestion), Storm

Reporting Tools: Tableau 8/9.x, OBIEE 10.x, MicroStrategy 7.0, Cognos 6.x, Business Objects 5.x, Crystal Reports

Scheduler Tools: Confidential Clarinet, BMC Control-M Enterprise, $Universe 3.0, AppWorx 5.0

ETL: Informatica PowerCenter 8.x/7.x/6.x/5.x/4.x, RETL, RFX (10.3.3, 11.1), DataStage 7.0, OWB 9.0, Pentaho

Programming Languages: Java, Python, J2EE, Visual Basic 6.X/5.X/4.X, C, C++, HTML

Scripting Languages: Perl, Shell (Korn, Bash), JavaScript

IDE Tool: Eclipse SDK Luna (4.4.2)

Data Warehousing Products: Retek RDW 10.2, Retek DWI, and SAP BW

ERP: SAP NetWeaver 7.0 R/3 & BW, Oracle Applications 11i, SAP CRM

Data Modeling Tools: PowerDesigner 15.x, Erwin 3.5.2/4.0/7.1, Designer 2000, ER/Studio 9.0, Microsoft Visio

DBA Tools: OEM 2.0, Tuning Pack and Performance Manager, DBArtisan, TOAD.

DB Utilities: SQL*Loader, Import, Export, OEM 2.0 Tuning, Capacity Planning

DB Tools: PL/SQL, SQL*Plus, TOAD, SQL Navigator Pro, SQL*Worksheet

E-Business: Oracle Applications 11i (ERP, CRM, HRMS)

Web Servers: BEA WebLogic 5.1/6.0, IIS 4.0

Data Quality Tools: Trillium 5.x/6.x/7.x/11.x, Firstlogic 5.0

Network Tech: Server/client configuration using SQL*Net and Net8

Relational (Row-Store) Databases: Oracle 12c, 11g, 10g, 9i (9.2), 8i (8.1.7, 8.1.6), 8.x, 7.x (7.3, 7.1), Teradata V2R5.1/V2R6.0, Greenplum 3.1, MS SQL Server 2005

Column-Store Databases: Confidential BigQuery, HBase

NoSQL Databases: Confidential Bigtable, Cassandra 2.0

Distributed File Systems: Confidential CNS, GFS, Hadoop HDFS

Operating Systems: Windows NT 4.0/2000 Server and Workstation, Windows 95/98/2000/XP, Sun SPARC Solaris 8, HP-UX 11, SCO Unix, Red Hat Linux; NCR 5500H (2 Xeon 2.66 GHz CPUs, 16 GB RAM, 64 nodes, 1152 AMPs, 1 PB RAID 1 disk array)

Hardware: Sun E6500/E5500/E4500 with 4-8 CPUs, HP V2600, HP Vectra, DRS 6000, Compaq ProLiant DL580, Compaq ProLiant 400, Confidential Borg

PROFESSIONAL EXPERIENCE

Sr. Big Data Architect

Confidential, Mountain View, CA

Responsibilities:

  • Manage a team, 120+ engagements, and more than $2 million worth of Tableau Desktop licenses; resolve conflicts and issues in a dynamic agile environment; run weekly team meetings as well as in-box and out-of-box Corp-Engineering Reporting Platform (CERP) solution review meetings.
  • Define CERP SDLC phases and processes; review reference architecture, processes, and other artifacts; facilitate the rollout of the CERP self-service reporting platform (PaaS) for business functional verticals such as Enterprise Resources (Global Business - Sales and Marketing, Supply Chain, Finance, HR, Legal, Treasury) and Science and Technologies (YouTube, Netlap, gFiber, Technical Infrastructure, Confidential+, Android)
  • Mentor users and provide analytical solutions; work with various teams such as SRE, CE Data Ops, Cloud Platform teams (Helix, BigQuery), BizApps, and CE-Engineering to stabilize the CERP platform
  • Facilitate the integration of R with the CERP platform to provide machine learning (regression, pattern matching) and data science solutions for business use cases such as pricing, predictive modeling, and statistical inference.
  • Regulated data privacy, protection, and security clearance for user data on the CERP platform, and integrated groups (Confidential, LDAP, Autorole, and AD) to manage CERP stack components.
  • Design ETL pipelines and Tableau dashboards to forecast engagement growth, license distribution, and sales.
  • Automated several processes, such as Tableau license data extracts and Tableau Server upgrade user test cases, using Java, Python, Selenium, and Eclipse
  • Utilize Confidential cloud storage, SQL, BigQuery, Dremel/SQL, compute, analytics, file systems (CNS, GFS), and Borg software infrastructure and machine (Borglet cell) resources to deliver reporting solutions for various business use cases (see the sketch below).
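
A minimal sketch of the kind of BigQuery-backed reporting pull referenced in the bullet above, using the google-cloud-bigquery Python client; the dataset, table, and column names (corp_reporting.cerp_engagements, business_vertical, engagement_id, tableau_license_key) are illustrative assumptions, not actual CERP objects.

# Hypothetical sketch only: dataset, table, and column names are assumed.
from google.cloud import bigquery

client = bigquery.Client()  # uses application-default credentials

sql = """
    SELECT business_vertical,
           COUNT(DISTINCT engagement_id)       AS engagements,
           COUNT(DISTINCT tableau_license_key) AS licenses
    FROM `corp_reporting.cerp_engagements`
    WHERE snapshot_date = CURRENT_DATE()
    GROUP BY business_vertical
    ORDER BY engagements DESC
"""

# One summary row per business vertical, e.g. feeding a license-distribution dashboard extract.
for row in client.query(sql).result():
    print(f"{row.business_vertical}: {row.engagements} engagements, {row.licenses} licenses")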

Confidential

Responsibilities:

  • As a manager, my responsibilities include IT analysis and analytical thinking, business acumen, change management, decision making, prioritization frameworks, enterprise perspective, strategic alignment, and team effectiveness
  • Cost management, planning and prioritization, PM process knowledge, risk identification and management, time and productivity management, and business networking
  • Provide the architectural road map to build the next generation near-real-time data warehouse for Confidential Inc.
  • Coaching Peers, Conflict Management, Cultural Adaptability, Virtual Remote Teaming
  • Provide road maps for project plans and budgets ranging from $500K to more than $5 million; build technical teams of around 15 to 20 onsite and offshore resources (DBAs, BAs, ETL developers, BI reporting developers, and QA) in matrix, cross-functional structures to accomplish project deliverables on time.
  • Extensively involved in building the talent pool (team) and in project and resource planning, execution, implementation, and delivery of enterprise data warehouse projects to business stakeholders using various SDLC methodologies (Agile, Scrum, Waterfall).
  • Provide the road map for EDW architectural methodology, data reconciliation solution design, source acquisition design processes from various source systems, and business intelligence solutions.
  • Implementation of the following data marts: HR, Sales, Finance, CRM, AutoSupport, Weblog, and Clickstream
  • Guided logical and physical data modeling of the EDW, forward and reverse engineering of schemas, volumetric analysis and projections for the EDW schema, and design of ER diagrams for all subject areas using Erwin, plus UML diagrams for workflow design.
  • Used the Hadoop ecosystem (HDFS, MapReduce, HBase, ZooKeeper, Pig, Hive, Avro, Cassandra, Flume, Sqoop) to build the unstructured big data warehouse (AutoSupport, web logs, clickstream) for data mining, word searching, and pattern matching for predictive analytical models for the following clients: Confidential Corp, NetApp, Adobe, and Confidential (see the sketch after this list).
  • Coordinate with business users, DBAs, developers, managers, and directors to accomplish cross-functional tasks.
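
A minimal sketch of the pattern-matching step referenced in the Hadoop bullet above, written as a Hadoop Streaming mapper in Python and paired with Streaming's built-in aggregate reducer; the log patterns are illustrative assumptions rather than the actual AutoSupport or clickstream rules.

#!/usr/bin/env python
# mapper.py - hypothetical Hadoop Streaming mapper that counts pattern hits in raw logs.
# Illustrative run: hadoop jar hadoop-streaming.jar -input /logs -output /counts \
#                   -mapper mapper.py -reducer aggregate
import re
import sys

# Illustrative patterns only; the real predictive-model features are not specified here.
PATTERNS = {
    "error": re.compile(r"\bERROR\b"),
    "login": re.compile(r"/account/login"),
    "checkout": re.compile(r"/cart/checkout"),
}

for line in sys.stdin:
    for name, pattern in PATTERNS.items():
        if pattern.search(line):
            # "LongValueSum:" keys are summed by the Streaming aggregate reducer package.
            sys.stdout.write("LongValueSum:" + name + "\t1\n")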

Principal EDW Architect - Data, ETL, Data Quality

Confidential

Responsibilities:

  • Gather business requirements from BAs and business users and convert them into conceptual, logical, and physical models of star or snowflake schemas for several data marts such as SAR, Fraud, Customer Service, Seller Resumption, Customer First Event, Risk (Acceptance User Policy, Machine Identification, Advanced Risk Science), and Call Credit.
  • Provided end-to-end, solution-oriented ETL architecture for the clickstream data warehouse.
  • Initial study, requirements gathering, analysis, design, and development of the QC process for data warehouse data integrity, plus redesign of ETL batches to standards.
  • Established standards, guidelines, and data quality rules for ETL (heterogeneous source acquisition, stage loads), the quality control process, and so on.
  • Developed Trillium batch projects for 170 countries including the US, CA, UK, DE, FR, IT, MX, AU, etc.
  • Designed a universal cleansing adaptor with router, parser, geocoder, and re-constructor for integrating all international countries in Informatica, plus mapplets, mappings, worklets, workflows, and AEPs (Advanced External Procedures) for winkey, parser/geocoder, and matcher for consumer profiles (name and address data).
  • Experience with extract, transform, and load (ETL) coding and master data management for all data marts, as well as best practices for common ETL design techniques such as change data capture, key generation, and optimization for several data marts such as SAR, Fraud, Customer Service, Seller Resumption, Customer First Event, Risk (Acceptance User Policy), Machine Identification, and Advanced Risk Science.
  • Used Teradata FastLoad, FastExport, TPump, and MultiLoad utilities to bulk-load data into tables and export data from tables (see the sketch after this list).
  • Performance tuning of SQL and ETL processes, and system performance monitoring (ETL servers, Teradata, Oracle)
  • Monitored operating system behavior in terms of CPU usage, disk space, swap space, and paging using various UNIX utilities such as sar, vmstat, and top.
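
A minimal sketch of the kind of Teradata bulk load mentioned above, generating a FastLoad control script and invoking the fastload command-line utility from Python; the host, credentials, staging table, and file layout are placeholder assumptions.

# Hypothetical sketch: bulk-load a pipe-delimited extract into a Teradata staging table.
# Host, credentials, table, and file names below are placeholders.
import subprocess

FASTLOAD_SCRIPT = """
LOGON tdprod/etl_user,etl_password;
DROP TABLE stage.sar_events_err1;
DROP TABLE stage.sar_events_err2;
SET RECORD VARTEXT "|";
DEFINE event_id (VARCHAR(20)),
       event_ts (VARCHAR(26)),
       amount   (VARCHAR(18))
FILE = /data/extracts/sar_events.dat;
BEGIN LOADING stage.sar_events
    ERRORFILES stage.sar_events_err1, stage.sar_events_err2;
INSERT INTO stage.sar_events (event_id, event_ts, amount)
VALUES (:event_id, :event_ts, :amount);
END LOADING;
LOGOFF;
"""

# The fastload utility reads its control script from standard input.
result = subprocess.run(["fastload"], input=FASTLOAD_SCRIPT, text=True)
result.check_returncode()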

Oracle DBA, Lead Informatica Admin, Data Modeler, Trillium Consultant

Confidential, San Jose, CA

Responsibilities:

  • Data modeling for several data marts such as Funds, Fraud, and Risk using Oracle Designer 2000.
  • Installation, configuration, upgrade from 8.x to 9.x (REPO, AP), backup, recovery, and cloning of Oracle databases.
  • Installation, configuration, upgrade from 5.1 to 6.2, backup, recovery, folder and mapping migration across DEV, QA, and PROD environments, and user, security, and permission management for Informatica 5.1 and 6.2.
  • Developed complex source extraction mappings for pre-staging, staging, and core dimension and fact population; these extract ODS data from various sources such as Oracle, SQL Server, flat files, and XML across several data marts and load it into the GWS EDW pre-staging area.
  • Created PL/SQL stored procedures, triggers, materialized views, and packages for the EDW.
  • Created sessions, batches, and email variables through Informatica Server Manager.
  • 24x7 on-call support of databases (7.x, 8.x, 8i, 9i) on Sun Solaris with HA Veritas Cluster Server and Quick I/O for raw devices, plus production Informatica (ETL) loads; performance tuning involving SQL, memory, and disk I/O.
  • Wrote shell scripts for monitoring CPU usage, memory, disk space, swap space, and paging using various UNIX utilities such as sar, vmstat, and top, automating analysis, statistical data gathering, and clearing of alert logs and trace dump files, with cron jobs scheduled for all of these (see the sketch below).
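
A minimal sketch of the kind of OS-health monitoring described in the bullet above, rewritten in Python for illustration; the log path, disk-usage threshold, and cron schedule are assumptions rather than the original scripts' values.

#!/usr/bin/env python
# Hypothetical health-check sketch; the log path and threshold below are assumed values.
# Illustrative cron entry: */15 * * * * /usr/local/bin/dw_health.py
import shutil
import subprocess
from datetime import datetime

LOG_PATH = "/var/log/dw_health.log"   # assumed location
DISK_WARN_PCT = 90                    # assumed alert threshold

def main():
    lines = ["--- " + datetime.now().isoformat() + " ---"]

    # One vmstat sample covers CPU, memory, swap, and paging activity.
    vmstat = subprocess.run(["vmstat", "1", "2"], capture_output=True, text=True)
    lines.append(vmstat.stdout.strip())

    # Flag the root filesystem when usage crosses the threshold (stand-in for parsing df -k).
    usage = shutil.disk_usage("/")
    pct = usage.used * 100 // usage.total
    if pct >= DISK_WARN_PCT:
        lines.append("WARNING: / is %d%% full" % pct)

    with open(LOG_PATH, "a") as fh:
        fh.write("\n".join(lines) + "\n")

if __name__ == "__main__":
    main()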
