Big Data Senior Analyst/Architect Resume
Los Angeles
SUMMARY
- 12 years of professional IT experience, including 4 years of work experience in Big Data/Hadoop solution architecture, design, development, and ecosystem analytics in the banking and financial sectors
- Excellent understanding and knowledge of Big Data and Hadoop architecture
- Hands-on experience with major components of the Hadoop ecosystem, including Spark, MapReduce, HDFS, Hive, Pig, HBase, Sqoop, Oozie, Flume, Kafka, and Tableau
- Strong knowledge of the MapReduce programming model for analyzing data, implementing custom business logic, and performing join optimization as well as secondary and custom sorting
- Hands-on experience writing complex Hive/SQL queries and Pig scripts for data analysis to meet business requirements
- Excellent understanding of NoSQL databases like HBase, Cassandra and MongoDB
- Experience importing and exporting data using Sqoop between HDFS (Hive & HBase) and relational database systems
- Worked on Spark SQL and Spark Streaming, and involved in Spark performance optimizations
- Extensive hands-on experience with Informatica PowerCenter 9.x/8.x/7.x (Repository Manager, Mapping Designer, Workflow Manager, Workflow Monitor) and SAS EG
- Experience with SAP BusinessObjects reporting tools (IDT, UDT, Interactive Analysis, BI Launch Pad/InfoView, CMC, CCM, Dashboards, Lumira)
- Certified in and strong experience with the Netezza MPP database, its architecture, and underlying data distribution
- Experience using different file formats such as Avro, Parquet, RCFile, SequenceFile, and CSV
- Worked on performance tuning, identifying and resolving performance bottlenecks at various levels
- Able to work collaboratively and quickly understand system architecture and business needs while remaining committed to achieving corporate goals
- Experienced in Agile/Lean and PMI project management methodologies
TECHNICAL SKILLS
Big Data/Hadoop: HDFS, Hive, Pig, Sqoop, HBase, Oozie, Spark
ETL Tools: Informatica, BODS, SSIS, SAS EG
BI Tools: SAP BusinessObjects, QlikView, Tableau
Database: Netezza, Oracle, DB2, SQL Server, MySQL, MS Access
Operating Systems: UNIX (AIX/Solaris), Linux, Windows 2008, AS400
Language: NZSQL, PL/SQL, T-SQL, Shell Script, Java, Scala
Database Tools: SQL Developer, SQL Data Modeler, SSMS, TOAD
Testing Tools: HP Quality Center, Jira, Remedy
Methodologies: PMP, Agile
PROFESSIONAL EXPERIENCE
Confidential, Los Angeles
BigData Senior Analyst/Architect
Responsibilities:
- Gathered business requirements in meetings to support successful implementation and the move to production
- Analyzed business requirements and converted them into technical documentation that the development team could easily understand
- Designed and developed various modules in the Hadoop Big Data platform, processing data using Spark, HBase, Sqoop, and Hive
- Implemented a POC to migrate MapReduce jobs to Spark RDD transformations using Scala (sketched after this role's environment line)
- Ingested large volumes of clickstream data into Hadoop using the Parquet storage format
- Performed advanced procedures such as text analytics and processing using Spark's in-memory computing capabilities
- Developed Spark scripts in Scala using the Spark shell
- Loaded files into Hive and HDFS from Oracle and SQL Server using Sqoop
- Worked with the Oozie workflow manager to schedule Hadoop jobs, including highly intensive jobs
- Extensively used HiveQL to query data in Hive tables and loaded data into Hive tables
- Worked closely on setting up different environments and updating configurations
- Introduced Tableau visualizations on top of Hadoop data to produce reports for the business and BI teams
- Attended daily Scrum meetings with the team for status updates and the action plan for the day
ENVIRONMENT: Hadoop, HDFS, HBase, Spark, Spark Streaming, Spark SQL, Pig, Hive, Sqoop, Tableau, Linux, Oracle, Shell Script, Scala
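To make the Spark work in this role concrete, the following is a minimal Scala sketch of a MapReduce-style count rewritten as Spark RDD transformations, plus a Spark SQL query over Parquet clickstream data written back to a Hive table. It is an illustration rather than the project's actual code; the HDFS paths, field positions, and table/column names are assumptions.

    import org.apache.spark.sql.SparkSession

    object ClickstreamCounts {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("ClickstreamCounts")
          .enableHiveSupport() // lets Spark SQL read and write Hive tables
          .getOrCreate()

        // MapReduce-style page-view count expressed as Spark RDD transformations;
        // the input path and tab-separated field layout are assumptions.
        spark.sparkContext
          .textFile("hdfs:///data/clickstream/raw/")
          .map(_.split("\t"))
          .filter(_.length > 2)
          .map(fields => (fields(2), 1L)) // key on the page URL field
          .reduceByKey(_ + _)             // replaces the MapReduce reduce phase
          .saveAsTextFile("hdfs:///data/clickstream/page_counts/")

        // Clickstream data ingested as Parquet can be queried through Spark SQL
        // and written to a Hive table for downstream Tableau reporting.
        spark.read.parquet("hdfs:///data/clickstream/parquet/")
          .createOrReplaceTempView("clicks")
        spark.sql("SELECT event_date, COUNT(*) AS hits FROM clicks GROUP BY event_date")
          .write.mode("overwrite").saveAsTable("analytics.daily_clicks")

        spark.stop()
      }
    }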
Confidential, Los Angeles
BigData Senior Analyst/Architect
Responsibilities:
- Involved in the design and development of various modules in the Hadoop Big Data platform, processing data using MapReduce, Hive, Pig, Sqoop, and Oozie
- Designed and implemented Java MapReduce programs to support distributed data processing (sketched after this role's environment line)
- Imported and exported data between HDFS and Oracle using Sqoop
- Developed Hive scripts to handle Avro and JSON data in Hive using Hive SerDes
- Automated all jobs that pull data from the FTP server and load it into Hive tables, using Oozie workflows
- Used Oozie to automate data loading into the Hadoop Distributed File System
- Involved in implementing the Hadoop cluster for the development and test environments
- Used different file formats such as text files, SequenceFiles, and Avro
- Assisted in creating and maintaining technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts
ENVIRONMENT: Hadoop, MapReduce, Pig, Hive, Sqoop, Linux, Oracle, Shell Script
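As an illustration of the MapReduce work above (which the role describes in Java), here is a minimal sketch against the standard Hadoop MapReduce API, written in Scala to keep a single language across the sketches in this document. The input layout, field positions, and class names are assumptions; the input is assumed to have landed in HDFS via Sqoop.

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.Path
    import org.apache.hadoop.io.{IntWritable, LongWritable, Text}
    import org.apache.hadoop.mapreduce.{Job, Mapper, Reducer}
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
    import scala.jdk.CollectionConverters._

    // Mapper: emits (customer_id, 1) for each comma-separated input record;
    // the field layout is an assumption.
    class CustomerMapper extends Mapper[LongWritable, Text, Text, IntWritable] {
      private val one = new IntWritable(1)
      private val outKey = new Text()
      override def map(key: LongWritable, value: Text,
                       ctx: Mapper[LongWritable, Text, Text, IntWritable]#Context): Unit = {
        val fields = value.toString.split(",")
        if (fields.nonEmpty) { outKey.set(fields(0)); ctx.write(outKey, one) }
      }
    }

    // Reducer: sums the counts per key.
    class SumReducer extends Reducer[Text, IntWritable, Text, IntWritable] {
      override def reduce(key: Text, values: java.lang.Iterable[IntWritable],
                          ctx: Reducer[Text, IntWritable, Text, IntWritable]#Context): Unit =
        ctx.write(key, new IntWritable(values.asScala.map(_.get).sum))
    }

    object RecordCountDriver {
      def main(args: Array[String]): Unit = {
        val job = Job.getInstance(new Configuration(), "record-count")
        job.setJarByClass(classOf[CustomerMapper])
        job.setMapperClass(classOf[CustomerMapper])
        job.setReducerClass(classOf[SumReducer])
        job.setOutputKeyClass(classOf[Text])
        job.setOutputValueClass(classOf[IntWritable])
        FileInputFormat.addInputPath(job, new Path(args(0)))   // e.g. data Sqoop-imported from Oracle
        FileOutputFormat.setOutputPath(job, new Path(args(1)))
        System.exit(if (job.waitForCompletion(true)) 0 else 1)
      }
    }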
Confidential
Senior Project Associate
Responsibilities:
- Participated in planning sessions with offshore technical team for resource allocation and capacity planning
- The project requirement was to build a scalable and reliable Big Data platform that would be a core component for forthcoming business needs
- Designed & implemented MapReduce programs to support distributed data processing
- Imported and exported data between HDFS and Netezza using Sqoop
- Developed UDFs for Hive and wrote complex Hive queries for data analysis (a UDF sketch follows this role's environment line)
- Involved in loading and transforming large sets of structured, semi-structured, and unstructured data, and analyzed them by running Hive queries
- Loaded and transformed large sets of structured and semi-structured data using Pig scripts
ENVIRONMENT: Hadoop, MapReduce, HBase, Hive, Pig, Sqoop, Netezza
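The Hive UDF work above can be illustrated with a minimal sketch of a simple UDF. Hive UDFs are most often written in Java; this version uses the same org.apache.hadoop.hive.ql.exec.UDF API from Scala to stay consistent with the other sketches, and the class name and masking rule are hypothetical. Hive locates the evaluate() method by reflection.

    import org.apache.hadoop.hive.ql.exec.UDF
    import org.apache.hadoop.io.Text

    // Simple Hive UDF that masks all but the last four characters of an account number.
    class MaskAccount extends UDF {
      def evaluate(input: Text): Text = {
        if (input == null) return null
        val s = input.toString
        val masked =
          if (s.length <= 4) s
          else "*" * (s.length - 4) + s.takeRight(4)
        new Text(masked)
      }
    }

    // Registered and used from Hive roughly like:
    //   ADD JAR hdfs:///udfs/mask-account.jar;
    //   CREATE TEMPORARY FUNCTION mask_account AS 'MaskAccount';
    //   SELECT mask_account(account_no), SUM(amount) FROM transactions GROUP BY mask_account(account_no);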
Confidential
Senior Project Associate
Responsibilities:
- Interacted with various internal functional teams and team members to analyze business requirements and to outline the initial setup architecture and reporting solution
- Involved in Dimensional modeling (Star Schema) of the Data warehouse and created conceptual, logical, physical data models
- Involved in building the ETL architecture and Source to Target mapping to load data into Data warehouse
- Created technical ETL mapping documents to outline data flow from sources to targets
- Extracted data from flat files and SQL Server databases into the staging area and populated the data warehouse
- Reviewed existing code and led efforts to tweak and tune the performance of existing Informatica processes
- Extensively used SQL*Loader to load data from flat files into database tables in Oracle
- Built new universes as per user requirements by identifying the required tables from the data warehouse and defining the universe connections
- Used derived tables to create the universe for best performance, and used contexts and alias tables to resolve loops in the universe
- Created complex reports presenting trading statistics per month and per year, using cascading and user objects such as measure objects and the @Aggregate_Aware function to create summarized reports
- Produced daily/weekly/monthly MAS regulatory reporting with consolidated exchange statistics for the Market Operations and Clearing & Settlement departments
ENVIRONMENT: Informatica 8.6, SAP BusinessObjects XI R3.1, Netezza, Oracle 11g, PL/SQL, SQL Server 2008, Shell/Batch/VB Script, Linux, Win 2008
Confidential
Senior Consultant
Responsibilities:
- Interacted with business analysts to identify information needs and business requirements for reports
- Involved in the design of conceptual, logical, and physical data models of the data warehouse based on star schema methodology; created data models using Erwin
- Scheduling and monitoring Informatica workflows and batches
- Maintained mappings and the load process; enhanced existing mappings and developed new mappings to facilitate change requests
- Responsible for support, troubleshooting, data quality, and enhancement issues
- Debugged and performance-tuned mappings, and handled session recovery and maintenance of workflow jobs in production
- Built new universes as per user requirements by identifying the required tables from the data mart and defining the universe connections
- Developed critical reports such as drill-down, slice-and-dice, and master/detail reports for analysis
- Created complex reports using multiple data providers, synchronizing them by linking common objects
- Developed a universe user documentation guide for end-user reference
ENVIRONMENT: Informatica 7.1, Business Objects, SAS Base, SAS EG, Oracle 10g, PL/SQL, DB2, Shell Scripting, Clear Case, Clear Quest, HP Quality Center, Control-M, UNIX, AS400 and Win 2000
Confidential
Senior Consultant
Responsibilities:
- Developed complex mappings using Expression, Aggregator, Lookup, Sequence Generator, Update Strategy, Stored Procedure, and Router transformations with dynamic parameters
- Implemented slowly changing dimensions (Type I and Type II) to maintain historical data (the Type II pattern is sketched after this role's environment line)
- Scheduling and monitoring Informatica workflows and batches
- Responsible for supporting, troubleshooting, Data Quality and Enhancement Issues.
ENVIRONMENT: Informatica 7.1, Windows 2000, Oracle 9i, Erwin, Cognos Series 7 and Cognos ReportNet 1.1 MR1
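The slowly changing dimension work above was implemented in Informatica mappings. Purely to illustrate the Type II pattern itself (expire the current row, then insert a new current version), here is a hedged Scala/Spark sketch with hypothetical table and column names; it is not a representation of the Informatica implementation, only of the pattern.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object ScdType2Sketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("scd2-sketch").getOrCreate()

        // Current dimension rows and today's staged changes (hypothetical tables and columns).
        val dim   = spark.table("dw.customer_dim").alias("d")
        val stage = spark.table("stg.customer_updates").alias("s")

        // Rows whose tracked attribute changed compared with the current dimension version.
        val changed = dim
          .filter(col("d.is_current") === lit(true))
          .join(stage, col("d.customer_id") === col("s.customer_id"))
          .filter(col("d.address") =!= col("s.address"))

        // Type II step 1: expire the existing version.
        val expired = changed.select(
          col("d.customer_id"),
          col("d.address"),
          col("d.start_date"),
          current_date().as("end_date"),
          lit(false).as("is_current"))

        // Type II step 2: insert a fresh current version carrying the new attribute value.
        val newRows = changed.select(
          col("s.customer_id"),
          col("s.address"),
          current_date().as("start_date"),
          lit(null).cast("date").as("end_date"),
          lit(true).as("is_current"))

        expired.unionByName(newRows)
          .write.mode("append").saveAsTable("dw.customer_dim_changes")
        spark.stop()
      }
    }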
Confidential
Developer
Responsibilities:
- Responsible for the complete life cycle of the system
- Responsible for the client-side requirement study and preparation of system and requirement specifications
- Responsible for creating the application setup, installation, testing, and troubleshooting
ENVIRONMENT: PHP, MySQL, SQL, JavaScript, Linux, Apache
Confidential
Developer
Responsibilities:
- Cleaned up and rewrote much of the front-end code to meet validation requirements
- Provided detailed project requirements and technical specifications as well as a step-by-step map for project execution
- Actively participated in all database-related activities
ENVIRONMENT: PHP, MySQL, SQL, JavaScript, Linux, Apache