Big Data Senior Analyst/Architect Resume
Los Angeles, CA
SUMMARY
- 12 years of professional IT experience, including 4 years in Big Data/Hadoop solution architecture, design, development, and ecosystem analytics in the banking and financial sectors
- Excellent understanding and knowledge of Big Data and Hadoop architecture
- Hands-on experience with major components of the Hadoop ecosystem, including Spark, MapReduce, HDFS, Hive, Pig, HBase, Sqoop, Oozie, Flume, Kafka, and Tableau
- Strong knowledge of the MapReduce programming model for analyzing data, implementing custom business logic, and performing join optimization and secondary/custom sorting
- Hands-on experience writing complex Hive/SQL queries and Pig scripts for data analysis to meet business requirements
- Excellent understanding of NoSQL databases like HBase, Cassandra and MongoDB
- Experience importing and exporting data between HDFS (Hive and HBase) and relational database systems using Sqoop
- Worked on Spark SQL and Spark Streaming and carried out performance optimizations in Spark
- Extensive hands-on experience with Informatica PowerCenter 9.x/8.x/7.x (Repository Manager, Mapping Designer, Workflow Manager, Workflow Monitor) and SAS EG
- Experience in reporting tools: SAP BusinessObjects (IDT, UDT, Interactive Analysis, BI Launch Pad/InfoView, CMC, CCM, Dashboards, Lumira)
- Certified in and experienced with the Netezza MPP database, its architecture, and its underlying data distribution
- Experience using different file formats, including Avro, Parquet, RC, Sequence, and CSV
- Worked on performance tuning, identifying and resolving performance bottlenecks at various levels
- Able to work collaboratively and quickly understand system architecture and business needs while remaining committed to achieving corporate goals
- Experienced in Agile/Lean and PMI project management methodologies
TECHNICAL SKILLS
Big Data/Hadoop: HDFS, Hive, Pig, Sqoop, HBase, Oozie, Spark
ETL Tools: Informatica, BODS, SSIS, SAS EG
BI Tools: SAP BusinessObjects, QlikView, Tableau
Database: Netezza, Oracle, DB2, SQL Server, MySQL, MS Access
Operating Systems: UNIX (AIX/Solaris), Linux, Windows 2008, AS400
Language: NZSQL, PL/SQL, T-SQL, Shell Script, Java, Scala
Database Tools: SQL Developer, SQL Data Modeler, SSMS, TOAD
Testing Tools: HP Quality Center, Jira, Remedy
Methodologies: PMP, Agile
PROFESSIONAL EXPERIENCE
Confidential, Los Angeles
BigData Senior Analyst/Architect
Responsibilities:
- Gathered business requirements in meetings to support successful implementation and promotion to production
- Analyzed business requirements and converted them into technical documentation that the development team could readily follow
- Designed and developed various modules on the Hadoop Big Data platform and processed data using Spark, HBase, Sqoop, and Hive
- Implemented a POC to migrate MapReduce jobs to Spark RDD transformations using Scala (a minimal sketch follows this section)
- Ingested large volumes of clickstream data into Hadoop using the Parquet storage format
- Performed advanced procedures such as text analytics and processing using the in-memory computing capabilities of Spark
- Developed Spark scripts using Scala shell commands
- Loaded files into Hive and HDFS from Oracle database servers using Sqoop
- Used the Oozie workflow manager to schedule Hadoop jobs, including resource-intensive jobs
- Extensively used HiveQL to query data in Hive tables and to load data into Hive tables
- Worked closely on setting up different environments and updating configurations
- Introduced Tableau visualization on top of Hadoop to produce reports for the business and BI teams
- Held daily scrum meetings with the team to review status and the action plan for the day
ENVIRONMENT: Hadoop, HDFS, HBase, Spark, Spark Streaming, Spark SQL, Pig, Hive, Sqoop, Tableau, Linux, Oracle, Shell Script, Scala
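Below is a minimal, hypothetical sketch of the kind of MapReduce-to-Spark migration referenced above: a clickstream event count expressed as Spark RDD transformations in Scala. Paths, field positions, and names are illustrative assumptions, not details from the actual project.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical sketch: a MapReduce-style clickstream event count rewritten
// as Spark RDD transformations. Input path and column layout are assumed.
object ClickstreamEventCount {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("ClickstreamEventCount"))

    val counts = sc.textFile("hdfs:///data/clickstream/raw")  // assumed input location
      .map(_.split("\t"))
      .filter(_.length > 2)                                    // drop malformed rows
      .map(fields => (fields(2), 1L))                          // mapper equivalent: emit (eventType, 1)
      .reduceByKey(_ + _)                                      // reducer equivalent: sum counts per key

    counts.saveAsTextFile("hdfs:///data/clickstream/event_counts")
    sc.stop()
  }
}
```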
Confidential, Los Angeles
BigData Senior Analyst/Architect
Responsibilities:
- Involved in the design and development of various modules on the Hadoop Big Data platform and processed data using MapReduce, Hive, Pig, Sqoop, and Oozie
- Designed and implemented Java MapReduce programs to support distributed data processing
- Imported and exported data between HDFS and Oracle using Sqoop
- Developed Hive scripts to handle Avro and JSON data in Hive using Hive SerDes (a minimal sketch follows this section)
- Automated all jobs that pull data from the FTP server and load it into Hive tables using Oozie workflows
- Used Oozie to automate data loading into the Hadoop Distributed File System
- Involved in implementing Hadoop clusters for the development and test environments
- Used different file formats, including text files, Sequence files, and Avro
- Assisted in creating and maintaining technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts
ENVIRONMENT: Hadoop, MapReduce, Pig, Hive, Sqoop, Linux, Oracle, Shell Script
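As an illustration of the Hive SerDe work mentioned above, the sketch below declares Avro- and JSON-backed external tables. The DDL itself is plain HiveQL; it is wrapped in Scala/Spark here only to keep all code samples in this resume in one language, and the table names, paths, and schemas are assumptions.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical sketch of Hive external tables backed by SerDes for Avro and JSON data.
object SerDeTables {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SerDeTables")
      .enableHiveSupport()
      .getOrCreate()

    // Avro-backed table: STORED AS AVRO wires in the AvroSerDe
    spark.sql("""
      CREATE EXTERNAL TABLE IF NOT EXISTS stg_trades_avro (
        trade_id STRING,
        symbol   STRING,
        qty      INT
      )
      STORED AS AVRO
      LOCATION 'hdfs:///data/staging/trades_avro'
    """)

    // JSON-backed table using the HCatalog JSON SerDe
    spark.sql("""
      CREATE EXTERNAL TABLE IF NOT EXISTS stg_events_json (
        event_id STRING,
        event_ts STRING,
        payload  STRING
      )
      ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
      LOCATION 'hdfs:///data/staging/events_json'
    """)

    spark.stop()
  }
}
```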
Confidential
Senior Project Associate
Responsibilities:
- Participated in planning sessions with offshore technical team for resource allocation and capacity planning
- The project's requirement was to build a scalable, reliable Big Data platform to serve as a core component for forthcoming business needs
- Designed and implemented MapReduce programs to support distributed data processing
- Imported and exported data between HDFS and Netezza using Sqoop
- Developed UDFs for Hive and wrote complex Hive queries for data analysis (a minimal sketch follows this section)
- Involved in loading and transforming large sets of structured, semi-structured, and unstructured data, and analyzed them by running Hive queries
- Loaded and transformed large sets of structured and semi-structured data using Pig scripts
ENVIRONMENT: Hadoop, MapReduce, HBase, Hive, Pig, Sqoop, Netezza
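The Hive UDF bullet above can be illustrated with a small, hypothetical UDF written in Scala against the classic org.apache.hadoop.hive.ql.exec.UDF API; the function name and masking rule are illustrative, not taken from the project.

```scala
import org.apache.hadoop.hive.ql.exec.UDF
import org.apache.hadoop.io.Text

// Hypothetical Hive UDF: mask an account number, keeping only the last 4 characters.
// Hive locates the evaluate() method by reflection.
class MaskAccount extends UDF {
  def evaluate(account: Text): Text = {
    if (account == null) return null
    val s = account.toString
    val masked = ("*" * math.max(0, s.length - 4)) + s.takeRight(4)
    new Text(masked)
  }
}

// Once packaged into a jar, the UDF would be registered and used roughly as:
//   ADD JAR hdfs:///user/hive/udfs/mask-udf.jar;
//   CREATE TEMPORARY FUNCTION mask_account AS 'MaskAccount';
//   SELECT mask_account(account_no) FROM customer_accounts LIMIT 10;
```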
Confidential
Senior Project Associate
Responsibilities:
- Interacting with various internal functional teams and team members to analyze business requirements and to outline the initial setup architecture and reporting solution
- Involved in Dimensional modeling (Star Schema) of the Data warehouse and created conceptual, logical, physical data models
- Involved in building the ETL architecture and Source to Target mapping to load data into Data warehouse
- Created technical ETL mapping documents to outline data flow from sources to targets
- Extracted data from flat files and SQL Server databases into the staging area and populated the data warehouse
- Reviewed existing code and led efforts to tweak and tune the performance of existing Informatica processes
- Extensively used SQL*Loader to load data from flat files into Oracle database tables
- Built new universes per user requirements by identifying the required tables from the data warehouse and defining the universe connections
- Used derived tables in the universe for best performance, and used contexts and alias tables to resolve loops in the universe
- Created complex reports showing trading statistics per month and per year using cascading prompts and user objects such as measure objects, applying the @Aggregate_Aware function to build summarized reports
- Produced daily/weekly/monthly MAS regulatory reports with consolidated exchange statistics for the Market Operations and Clearing & Settlement departments
ENVIRONMENT: Informatica 8.6, SAP BusinessObjects XI R3.1, Netezza, Oracle 11g, PL/SQL, SQL Server 2008, Shell/Batch/VB Script, Linux, Windows 2008
Confidential
Senior Consultant
Responsibilities:
- Interacted with business analysts to identify information needs and business requirements for reports
- Involved in conceptual, logical, and physical data modeling of the data warehouse based on the star schema methodology; created data models using Erwin
- Scheduled and monitored Informatica workflows and batches
- Maintained mappings and the load process; enhanced existing mappings and developed new mappings to accommodate change requests
- Responsible for support, troubleshooting, data quality, and enhancement issues
- Debugged and performance-tuned mappings, and handled session recovery and maintenance of workflow jobs in production
- Built new universes per user requirements by identifying the required tables from the data mart and defining the universe connections
- Developed critical reports such as drill-down, slice-and-dice, and master/detail reports for analysis
- Created complex reports using multiple data providers, synchronizing the data providers by linking common objects
- Developed a universe user documentation guide for end users' reference
ENVIRONMENT: Informatica 7.1, Business Objects, SAS Base, SAS EG, Oracle 10g, PL/SQL, DB2, Shell Scripting, Clear Case, Clear Quest, HP Quality Center, Control-M, UNIX, AS400 and Win 2000
Confidential
Senior Consultant
Responsibilities:
- Developed complex mappings using Expression, Aggregator, Lookup, Sequence Generator, Update Strategy, Stored Procedure, and Router transformations, along with dynamic parameters
- Implemented slowly changing dimensions (Type I and Type II) to maintain historical data (a minimal sketch follows this section)
- Scheduled and monitored Informatica workflows and batches
- Responsible for support, troubleshooting, data quality, and enhancement issues
ENVIRONMENT: Informatica 7.1, Windows 2000, Oracle 9i, Erwin, Cognos Series 7, Cognos ReportNet 1.1 MR1
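Purely to illustrate the Type II slowly changing dimension pattern noted above, here is a hypothetical sketch expressed with Spark DataFrames in Scala; on this project the logic was implemented as an Informatica mapping, and all table and column names below are assumptions.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Hypothetical sketch of SCD Type II: expire the current dimension row and
// open a new versioned row when a tracked attribute changes.
object ScdType2Sketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("ScdType2Sketch").enableHiveSupport().getOrCreate()

    val dim = spark.table("dw.dim_customer")      // assumed dimension table
    val stg = spark.table("stg.customer_daily")   // assumed staging feed

    val current = dim.filter(col("is_current") === true)

    // Staging rows whose tracked attribute differs from the current dimension version
    val changed = stg.alias("s")
      .join(current.alias("d"), col("s.customer_id") === col("d.customer_id"))
      .filter(col("s.address") =!= col("d.address"))

    // Type II step 1: close out the superseded versions
    val expired = changed.select(col("d.*"))
      .withColumn("is_current", lit(false))
      .withColumn("end_date", current_date())

    // Type II step 2: open new versions effective today
    val opened = changed.select(col("s.customer_id"), col("s.name"), col("s.address"))
      .withColumn("start_date", current_date())
      .withColumn("end_date", lit(null).cast("date"))
      .withColumn("is_current", lit(true))

    // Downstream, expired + opened + untouched rows would be merged back into dw.dim_customer
    expired.createOrReplaceTempView("scd2_expired")
    opened.createOrReplaceTempView("scd2_new_versions")

    spark.stop()
  }
}
```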
Confidential
Developer
Responsibilities:
- Responsible for the complete life cycle of the system
- Responsible for client-side requirements study and for preparing system and requirements specifications
- Responsible for application setup, installation, testing, and troubleshooting
ENVIRONMENT: HP, MySQL, SQL, JavaScript, Linux, Apache
Confidential
Developer
Responsibilities:
- Cleaned up and rewrote much of the front-end code to meet validation requirements
- Provided detailed project requirements and technical specifications, as well as a step-by-step map for project execution
- Actively participated in all database-related activities
ENVIRONMENT: PHP, MySQL, SQL, JavaScript, Linux, Apache