
Big Data Architect Resume


Overland Park, KS

PROFESSIONAL SUMMARY:

  • 8.7 years of experience in Big Data solution design, data mining, data warehousing and Business Intelligence (BI) applications, business requirement analysis, design, application development, product configuration, maintenance operations, and team management in the Banking, Health Insurance and Manufacturing domains.
  • Good knowledge of the Hadoop ecosystem (MapReduce, Pig, Hive, HBase, Oozie, Zookeeper, Sqoop, Flume, Hue and YARN).
  • Good understanding of machine learning algorithms and toolsets (including Mahout and R).
  • Capable of processing large sets of structured, semi-structured and unstructured data and supporting systems application architecture.
  • Familiar with data architecture, including data ingestion pipeline design, Hadoop information architecture, data modelling and advanced data processing.
  • Exposure to Amazon Web Services - AWS cloud computing (EMR, EC2 and S3 services).
  • Have done performance tuning and optimization on Hadoop cluster.
  • Experience in single node and multi node cluster configuration and Testing.
  • Experience in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
  • Excellent knowledge of MySQL, Oracle, DB2 and other RDBMS platforms.
  • Experience in importing and exporting data using Sqoop between HDFS and relational database/mainframe systems.
  • Exposure to Cloudera development environment and management using Cloudera Manager.
  • Exposure to Hortonworks development environment.
  • Hands-on experience with the NoSQL databases HBase, Cassandra and MarkLogic.
  • Exposure to the Java Eclipse environment for writing MapReduce jobs, testing, debugging and creating .jar files.
  • Experience in using Data Warehouse Administration Console (DAC) for scheduling and executing Incremental ETL loads.
  • Knowledge of implementing Informatica recommended best practices and ETL Error Handling techniques.
  • Experience in creating Workflows, Worklets, Mappings, Mapplets and Reusable Transformations, and in scheduling Workflows and Sessions using Informatica PowerCenter.
  • Designed and developed complex mappings using varied transformation logic such as Unconnected and Connected Lookups, Router, Filter, Expression, Aggregator, Joiner, Update Strategy and more.
  • Expertise in the Software Development Life Cycle (SDLC) of projects - requirement gathering, analysis, design, coding, testing and implementation of business applications.
  • Oracle Certified Associate; well versed in PL/SQL programming.
  • Working knowledge of .NET programming, creating MSI installers, SQL Server 2005, DB2 and Java programming.
  • Have undergone client-sponsored training in SharePoint administration and Nintex Workflow.
  • Adept in analysing information system needs, evaluating end-user requirements, custom designing solutions, and troubleshooting complex information systems.
  • Experience in producing detailed technical specifications.
  • Won GEMS and Accolades for best performance, innovation at work and meeting difficult deadlines ahead of schedule.
  • Experienced in environments requiring direct customer interaction during the specification, design, development and product implementation phases.
  • Proficient in working within an onsite-offshore delivery model.

TECHNICAL SKILLS:

Hadoop / Big Data: HDFS (Hadoop Distributed File System), MapReduce, Pig, Hive, Hue, Oozie, Zookeeper, Sqoop, Flume, YARN

NoSQL: HBase, Cassandra, MarkLogic

Data Science Toolbox: R (caret package), Weka

Cloud computing: Exposure to Amazon Web Services (AWS - EMR, EC2 and S3 services).

BI Tools: OBIEE (10g, 11g)

ETL/Scheduling Tools: Informatica PowerCenter 8.x and 9.x, DAC, Zeke

Languages: Oracle SQL, PL/SQL, C#.NET, Java

Database Platforms: Oracle 10g/9i, MS SQL Server, MySQL, DB2, MS Access

Database Tools: SQL*Loader, SQL Developer, Toad, SQL PLUS, AQT

Change Management Tools: VSS (Visual SourceSafe), Git, SVN (Subversion)

Operating Systems: Windows XP/2000/2003/Vista, Windows 7, UNIX (AIX/Solaris)

Web Technologies: HTML, XML, JavaScript, JSP

OS/Environment: Windows 95/98/2000/2003/XP/2007, Linux (Ubuntu)

Project Management Tools: MS Project, MS SharePoint, Nintex Workflow, MS-Office

PROFESSIONAL EXPERIENCE:

Confidential, Overland Park, KS

Big Data Architect

Responsibilities:

  • Analysed and understood the existing EDW and the reporting needs of the end users.
  • Involved in designing the hybrid architecture, using an RDBMS for KPI-related needs and Hadoop for data storage and analytics.
  • Hive is used as the EDW, and HBase is used for maintaining control tables.
  • Oozie is used as the scheduler to perform the necessary actions in sequence.
  • Working on the development and ETL strategy for external and internal ORC tables (see the illustrative sketch after this list).
  • Planning data retention and migration strategies.
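
A minimal sketch of the kind of external ORC table definition this work involves, driven from a shell wrapper; the database, table, columns and HDFS path below are hypothetical placeholders, not the client's actual schema:

```bash
#!/usr/bin/env bash
# Hypothetical example: define an external, ORC-backed Hive table over an
# HDFS landing directory so the EDW layer can query the data in place.
hive -e "
CREATE DATABASE IF NOT EXISTS edw;

CREATE EXTERNAL TABLE IF NOT EXISTS edw.sales_raw (
  order_id     BIGINT,
  customer_id  BIGINT,
  order_amount DECIMAL(12,2),
  order_ts     TIMESTAMP
)
PARTITIONED BY (load_dt STRING)
STORED AS ORC
LOCATION '/data/edw/sales_raw';
"
```

An internal (managed) ORC table would simply drop the EXTERNAL keyword and the LOCATION clause, letting Hive own the data lifecycle.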

Confidential, Hartford CT

Big Data Consultant

Responsibilities:

  • Analysed and understood the existing DB2 system and created the requirements document for the client's approval.
  • Proposed an automated system using shell scripts to drive the Sqoop jobs.
  • Worked in Agile development approach.
  • Created the estimates and defined the Confidential stages.
  • For Data Ingestion from DB2 to Hadoop, designed and developed a Data Lake and Sqoop jobs.
  • Developed the shell scripts for Data Migration from DB2 to Hadoop Data Lake.
  • Developed a strategy for full and incremental loads using Sqoop (see the shell sketch after this list).
  • Designed Hive external tables to be created on top of each Sqoop import result.
  • Provided an option for single-mapper and multi-mapper loads to effectively utilize parallelism.
  • Developed the strategy for Zeke schedules that run the scripts at defined intervals.
  • Created a detailed unit test plan covering all possible scenarios, which was highly appreciated.
  • Performed complete load and stress testing of the production data in the Hadoop environment.
  • Assisted in documentation and guided the team members.
  • Involved in building a framework for real-time data processing and analysis using MarkLogic.
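
A minimal sketch of the kind of shell-wrapped Sqoop incremental load described above; the JDBC URL, table, columns, paths and checkpoint file are hypothetical placeholders, and credentials are assumed to come from the environment and a password file:

```bash
#!/usr/bin/env bash
# Hypothetical example: incremental Sqoop import from DB2 into the HDFS
# data lake, appending only rows with keys greater than the last loaded value.
JDBC_URL="jdbc:db2://db2host:50000/SRCDB"
TABLE="MEMBER_CLAIMS"
TARGET_DIR="/datalake/raw/member_claims"
LAST_VALUE=$(cat /var/etl/last_value_${TABLE}.txt)   # checkpoint from the prior run
NUM_MAPPERS=${1:-4}                                  # pass 1 for a single-mapper load

sqoop import \
  --connect "$JDBC_URL" \
  --username "$DB2_USER" \
  --password-file /user/etl/.db2.pwd \
  --table "$TABLE" \
  --target-dir "$TARGET_DIR" \
  --incremental append \
  --check-column CLAIM_ID \
  --last-value "$LAST_VALUE" \
  --num-mappers "$NUM_MAPPERS" \
  --split-by CLAIM_ID
```

A full load would be the same command without the --incremental, --check-column and --last-value options; running with --num-mappers 1 avoids the need for a split column.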

Confidential, Auburn Hills, MI

Hadoop Consultant, Big Data Solution Architect.

Responsibilities:

  • Designed and built robust Hadoop solutions for structured, semi-structured and unstructured data.
  • Involved in the full lifecycle of the Big Data solution, including requirements analysis, technical architecture design, solution design, solution development, testing and deployment.
  • Created the architecture for successful implementation of the Hadoop technology stack (HDFS, MapReduce, Hive, HBase, Sqoop).
  • Writing Java programs for user-defined functions (UDFs) and some low-level MapReduce programs.
  • Writing Pig scripts for data validation, pre-processing and data loading into HDFS (see the sketch after this list).
  • Writing Hive programs for data analysis on HDFS data.
  • Heavily used Sqoop for data migration to and from HDFS and RDBMS systems.
  • Performed extensive data validation and built data pipelines using Pig; also created reusable Pig UDFs.
  • Created Oozie Workflows for loading data in HDFS, running Pig scripts for ETL on the data loaded in HDFS.
  • Used Weka and R for machine learning algorithms (supervised and unsupervised learning).
  • Performed classification and clustering of data to mine the information required by the business.
  • Computed summary statistics (samples, min, max, mean) over the processed data and plotted histograms for cumulative counts.
  • Documented the business needs as functional requirements.
  • Preparing Production release notes.
  • Assisted in creation of Integration Test Plans.
  • Recommended the appropriate technology, along with proper analysis, for each phase of implementation.
  • Worked on proofs of concept using the suggested technologies to ensure that they are viable and provide value to the client.
  • Performed a proof of value on NoSQL technologies (HBase, Cassandra and MarkLogic) and gave a comparison report of the findings to higher management.
  • Did proofs of concept on the Amazon EMR service and Microsoft Azure HDInsight to check the feasibility of running MapReduce tasks in the cloud.
  • Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop; handled cluster coordination through Zookeeper.
  • Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
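
A minimal sketch of a Pig-based validation and load step of the sort described above, driven from a shell wrapper; the file paths and record layout are hypothetical:

```bash
#!/usr/bin/env bash
# Hypothetical example: generate and run a small Pig Latin script that
# validates raw delimited records and stores the cleansed output in HDFS.
cat > /tmp/validate_load.pig <<'PIG'
-- Load raw pipe-delimited records from the landing zone
raw = LOAD '/data/landing/orders' USING PigStorage('|')
      AS (order_id:long, customer_id:long, amount:double, order_dt:chararray);

-- Basic validation: reject records with missing keys or negative amounts
valid = FILTER raw BY order_id IS NOT NULL AND amount >= 0.0;

-- Store the cleansed data for downstream Hive/MapReduce processing
STORE valid INTO '/data/curated/orders' USING PigStorage('|');
PIG

pig -x mapreduce -f /tmp/validate_load.pig
```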

Confidential, Troy, MI

Business Intelligence Consultant

Responsibilities:

  • Primary responsibilities include gathering business requirements, designing the functional and technical specifications.
  • Designed and tested reports in the dashboards and created ad-hoc reports according to the client's needs.
  • Migrated existing OBIEE reports from the 10g reporting environment to the 11g environment.
  • Configured interactive dashboards with drill-down capabilities using global and local filters, Metadata Objects and Web Catalog Objects.
  • Applied cascading prompts and multi-level prompts on dashboards.
  • Identified issues and performance bottlenecks, and optimized the Business Intelligence dashboards and reports.
  • Developed different kinds of reports (pivots, charts, tabular) using global and local filters.
  • Wrote custom SQL code for the reports.
  • Involved in monitoring and scheduling different process flows using the DAC controller.
  • Transformed and loaded data from flat files into target tables.
  • Used the Debugger to validate mappings by creating breakpoints to analyze and monitor data flow.
  • Involved in troubleshooting load failure cases, including database problems.

Confidential

Sr. ETL developer, Technical Lead

Responsibilities:

  • ETL Process Designing and Development.
  • Involved in the Extraction, Transformation and Loading of data (ETL Process) from multiple sources to target using Informatica Power Center software (Repository Manager, Designer, Server Manager)
  • Managing Analysis, Design, Coding, Unit Testing and UAT
  • Developed standard and reusable mappings and mapplets using various transformations like Expression, Aggregator, Joiner, Router, Lookup (Connected and Unconnected) and Filter
  • Preparing Design Document, Functional specification and process flow diagrams.
  • Standardize PowerCenter component naming conventions, guidelines and best practices.
  • Review all deliverable documentation before submittal and participate in quality reviews.
  • Analyze the source tables and columns and create target table definitions.
  • Worked with the Development team for development / changes required for the System. Analyzed the changes by working directly with the Business user in understanding the requirements for the changes.
  • Exception handling during loads and sending valid exceptions to users for verification.
  • Perform environment validations to make sure all the required database objects are in a valid state.
  • Coordinate with cross functional teams for EDW and ensure successful completion of End to End System Testing.
  • Coordinate the migration of code from Development through the complete phases of testing in the Test environment, and then from Test to the Production environment.
  • Understand the defect reporting conventions and the details related to priority, routing, cross-project functionality and the defect database.
  • Provide support of systems after implementation by Defect tracking, resolution and root cause analysis using QUALITY CENTER.

Confidential

Sr. ETL Developer

Responsibilities:

  • Writing SQL, PL/SQL and SQL*Plus programs required to retrieve data from operational data sources, using cursors and exception handling.
  • Creating packages, procedures and functions using PL/SQL and making them generic for later reuse (see the sketch after this list).
  • Developing views, sequences, synonyms, using SQL to transform data.
  • As an ETL developer responsibilities included requirements gathering, designing, developing mappings, workflows, shell scripting, building Oracle procedures and packages, unit testing, job scheduling, migration and support of ETL applications.
  • Designed and developed mappings, tasks, workflows & shell scripts for EDW and Downstream applications. Introduced Session partitioning and other performance tuning mechanisms to improve the efficiency of ETL’s.
  • Designed and developed auto validation emails and error handling mechanisms to automate the load process.
  • Validated the data to ensure the integrity of the information to be interfaced.
  • Preparing Unit Test cases, test data and verification of Unit test Results
  • Preparing Production release notes and user documentation
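
A minimal sketch of the kind of shell-invoked PL/SQL load logic with a cursor and exception handling referred to above; the connection details, schema and table names are hypothetical placeholders, with credentials assumed to be supplied via the environment:

```bash
#!/usr/bin/env bash
# Hypothetical example: run an anonymous PL/SQL block from a shell wrapper
# as one ETL step, with cursor-driven updates and basic exception handling.
sqlplus -s "${ORA_USER}/${ORA_PWD}@${ORA_TNS}" <<'SQL'
WHENEVER SQLERROR EXIT FAILURE
SET SERVEROUTPUT ON
DECLARE
  CURSOR c_src IS
    SELECT customer_id, balance FROM stg_customer_balance;
BEGIN
  FOR r IN c_src LOOP
    UPDATE dw_customer
       SET balance = r.balance
     WHERE customer_id = r.customer_id;
  END LOOP;
  COMMIT;
EXCEPTION
  WHEN OTHERS THEN
    ROLLBACK;
    DBMS_OUTPUT.PUT_LINE('Load step failed: ' || SQLERRM);
    RAISE;
END;
/
EXIT
SQL
```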

Confidential

Application Developer

Responsibilities:

  • Webpage development using JSP as the front end, MySQL as the back end, and the Apache Tomcat server.
  • Preparing screen layouts and functional requirement specifications.
  • Preparing Unit Test cases, test data and verification of Unit test Results.

Confidential

Developer/Team lead

Responsibilities:

  • Understanding the business needs
  • Documenting the requirements
  • Writing the Java code to create a desktop application for data entry.
  • Writing the queries to store the results in the database.
  • Test plan creation and testing
  • Getting approval from guide on various modules and use cases.
  • Deployment on user machines.
  • Preparing release notes and user documentation
