Hadoop Consultant/Big Data Analyst Resume
Chicago, IL
SUMMARY:
- 7+ years of professional experience in IT, including 3+ years of work experience in the Hadoop ecosystem.
- Proficient in project implementation (SDLC), specifically the integration of business intelligence strategy, requirements gathering, requirements analysis, data modeling, information processing, system design, testing, and training.
- Possess experience in conducting current-state (as-is) system analysis, defining the future state (to-be), eliciting requirements, developing functional and technical requirements, mapping business processes to application capabilities, conducting fit-gap analysis, developing/configuring/prototyping solutions, and implementing to-be processes/solutions based on application capability and industry best practices.
- In-depth knowledge of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, and DataNode, and of the YARN/MapReduce programming paradigm for analyzing large data sets efficiently.
- Hands-on experience working with ecosystem components such as Hive, Pig, Sqoop, MapReduce, and Flume.
- Hands-on experience with related/complementary open source software platforms and languages (e.g. Java, Linux, UNIX, Python).
- Experience in importing and exporting terabytes of data between HDFS and relational database systems using Sqoop.
- Experience in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java (a minimal MapReduce sketch follows this list).
- Hands-on experience with NoSQL databases such as HBase; knowledge of job workflow scheduling and monitoring tools such as Oozie and ZooKeeper.
- Experience processing data using Pig and Hive, including creating Hive tables, loading data, and writing Hive queries.
- Experience with the NoSQL databases MongoDB and Cassandra.
- Followed test-driven development within Agile/Scrum methodology to produce high-quality software.
- Experience working in cloud environments such as Amazon Web Services (AWS).
- Good knowledge of integrating BI tools such as Talend with the Hadoop stack and extracting the required data.
- Experience working with different file formats such as TEXTFILE, Avro, and JSON.
- Good knowledge of Hadoop administration activities such as installation and configuration of clusters using Apache and Cloudera distributions.
- Involved in various projects related to data modeling, system/data analysis, design, and development for data warehousing environments. Strong knowledge of ETL methods; developed source-to-target mapping spreadsheets for the ETL team with physical naming standards, data types, volumetrics, domain definitions, and corporate metadata definitions.
- Established and maintained comprehensive data model documentation including detailed descriptions of business entities, attributes, and data relationships.
- Comprehensive knowledge and experience in process improvement, normalization/de-normalization, data extraction, data cleansing, data manipulation, and performance tuning, as well as scheduling PL/SQL jobs in Toad.
- Worked on database and ETL testing: functions, stored procedures, packages, constraints, loading data into tables, and executing scripts. Performed test case preparation/execution and defect management as well.
- Experience in object-oriented analysis and design and Java/J2EE technologies, including HTML and XML.
- Experience working on large projects and teams, with the ability to mentor junior developers; a good team player and quick learner, having worked across different teams.
- Demonstrated success under aggressive project schedules and deadlines; flexible, results-oriented, and able to adapt to the environment to meet the goals of the product and the organization.
- Excellent work ethic; self-motivated, quick learner, and team-oriented. Continually provided value-added services to clients through thoughtful engagement and excellent communication skills.
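As referenced in the list above, the following is a minimal sketch of a custom MapReduce program in Java for cleaning and summarizing delimited records. The class names, the assumed record layout (a status code in the third comma-separated field), and the input/output paths are illustrative assumptions, not code from any specific engagement.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class StatusCount {

    // Mapper: skips malformed rows and emits (statusCode, 1) for valid ones.
    public static class CleanMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text statusCode = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",");
            if (fields.length < 3) {
                return; // drop records that do not have the expected number of fields
            }
            statusCode.set(fields[2].trim());
            context.write(statusCode, ONE);
        }
    }

    // Reducer: sums the counts for each status code.
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "status-count");
        job.setJarByClass(StatusCount.class);
        job.setMapperClass(CleanMapper.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory in HDFS
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory in HDFS
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```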
TECHNICAL SKILLS:
Operating Systems: Windows NT/XP/2003, UNIX, Linux
Hardware/OS: CMTS, Cable Modem, ATCA, CP/BTS/Huawei/Cantata Switch, Routers
Validation Tools: AirPcap, MGSOFT, Iperf, Wireshark, Nagios, BACC
Databases: MS SQL Server 2008, Hive (NoSQL), MongoDB, Cassandra
Schema: Avro, JSON and XML
Hadoop Ecosystem: Cloudera Manager/Apache Hadoop, HDFS, NFS, MapReduce, HBase, Pig, Hive, Sqoop, Flume, Oozie, AMP, Hue, Hadoop Tokenization, Nagios, Ganglia, Splunk
Defect Tracking Tools: HP Quality Center, DTS, JIRA, ClearCase
Protocols: TCP/IP, HTTP/HTTPS, SNMP, 802.11a/b/g/n, DOCSIS 3.0 and PacketCable 2.0
Project Management: MPP, IPM+, Mi Time, PBS, ALCON, PMP Framework, staffing, appraisals
Quality and Process: SDLC, CMMI, milestone reports, project plans, metric reports, defect prevention processes, PCB and SPC
PROFESSIONAL EXPERIENCE:
Confidential, Chicago, IL
Hadoop Consultant/Big Data Analyst
Responsibilities:
- Worked on Python, Bash, and Pig scripts to transform and clean up probe GPS data and load it into HDFS.
- Scheduled workflow jobs using Apache Oozie.
- Queried and imported probe data into HBase and extracted data from HBase to HDFS (see the HBase client sketch after this list).
- Integrated Hive tables with MongoDB collections and developed a web service that queries a MongoDB collection and returns the required data to the web UI.
- Developed multiple MapReduce jobs to analyze GPS data using a clustering algorithm and a map-matching process.
- Performed machine learning and built predictive models using R and SAS Enterprise.
- Mentored BI analysts and other data analysts in querying data with Hive.
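A minimal sketch of the kind of HBase client access mentioned above, using the CDH4-era HBase Java API. The table name (probe_gps), column family, qualifiers, and row-key format are hypothetical placeholders, not the actual schema from the project.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class ProbeHBaseAccess {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml from the classpath
        HTable table = new HTable(conf, "probe_gps");

        // Import: write one probe reading keyed by device id and timestamp.
        Put put = new Put(Bytes.toBytes("device42|2014-06-01T08:15:00"));
        put.add(Bytes.toBytes("loc"), Bytes.toBytes("lat"), Bytes.toBytes("41.8781"));
        put.add(Bytes.toBytes("loc"), Bytes.toBytes("lon"), Bytes.toBytes("-87.6298"));
        table.put(put);

        // Point query: read a single row back.
        Result row = table.get(new Get(Bytes.toBytes("device42|2014-06-01T08:15:00")));
        System.out.println("lat=" + Bytes.toString(row.getValue(Bytes.toBytes("loc"), Bytes.toBytes("lat"))));

        // Range scan: all readings for one device, e.g. before extracting them to HDFS.
        Scan scan = new Scan(Bytes.toBytes("device42|"), Bytes.toBytes("device42|~"));
        ResultScanner scanner = table.getScanner(scan);
        for (Result r : scanner) {
            System.out.println(Bytes.toString(r.getRow()));
        }
        scanner.close();
        table.close();
    }
}
```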
Environment: Windows 7, Python 3.0, SAS 9.3, R, Eclipse, Java 7, JDBC, MongoDB, Oracle Spatial, Cloudera CDH4, Linux, UNIX Shell, Pig, HDFS, Oozie, Hive, HBase, MapReduce, Sqoop.
Confidential, Boston, MA
Hadoop Developer
Responsibilities:
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Imported and exported data into HDFS and Hive using Sqoop.
- Experienced in defining job flows.
- Experienced in managing and reviewing Hadoop log files.
- Extracted files from CouchDB through Sqoop, placed them in HDFS, and processed them.
- Extracted and updated data in MongoDB using the MongoDB import and export command-line utilities.
- Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Responsible for managing data coming from different sources.
- Gained good experience with NoSQL databases.
- Supported MapReduce programs running on the cluster.
- Involved in loading data from the UNIX file system to HDFS.
- Installed and configured Hive and wrote Hive UDFs (a minimal UDF sketch follows this list).
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
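A minimal sketch of a Hive UDF of the kind mentioned above, using the classic org.apache.hadoop.hive.ql.exec.UDF base class from CDH4-era Hive. The function name and the specific cleanup it performs (stripping punctuation) are illustrative assumptions.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Simple scalar UDF: removes punctuation from a string column during data cleaning.
public final class StripPunctuation extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null; // preserve NULLs so the UDF behaves like built-in functions
        }
        String cleaned = input.toString().replaceAll("[^\\p{Alnum}\\s]", "").trim();
        return new Text(cleaned);
    }
}
```

After packaging the class into a JAR, it would be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION strip_punct AS 'StripPunctuation'; and then used in HiveQL like any built-in function.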
Environment: Java 6, Eclipse, Linux, Hadoop, HBase, Sqoop, Pig, Hive, MapReduce, Cloudera CDH4, MongoDB, Flume.
Confidential, Los Angeles, CA
Hadoop Developer
Responsibilities:
- Worked extensively in creating MapReduce jobs to power data for search and aggregation.
- Designed a data warehouse using Hive
- Worked extensively with Sqoop for importing metadata from Oracle
- Extensively used Pig for data cleansing
- Created partitioned tables in Hive
- Worked with business teams and created Hive queries for ad hoc access (see the Hive JDBC sketch after this list).
- Evaluated the use of Oozie for workflow orchestration.
- Mentored the analyst and test teams in writing Hive queries.
- Gained very good business knowledge of health insurance, claims processing, fraud suspect identification, the appeals process, etc.
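A minimal sketch of running an ad hoc query against a partitioned Hive table from Java over JDBC, as mentioned above. The table and column names, the partition scheme, and the connection string are hypothetical assumptions; the sketch assumes a HiveServer2 endpoint and the org.apache.hive.jdbc.HiveDriver.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class AdHocHiveQuery {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        Connection conn = DriverManager.getConnection("jdbc:hive2://hive-host:10000/default", "hive", "");
        Statement stmt = conn.createStatement();

        // Partitioned table: partitioning by date keeps ad hoc queries on one day cheap.
        stmt.execute("CREATE TABLE IF NOT EXISTS claims (claim_id STRING, amount DOUBLE) "
                + "PARTITIONED BY (claim_date STRING) "
                + "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','");

        // Ad hoc aggregate restricted to a single partition.
        ResultSet rs = stmt.executeQuery(
                "SELECT claim_date, COUNT(*) AS n, SUM(amount) AS total "
                + "FROM claims WHERE claim_date = '2014-01-01' GROUP BY claim_date");
        while (rs.next()) {
            System.out.println(rs.getString("claim_date") + "\t" + rs.getLong("n") + "\t" + rs.getDouble("total"));
        }
        rs.close();
        stmt.close();
        conn.close();
    }
}
```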
Environment: Hadoop, MapReduce, HDFS, Hive, Java (jdk1.6), Hadoop distribution of Hortonworks, Oozie, Oracle 11g/10g.
Confidential, IL
Java Programmer
Responsibilities:
- Designed the framework for the system using Struts 1.2 and other J2EE design patterns; this included coding business components and interfaces to be used by the team for system development.
- Used Spring as an integration framework and implemented the IoC concept.
- Used Hibernate as an ORM tool for data persistence.
- Studied the business processes and defined the requirements.
- Designed and developed Struts Action classes, ActionForms, Spring DAOs, and views using Struts custom tags (a minimal Action class sketch follows this list).
- Used RAD as the Java IDE for creating JSPs, Servlets, and XML, and deployed the application on WebSphere 6.1.
- Developed server-side common utilities for the application and the front-end dynamic web pages using JSP, JavaScript and HTML/DHTML.
- Involved in configuring web.xml and struts-config.xml according to the Struts framework.
- Used ClearCase as source control.
- Used ANT scripts for building the entire web application.
- Performed unit testing, integration testing, and system testing.
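A minimal sketch of a Struts 1.2 Action class of the kind described above. The class name, request parameter, and forward names are illustrative assumptions; in the actual application the values would be bound to an ActionForm declared in struts-config.xml and the decision delegated to a business component.

```java
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.struts.action.Action;
import org.apache.struts.action.ActionForm;
import org.apache.struts.action.ActionForward;
import org.apache.struts.action.ActionMapping;

public class LoginAction extends Action {
    @Override
    public ActionForward execute(ActionMapping mapping, ActionForm form,
                                 HttpServletRequest request, HttpServletResponse response)
            throws Exception {
        // In practice the user name would be bound to an ActionForm via struts-config.xml;
        // reading the raw parameter keeps this sketch self-contained.
        String userName = request.getParameter("userName");
        boolean valid = userName != null && userName.trim().length() > 0;

        // "success" and "failure" map to <forward> entries in struts-config.xml.
        return mapping.findForward(valid ? "success" : "failure");
    }
}
```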
Environment: Java, J2EE, Struts, AJAX, Hibernate, Spring, Rational Rose, XML Parser (DOM), XSLT, PERL, JSP, HTML, CSS, JavaScript, JSTL, Eclipse, WebSphere 6.1, SQL Server 2005, Log4j, Windows XP and UNIX.
Confidential
Java Developer
Responsibilities:
- Used Eclipse 3.2 as the IDE, Tomcat 5.x as the web server, WebLogic 9.x as the application server, and Netezza as the database to develop and deploy the application.
- Wrote basic Java interfaces for implementing business logic.
- Involved in design and development of Servlets and JSPs using Apache Struts framework.
- Used Spring framework to handle some of the requests to the application.
- Used JDBC, data sources, and connection pooling in the application server to interact with Netezza.
- Used JSF and Facelets for UI component binding.
- Developed a few factory classes that act as controllers and divert the HTTP request to a particular request handler class based on the request identification key.
- Implemented J2EE design patterns such as Session Facade (to reduce network traffic) and Service Locator.
- Used Jasper Reports for report generation.
- Designed and developed a user usage logging facility using Apache Log4j 1.2.8, using different logging levels such as INFO, DEBUG, and WARN (a minimal sketch follows this list).
- Involved in shell scripting for application automation such as auto start, auto stop, and periodic backups.
- Involved in Export/Import of data using Data Transformation Services (DTS). Imported data from flat files to various databases and vice-versa.
- Modified the database schema according to client requirements.
- Implemented complete client-side validations in JavaScript.
- Used PVCS (Serena) for version control and maintenance; also partially used SVN.
- Used ANT to write build scripts as well as deployment scripts.
- Involved in the system study, preparation of Data Flow Diagrams and Entity Relationship Diagrams.
- Packaged and deployed the entire application code to the integration testing environment for all releases.
- Used Extreme Programming practices in coding; programmers followed all coding standards.
- Wrote JUnit tests for the services and documented the services developed.
- Provided production support by interacting with the end-users and fixing bugs.
- As an analyst, interacted with the clients and application users regarding their requirements, specifications, and enhancements.
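A minimal sketch of the Log4j 1.2 usage logging described above, showing the DEBUG, INFO, and WARN levels. The class name, method, and messages are illustrative assumptions; BasicConfigurator stands in for the log4j.properties configuration a real deployment would use.

```java
import org.apache.log4j.BasicConfigurator;
import org.apache.log4j.Logger;

public class UsageLogger {
    private static final Logger LOG = Logger.getLogger(UsageLogger.class);

    public void recordPageVisit(String userId, String page) {
        LOG.debug("Entering recordPageVisit for user " + userId); // DEBUG: developer-level detail
        if (userId == null) {
            LOG.warn("Page visit recorded without a user id");    // WARN: unexpected but non-fatal
            return;
        }
        LOG.info("User " + userId + " visited " + page);          // INFO: normal usage event
    }

    public static void main(String[] args) {
        BasicConfigurator.configure(); // console appender; a real deployment would configure log4j.properties
        new UsageLogger().recordPageVisit("u123", "/home");
    }
}
```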
Environment: Java, J2EE, Struts, Spring, Eclipse, Linux, SOA, JSP/Servlets, CSS, Tomcat, WebLogic 9.x, JDBC, XML, HTML, Oracle 10g, UML, JUnit, PVCS, SVN, ANT 1.3/1.4, SOAP, Web Services.