Hadoop Lead / Big Data Lead Resume
Phoenix, AZ
PROFESSIONAL SUMMARY:
- 8+ years of experience in the IT industry as a Java developer and Big Data consultant for banking, insurance, and financial clients.
- 3+ years of comprehensive experience as a Hadoop / Big Data consultant.
- Experienced in processing Big Data on the Apache Hadoop framework using MapReduce programs.
- Experienced in installation, configuration, supporting and monitoring Hadoop clusters using Apache, Cloudera distributions and AWS.
- Experienced in using Spark, Pig, Hive, Sqoop, Oozie, ZooKeeper, HBase, and Cloudera Manager.
- Imported and exported data between HDFS and RDBMS using Sqoop.
- Experienced with Hadoop internals (MapReduce/YARN, HDFS), Hadoop Streaming, and HCatalog.
- Application development using Java, RDBMS, and Linux shell scripting.
- Good knowledge of relational/multidimensional databases and data modeling for OLAP/ROLAP, along with SOAP, Agile, and APIs.
- Extended Hive and Pig core functionality by writing custom UDFs.
- Experienced in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
- Experienced in job workflow scheduling and monitoring tools like Oozie and ZooKeeper.
- Experienced in designing, developing and implementing connectivity products that allow efficient exchange of data between the core database engine and the Hadoop ecosystem.
- Expert-level skills in developing intranet/internet applications using Java/J2EE technologies, including the Struts framework, MVC design patterns, Chordiant, Servlets, JSP, JSTL, XML/XSLT, JavaScript, AJAX, EJB, JDBC, JMS, JNDI, RDBMS, SOAP, BI, Hibernate, and custom tag libraries.
- Experience using XML, XSD and XSLT.
- Experience in building analytics for structured and unstructured data and managing large data ingestion by using Kafka, Flume, Scala, Avro, Thrift, and Sqoop.
- Used Amazon Web Services (AWS) for on-demand computing resources and services in the cloud, including storage, bandwidth, and customized support for application programming interfaces.
- Experienced in running large-scale networks on AWS infrastructure.
TECHNICAL SKILLS:
Big Data Ecosystem: Hadoop, MapReduce, HDFS, HBase, ZooKeeper, Hive, Pig, Sqoop, Oozie, Flume, Accumulo
Programming Languages: C, C++, Java, SQL, PL/SQL, UNIX/Linux Shell Scripts
J2EE Technologies: JSP 2.1, Servlets 2.3, JDBC, JMS, JNDI, JAXP, JavaBeans
Framework: JUnit, log4j, Spring, Hibernate
Database: Oracle, DB2, MySQL
Application Server: Apache Tomcat 5.x/6.0, JBoss 4.0
IDEs, Utilities & Web: Eclipse, NetBeans, SOAP UI, HTML, CSS, JavaScript, Ajax, DTD/Schemas, XSLT, XPath, DOM, XQuery
Operating Systems: Linux, macOS, Windows
Methodologies: Agile, UML, Design Patterns
PROFESSIONAL EXPERIENCE:
Confidential, Phoenix, AZ
Hadoop Lead / Big data Lead
Responsibilities:
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Importing and exporting data into HDFS and Hive using Sqoop.
- Experienced in defining job flows.
- Made the data lake available through Hive tables so users could query and analyze insights.
- Involved in Teradata query tuning; tuned complex queries and views and implemented macros to reduce parsing time.
- Developed the application under J2EE architecture using AngularJS, Spring, Spring Security, Spring Batch, Spring MVC, Spring VI, Hibernate, core JavaBeans, and Bootstrap.
- Wrote Informatica ETL scripts to load the data into the Teradata DW.
- Tested the entire process, covering data validation such as row counts, data duplication, and other test cases, then repeated the same steps in the integration environment to check system consistency.
- Involved in data extraction: identified, parsed, and cleansed data and integrated it with other useful RDBMS data.
- Experienced in managing and reviewing Hadoop log files.
- Experience with Agile development processes and practices.
- Extensively worked on Oozie and Unix scripts for batch processing and dynamic workflow scheduling; implemented a Spark solution to enable real-time reports from Cassandra data.
- Performed Cassandra reviews for several key components, such as the data model review.
- Worked with Cassandra Query Language (CQL) to execute queries on the data persisting in the Cassandra cluster.
- Worked on tuning Bloom filters and configured compaction strategy based on the use case.
- Experience in Spark Streaming to receive real-time data and to store the stream data into HDFS.
- Hands-on experience in Spark creating RDDs and applying transformations and actions.
- Created Hive tables on Parquet schema using Scala.
- Responsible for managing data coming from different sources.
- Developed business logic using Scala.
- Responsible for implementing the application system with core Java/J2EE technologies.
- Used Scala scripts with the Spark machine learning library (MLlib) API for decision trees, ALS, and logistic and linear regression algorithms (sketched after this list).
- Tuned Spark/Scala code to improve the performance of machine learning algorithms for data analysis.
- Performed calibration testing of customer equipment through use of the Airflow Lab, engineering software, and system mock-ups.
- Created mappings and sessions to implement technical enhancements for the data warehouse by extracting data from sources like Oracle and delimited flat files.
- Prepared various mappings to load the data into different stages like Landing, Staging, and Target tables.
- Created a Django dashboard with a custom look and feel for end users after a careful study of the Django admin site and dashboard.
- Used the Python unittest library to test many Python programs and other code.
- Compared active leases against our total lease database using inner/outer joins to ensure data integrity and validity.
- Developed several Python administrative scripts to automate project deployment process.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Configured Spark Streaming to receive real-time data from Kafka and store the stream data to HDFS (a minimal sketch follows this list).
- Monitored workload, job performance, and capacity planning using Cloudera Manager.
- Hands-on experience in installation, configuration, support, and management of 50+ node clusters using Apache, Hortonworks, MapR, and Cloudera Manager.
- Responsible for implementing MongoDB and Kafka to store and analyze unstructured data.
- Supported MapReduce programs running on the cluster.
- Involved in loading data from UNIX file system to HDFS.
- Installed and configured Hive and wrote Hive UDFs.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Provided testing and production support for a core Java-based multithreaded ETL tool for distributed loading of XML data into an Oracle 11g database using JPA/Hibernate.
- Implemented CDH3 Hadoop cluster on CentOS.
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration.
- Created HBase tables to store variable data formats of PII data coming from different portfolios.
- Implemented best income logic using Hive and Pig scripts.
- Migrated complex MapReduce programs to in-memory Spark processing using transformations and actions.
- Stored streaming data to HDFS and implemented Spark for faster data processing.
- Created Java-based Scala refiners to replace existing SQL stored procedures.
- Experience in developing data pipelines using Kafka and Storm to store data into HDFS.
- Leveraged AWS's secure global infrastructure and range of features to secure data in the cloud.
- Implemented a variety of AWS computing and networking services to meet application needs.
- Designed high-quality software tools that are secure, scalable, and reliable.
- Designed and documented tasks required for installations, configurations, upgrades, and testing.
- Copied data between production and lower environments.
- System automation programming using Perl, Bash, and shell scripting.
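
A minimal Scala sketch of the Kafka-to-HDFS Spark Streaming flow referenced above; the broker address, topic name, consumer group, batch interval, and output path are illustrative placeholders, not the actual project configuration.

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

object KafkaToHdfs {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("KafkaToHdfs")
    val ssc  = new StreamingContext(conf, Seconds(30)) // 30s micro-batches (assumed interval)

    // Placeholder Kafka connection settings
    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "broker1:9092",
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "hdfs-ingest",
      "auto.offset.reset"  -> "latest"
    )

    // Subscribe to a hypothetical "events" topic and keep only the message payloads
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("events"), kafkaParams))

    // Write each micro-batch to HDFS as time-stamped text files
    stream.map(_.value).saveAsTextFiles("hdfs:///data/streaming/events")

    ssc.start()
    ssc.awaitTermination()
  }
}
```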
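
A rough sketch of the Spark MLlib usage noted above, fitting a logistic regression on features assembled from a hypothetical Hive table; the table, column names, split, and hyperparameters are assumptions for illustration, not the project's actual model.

```scala
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.sql.SparkSession

object ChurnModel {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("ChurnModel").enableHiveSupport().getOrCreate()

    // Hypothetical Hive table with numeric feature columns and a 0/1 "label" column
    val raw = spark.table("analytics.customer_features")

    // Assemble the numeric columns into the single vector column MLlib expects
    val assembler = new VectorAssembler()
      .setInputCols(Array("balance", "tenure", "num_products"))
      .setOutputCol("features")
    val data = assembler.transform(raw).select("features", "label")

    val Array(train, test) = data.randomSplit(Array(0.8, 0.2), seed = 42)

    // Logistic regression, one of the algorithms listed above; hyperparameters are assumed
    val model = new LogisticRegression().setMaxIter(50).setRegParam(0.01).fit(train)

    // Simple hold-out accuracy check
    val predictions = model.transform(test)
    val accuracy = predictions.filter("label = prediction").count.toDouble / test.count
    println(s"hold-out accuracy = $accuracy")

    spark.stop()
  }
}
```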
Environment: Hadoop, MapReduce, HDFS, Hive, Pig, SQL, AWS, core Java, ZooKeeper, MongoDB, CentOS, Cloudera Manager, Sqoop, Oozie, MySQL, Windows, HBase, SOLR.
Confidential, Peoria, IL
Hadoop Developer / Big data Developer
Responsibilities:
- Installed and configured Hadoop MapReduce, HDFS and developed multiple MapReduce jobs in Java for data cleansing and pre-processing.
- Importing and exporting data into HDFS and Hive using Sqoop.
- Implemented Kerberos security for various Hadoop services using Cloudera Manager.
- Hands-on configuring various Hadoop services and testing it in a production environment.
- Added authorization to the server using each user's Kerberos identity to determine their role and which operations they could perform.
- Used Multithreading, synchronization, caching and memory management.
- Proactively monitored systems and services, architecture design and implementation of Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures.
- Extracted files from CouchDB through Sqoop, placed them in HDFS, and processed them.
- Used Flume to collect, aggregate, and store the web log data from different sources like web servers, mobile and network devices and pushed to HDFS.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Supported MapReduce programs running on the cluster.
- Wrote shell scripts to monitor the health check of Hadoop daemon services and respond accordingly to any warning or failure conditions.
- Involved in loading data from UNIX file system to HDFS, configuring Hive and writing Hive UDFs.
- Built Big Data solutions using HBase to handle multiple records.
- Created HBase tables to store variable data formats from millions of data rows.
- Developed a data pipeline using Flume and Java MapReduce to ingest employee browsing data into HBase/HDFS for analysis.
- Utilized Java and MySQL from day to day to debug and fix issues with client processes. Managed and reviewed log files.
- Wrote various SQL queries to manipulate data from the database and created a database using MySQL.
- Extracted data from Oracle databases and spreadsheets, staged it in a single place, and applied business logic to load it into the central Oracle database.
- Used Informatica PowerCenter 9.5 for extraction, transformation, and loading (ETL) of data into the data warehouse.
- Extensively used transformations such as Router, Aggregator, Normalizer, Joiner, Expression, Lookup, Update Strategy, Sequence Generator, and Stored Procedure.
- Implemented partitioning, dynamic partitions, and buckets in Pig and Hive.
- Used Hive and Pig to analyze data in HDFS to identify issues and behavioral patterns.
- Created internal and external Hive tables and defined static and dynamic partitions for optimized performance.
- Created the struts-config.xml file for the ActionServlet to extract data from the specified ActionForm and send it to the specified Action class instance.
- Implemented Spark in the data pipeline by chaining multiple mappers using ChainMapper.
- Created Hive dynamic partitions to load time-series data (see the sketch at the end of this list).
- Experienced in handling different types of joins in Scala, such as map joins, bucket map joins, and sorted bucket map joins; implemented different machine learning techniques in Scala using the machine learning library.
- Good knowledge of running Hadoop Streaming jobs to process terabytes of XML-format data.
- Involved in managing virtual Red Hat Linux servers running on VMware ESX 4/5.
- Working knowledge in creating Stored Procedures, Triggers, User-Defined Functions, Views, Indexes, User Profiles, Analytical Functions using T-SQL, SQL Server, PL/SQL.
- Developed queries joining various tables to validate incoming data for discrepancies, using existing fields and quantifying results with calculations.
- Worked with QA leads/managers to design automated testing of big data jobs.
- Migrated long-running Hadoop jobs to EMR.
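
A minimal sketch of the Hive dynamic-partition loading referenced above, issued here through Spark SQL in Scala; the database, table, and column names are hypothetical placeholders.

```scala
import org.apache.spark.sql.SparkSession

object LoadDailyEvents {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("LoadDailyEvents").enableHiveSupport().getOrCreate()

    // Allow fully dynamic partition values in the INSERT below
    spark.sql("SET hive.exec.dynamic.partition = true")
    spark.sql("SET hive.exec.dynamic.partition.mode = nonstrict")

    // Target table partitioned by event date (names are illustrative)
    spark.sql("""
      CREATE TABLE IF NOT EXISTS analytics.events_daily (
        event_id STRING,
        user_id  STRING,
        amount   DOUBLE
      )
      PARTITIONED BY (event_date STRING)
      STORED AS PARQUET
    """)

    // Dynamic-partition insert: each row lands in the partition named by its event_date value
    spark.sql("""
      INSERT OVERWRITE TABLE analytics.events_daily PARTITION (event_date)
      SELECT event_id, user_id, amount,
             date_format(event_ts, 'yyyy-MM-dd') AS event_date
      FROM staging.raw_events
    """)

    spark.stop()
  }
}
```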
Environment: Hadoop, MapReduce, HDFS, Hive, CouchDB, Flume, Tomcat 6, Maven, SQL, Oracle, XML, Eclipse.
Confidential, Mayfield Village, OH
Java Developer
Responsibilities:
- Involved in review of functional and non-functional requirements.
- Involved in the development of HTML pages, JSPs for different User Interfaces.
- Designed and implemented GUI module using JSPs and Struts framework.
- Implemented design patterns such as MVC, Factory, and Singleton.
- Implemented the AWS client API to interact with different services, as well as console configuration for AWS EC2.
- Implemented overall logging strategy for the project using Log4J.
- Used Hibernate as the persistence framework and developed Hibernate mapping files.
- Involved in Bug fixing of various modules that were raised by the Testing teams in the application during the Integration testing phase.
- Created highly fault-tolerant, highly scalable Java applications using AWS Elastic Load Balancing, EC2, VPC, and S3 as part of process improvements.
- Facilitated knowledge transfer sessions.
Environment: Java 1.6, Eclipse Indigo, Jboss 5.0, Oracle, JSP, Struts 2.0, AWS, JQuery, Maven, JUnit 4, Log4J, Visio, TOAD, SVN, Unix, Hibernate 3.2.1
Confidential, San Diego, CA
Java Developer
Responsibilities:
- Analyzed System requirements and designed Use Case Diagrams from requirement specifications
- Database design using data modeling techniques and Server side coding using Java
- Developed JSPs for displaying shopping cart contents and to add, modify, save and delete cart items
- Implemented the online shopping module using EJBs, with business logic implemented per the persistence requirements of the data model using Session and Entity Beans, in accordance with EJB specifications.
- Developed UI using HTML, JavaScript, and JSP, and developed Business Logic and Interfacing components using Business Objects, XML, and JDBC.
- Designed user-interface and checking validations using JavaScript.
- Managed connectivity using JDBC for querying/inserting & data management including triggers and stored procedures.
- Developed various EJBs for handling business logic and data manipulations from database.
Environment: J2EE, Java/JDK, JDBC, JSP, Servlets, JavaScript, EJB, JNDI, JavaBeans, XML, XSLT, Oracle 9i, Eclipse, HTML/ DHTML, SVN.