Hadoop/Spark Developer Resume
Plano, TX
SUMMARY:
- 7 years of professional experience in Requirements Analysis, Design, Development and Implementation of Java, J2EE and Big Data technologies.
- 4+ years of exclusive experience in Big Data technologies and Hadoop ecosystem components like Spark, MapReduce, Hive, Pig, YARN, HDFS, Sqoop, Flume, Kafka and NoSQL systems like HBase, Cassandra.
- Strong knowledge of distributed systems architecture and parallel processing, with an in-depth understanding of the MapReduce framework and the Spark execution framework.
- Expertise in writing end to end Data Processing Jobs to analyze data using MapReduce, Spark and Hive.
- Extensive experience working with structured data using HiveQL, performing join operations, writing custom UDFs and optimizing Hive queries.
- Experience using various Hadoop Distributions (Cloudera, Hortonworks, Amazon AWS) to fully implement and leverage new Hadoop features.
- Extensive experience in writing Pig scripts to transform raw data from several data sources into baseline data.
- Extensive experience in importing/exporting data between RDBMS and the Hadoop ecosystem using Apache Sqoop.
- Worked on the Java HBase API for ingesting processed data into HBase tables (a brief sketch follows this list).
- Strong experience in working with UNIX/LINUX environments, writing shell scripts.
- Good knowledge of and experience with the real-time streaming technologies Spark and Kafka.
- Experience in optimizing MapReduce algorithms using Combiners and Partitioners to deliver the best results.
- Very good understanding of Partitions, Bucketing concepts in Hive and designed both Managed and External tables in Hive to optimize performance.
- Extensive experience working with semi-structured and unstructured data, implementing complex MapReduce programs using design patterns.
- Sound knowledge of J2EE architecture, design patterns and object modeling using various J2EE technologies and frameworks.
- Adept at creating Unified Modeling Language (UML) diagrams such as Use Case diagrams, Activity diagrams, Class diagrams and Sequence diagrams using Rational Rose and Microsoft Visio.
- Extensive experience in developing applications using Java, JSP, Servlets, JavaBeans, JSTL, JSP Custom Tag Libraries, JDBC, JNDI, SQL, AJAX, JavaScript and XML.
- Experienced in using Agile methodologies including extreme programming, SCRUM and Test Driven Development (TDD).
- Proficient in integrating and configuring the Object-Relation Mapping tool, Hibernate in J2EE applications and other open source frameworks like Struts and Spring.
- Experience in building and deploying web applications on multiple application servers and middleware platforms including WebLogic, WebSphere, Apache Tomcat and JBoss.
- Experience in writing test cases in Java Environment using JUnit.
- Hands on experience in development of logging standards and mechanism based on Log4j.
- Experience in building, deploying and integrating applications with ANT, Maven.
- Good knowledge of Web Services, SOAP programming, WSDL, XML parsers like SAX and DOM, and front-end technologies such as AngularJS and responsive design with Bootstrap.
- Demonstrated technical expertise, organization and client service skills in various projects undertaken.
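Illustrative sketch: a minimal, hypothetical Java example of ingesting a processed record into HBase through the Java HBase API mentioned above; the table name, column family, row key and value are placeholders, not taken from any specific project.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseIngestSketch {
        public static void main(String[] args) throws Exception {
            // Picks up hbase-site.xml from the classpath
            Configuration conf = HBaseConfiguration.create();
            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Table table = connection.getTable(TableName.valueOf("processed_events"))) { // hypothetical table
                Put put = new Put(Bytes.toBytes("event#2017-01-01#0001")); // hypothetical row key
                put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("status"), Bytes.toBytes("PROCESSED"));
                table.put(put); // write one processed record
            }
        }
    }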
TECHNICAL SKILLS:
Hadoop/Big Data: HDFS, MapReduce, Hive, Pig, Oozie, Sqoop, Zookeeper, YARN, TEZ, Flume, Spark, Kafka
Java&J2EE Technologies: Core Java, Servlets, JSP, JDBC, JNDI and Java Beans
Databases: Teradata, Oracle 11g/10g, MySQL, DB2, SQL Server, NoSQL (HBase, MongoDB)
Web Technologies: JavaScript, AJAX, HTML, XML and CSS.
Programming Languages: Java, jQuery, Scala, Python, UNIX Shell Scripting
IDE: Eclipse, NetBeans, PyCharm
Integration & Security: MuleSoft, Oracle IDM & OAM, SAML, EDI, EAI
Build Management Tools: Maven, Apache ANT
Web Services: SOAP, REST
Predictive Modelling Tools: SAS Editor, SAS Enterprise Guide, SAS Miner, IBM Cognos.
Scheduling Tools: Crontab, AutoSys, Control-M
Visualization Tools: Tableau, Arcadia Data.
PROFESSIONAL EXPERIENCE:
Confidential, Plano, TX
Hadoop/Spark Developer
Responsibilities:
- Expertise in designing and deploying Hadoop clusters and various Big Data analytic tools including Pig, Hive, HBase, Oozie, ZooKeeper, Sqoop, Flume, Spark, Impala and Cassandra with the Hortonworks distribution.
- Installed Hadoop, MapReduce and HDFS on AWS and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
- Assisted in the upgrade, configuration and maintenance of various Hadoop components such as Pig, Hive and HBase.
- Used Spark API over Hortonworks Hadoop YARN to perform analytics on data in Hive.
- Explored Spark for improving the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, pair RDDs and Spark on YARN (see the sketch after this list).
- Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
- Imported data from different sources such as HDFS and HBase into Spark RDDs.
- Built a POC on Single Member Debug on Hive/HBase and Spark.
- Configured, deployed and maintained multi-node Dev and Test Kafka clusters.
- Performed transformations, cleaning and filtering on imported data using Hive and MapReduce, and loaded the final data into HDFS.
- Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
- Loaded data into HBase using both bulk and non-bulk loads.
- Used the Oozie workflow scheduler to manage Hadoop jobs as Directed Acyclic Graphs (DAGs) of actions with control flows.
- Expertise in data modeling and data warehouse design and development.
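Illustrative sketch: a minimal, hypothetical Java example of the Spark SQL over Hive analytics pattern described above (submitted to YARN via spark-submit); the database, table and column names are placeholders.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class SparkHiveAnalyticsSketch {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("hive-analytics-sketch")
                    .enableHiveSupport()   // read/write Hive tables through the metastore
                    .getOrCreate();

            // Hypothetical aggregation: daily event counts per source system
            Dataset<Row> counts = spark.sql(
                    "SELECT source_system, event_date, count(*) AS event_cnt "
                    + "FROM analytics.raw_events GROUP BY source_system, event_date");

            // Persist the aggregate back to Hive for downstream reporting
            counts.write().mode("overwrite").saveAsTable("analytics.daily_event_counts");

            spark.stop();
        }
    }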
Environment: Hadoop, HDFS, MapReduce, Pig, Hive, Sqoop, Kafka, Solr, HBase, Oozie, Flume, Spark Streaming/SQL, Java, SQL Scripting, Linux Shell Scripting.
Confidential, Phoenix, AZ
Hadoop Developer
Responsibilities:
- Installed and configured the Hadoop environment.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Installed and configured Pig and wrote Pig Latin scripts.
- Used Pig and MapReduce to analyze XML files and log files.
- Used Sqoop to load data from IBM DB2 into HDFS on a regular basis.
- Wrote Hive queries for data analysis to meet the business requirements.
- Created Hive tables and worked with them using HiveQL.
- Imported and exported data into HDFS and Hive from IBM DB2 and Netezza databases using Sqoop.
- Used Oozie workflows to coordinate Pig and Hive scripts.
- Used Impala for querying HDFS data to achieve better performance.
- Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
- Set up and benchmarked Hadoop/HBase clusters for internal use.
- Developed UDFs in both Pig and Hive to pre-process the data and compute various metrics for reporting.
- Developed a MapReduce program to convert mainframe fixed-length data to delimited data (see the sketch after this list).
- Ingested data from various IBM DB2 tables into HDFS using Sqoop.
- Wrote Python scripts to automate pulling and synchronizing code in the GitHub environment.
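Illustrative sketch: a minimal, hypothetical Java mapper for the fixed-length-to-delimited conversion mentioned above; the field offsets and widths are placeholders and would come from the mainframe copybook in practice.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Map-only job: each fixed-width mainframe record becomes one pipe-delimited line.
    public class FixedWidthToDelimitedMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String record = value.toString();
            if (record.length() < 26) {
                return;  // skip short or corrupt records
            }
            // Hypothetical layout: account (0-9), name (10-24), balance (25-34)
            String account = record.substring(0, 10).trim();
            String name    = record.substring(10, 25).trim();
            String balance = record.substring(25, Math.min(35, record.length())).trim();
            context.write(NullWritable.get(), new Text(account + "|" + name + "|" + balance));
        }
    }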
Environment: Hadoop, CDH, MapReduce, HDFS, Pig, Hive, Oozie, Java, UNIX, Flume, Impala, HBase, Oracle, MapR, AutoSys, Mainframes, JCL, IBM DB2, NDM.
Confidential, Columbus, OH
Java/Hadoop Developer
Responsibilities:
- Implemented business logic using Java and JavaScript, and used JDBC for querying the database.
- Involved in requirement analysis, design, coding and implementation.
- Worked in an Agile methodology and used JIRA to maintain the project's stories.
- Analyzed large data sets by running Hive queries.
- Involved in designing and developing the Hive data model, loading it with data and writing Java UDFs for Hive.
- Handled importing and exporting data into HDFS by developing solutions, analyzed the data using MapReduce and Hive, and produced summary results from Hadoop for downstream systems.
- Used Sqoop to import and export the data from Hadoop Distributed File System (HDFS) to RDBMS.
- Created Hive tables and loaded data from HDFS to Hive tables as per the requirement.
- Built custom MapReduce programs to analyze data and used HQL queries to clean unwanted data.
- Created components such as Hive UDFs for functionality missing in Hive to analyze and process large volumes of data (see the sketch after this list).
- Worked on various performance optimizations such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins.
- Involved in writing complex queries to perform join operations between multiple tables.
- Actively verified and tested data in HDFS and Hive tables while exporting data from Hive to RDBMS tables with Sqoop.
- Developed scripts and scheduled AutoSys jobs to filter the data.
- Monitored AutoSys file-watcher jobs, tested data for each transaction and verified whether each run completed properly.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
- Used Impala to pull data from Hive tables.
- Used Apache Maven 3.x to build and deploy the application to various environments. Installed the Oozie workflow engine to run multiple Hive jobs that run independently based on time and data availability.
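Illustrative sketch: a minimal, hypothetical Hive UDF of the kind mentioned above, written against the classic org.apache.hadoop.hive.ql.exec.UDF API; the function name and behavior (normalizing a status string) are placeholders.

    import org.apache.hadoop.hive.ql.exec.Description;
    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Registered in Hive with, for example:
    //   ADD JAR normalize-status-udf.jar;
    //   CREATE TEMPORARY FUNCTION normalize_status AS 'NormalizeStatusUDF';
    @Description(name = "normalize_status",
            value = "_FUNC_(str) - trims and upper-cases a status value, returning UNKNOWN for null/empty input")
    public class NormalizeStatusUDF extends UDF {

        public Text evaluate(Text input) {
            if (input == null || input.toString().trim().isEmpty()) {
                return new Text("UNKNOWN");  // default for missing values
            }
            return new Text(input.toString().trim().toUpperCase());
        }
    }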
Environment: HDFS, Hadoop, Pig, Hive, Sqoop, Flume, MapReduce, Oozie, MongoDB, Java 6/7, Oracle 10g, Subversion, Toad, UNIX Shell Scripting, SOAP, REST services, Agile Methodology, JIRA, AutoSys
Confidential, Houston, TX
Java Developer
Responsibilities:
- Involved in requirements analysis, understanding the client's requirements and the flow of the application as well as the application framework.
- Involved in designing, developing and testing of J2EE components like Java Beans, Java, XML, Collection Framework, JSP, Servlets, JMS, JDBC, and deployments in WebLogic Server.
- Developed Action classes, ActionForms, JSPs and JSF pages, and configuration files such as struts-config.xml and web.xml.
- Used Eclipse as the Java IDE for creating various J2EE artifacts such as Servlets, JSPs and XML.
- Developed interactive and dynamic web pages using hand coded semantic HTML5, CSS3, JavaScript, Bootstrap.
- Designed dynamic client-side JavaScript code to build web forms and simulate processes for the web application, including page navigation and form validation.
- Implemented back-end code using Spring MVC framework that handles application logic and makes calls to business objects.
- Developed REST web services using JAX-RS and Jersey to perform transactions from the front end to our backend applications; responses are sent in JSON format based on the use case (see the sketch after this list).
- Used Spring with the Hibernate module as an Object-Relational Mapping tool for back-end operations over a SQL database. Used Maven and Jenkins for building and deploying the application on the servers.
- Provided Hibernate mapping files for mapping java objects with database tables.
- Database development required creating new tables, PL/SQL stored procedures, functions, views, indexes, constraints and triggers, as well as SQL tuning to reduce the application's response time.
- Created REST Web Services using Jersey to be consumed by other partner applications.
- Worked in a fast-paced Agile development environment while supporting requirement changes and clarifications. Designed and delivered complex application solutions following the sprint deliverables schedule.
- Used Log4j to log various levels of information such as error, info and debug into the log files.
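Illustrative sketch: a minimal, hypothetical JAX-RS/Jersey resource of the kind described above, returning a JSON response; the resource path, fields and payload are placeholders.

    import javax.ws.rs.GET;
    import javax.ws.rs.Path;
    import javax.ws.rs.PathParam;
    import javax.ws.rs.Produces;
    import javax.ws.rs.core.MediaType;
    import javax.ws.rs.core.Response;

    // Hypothetical endpoint: GET /accounts/{id} returns account details as JSON.
    @Path("/accounts")
    public class AccountResource {

        @GET
        @Path("/{id}")
        @Produces(MediaType.APPLICATION_JSON)
        public Response getAccount(@PathParam("id") String id) {
            // In the real application this would delegate to the service layer;
            // a small JSON payload is built inline for illustration only.
            String json = "{\"id\":\"" + id + "\",\"status\":\"ACTIVE\"}";
            return Response.ok(json, MediaType.APPLICATION_JSON).build();
        }
    }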
Environment: Core Java, J2EE, Spring, Hibernate, Oracle, HTML, CSS, XML, JavaScript, jQuery, AJAX, AngularJS, Bootstrap, WebLogic, JUnit, RESTful Web Services, Agile Methodology, Maven, GIT, Eclipse