Hadoop/Spark Developer Resume

Plano, TX

SUMMARY:

  • 7 years of professional experience in Requirements Analysis, Design, Development and Implementation of Java, J2EE and Big Data technologies. 
  • 4+ years of experience focused exclusively on Big Data technologies and Hadoop ecosystem components such as Spark, MapReduce, Hive, Pig, YARN, HDFS, Sqoop, Flume and Kafka, and on NoSQL systems such as HBase and Cassandra. 
  • Strong knowledge of distributed systems architecture and parallel processing, with an in-depth understanding of the MapReduce and Spark execution frameworks. 
  • Expertise in writing end-to-end data processing jobs to analyze data using MapReduce, Spark and Hive. 
  • Extensive experience working with structured data using HiveQL, join operations and custom UDFs, and experienced in optimizing Hive queries (a minimal UDF sketch follows this summary). 
  • Experience using various Hadoop Distributions (Cloudera, Hortonworks, Amazon AWS) to fully implement and leverage new Hadoop features. 
  • Extensive experience in writing Pig scripts to transform raw data from several data sources into baseline data. 
  • Extensive experience in importing/exporting data between RDBMSs and the Hadoop ecosystem using Apache Sqoop. 
  • Worked with the Java HBase API to ingest processed data into HBase tables. 
  • Strong experience working in UNIX/Linux environments and writing shell scripts. 
  • Good knowledge of and experience with the real-time streaming technologies Spark and Kafka. 
  • Experience in optimizing MapReduce algorithms using Combiners and Partitioners to deliver the best results. 
  • Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive to optimize performance. 
  • Extensive experience working with semi-structured and unstructured data by implementing complex MapReduce programs using design patterns. 
  • Sound knowledge of J2EE architecture, design patterns and object modeling using various J2EE technologies and frameworks. 
  • Adept at creating Unified Modeling Language (UML) diagrams such as Use Case diagrams, Activity diagrams, Class diagrams and Sequence diagrams using Rational Rose and Microsoft Visio. 
  • Extensive experience in developing applications using Java, JSP, Servlets, JavaBeans, JSTL, JSP Custom Tag Libraries, JDBC, JNDI, SQL, AJAX, JavaScript and XML. 
  • Experienced in using Agile methodologies including Extreme Programming, Scrum and Test-Driven Development (TDD). 
  • Proficient in integrating and configuring the object-relational mapping tool Hibernate in J2EE applications, as well as other open-source frameworks such as Struts and Spring. 
  • Experience in building and deploying web applications on multiple application servers and middleware platforms including WebLogic, WebSphere, Apache Tomcat and JBoss. 
  • Experience in writing test cases in Java Environment using JUnit. 
  • Hands-on experience in developing logging standards and mechanisms based on Log4j. 
  • Experience in building, deploying and integrating applications with Apache Ant and Maven. 
  • Good knowledge of web services (SOAP, WSDL), XML parsers such as SAX and DOM, and front-end technologies including AngularJS and responsive design with Bootstrap. 
  • Demonstrated technical expertise, organization and client service skills in various projects undertaken.
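
As an illustration of the custom Hive UDFs mentioned above, here is a minimal sketch against Hive's classic org.apache.hadoop.hive.ql.exec.UDF API; the class name and normalization logic are hypothetical.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical UDF: normalize free-text codes before joining tables.
    public final class NormalizeCode extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null; // preserve SQL NULL semantics
            }
            return new Text(input.toString().trim().toUpperCase());
        }
    }

Registered in a Hive session with ADD JAR and CREATE TEMPORARY FUNCTION, such a UDF can then be called like any built-in function in HiveQL.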

TECHNICAL SKILLS:

Hadoop/Big Data: HDFS, MapReduce, Hive, Pig, Oozie, Sqoop, Zookeeper, YARN, TEZ, Flume, Spark, Kafka

Java & J2EE Technologies: Core Java, Servlets, JSP, JDBC, JNDI and JavaBeans

Databases: Teradata, Oracle 11g/10g, MySQL, DB2, SQL Server, NoSQL (HBase, MongoDB)

Web Technologies: JavaScript, jQuery, AJAX, HTML, XML and CSS.

Programming Languages: Java, Scala, Python, UNIX Shell Scripting

IDE: Eclipse, NetBeans, PyCharm

Integration & Security: MuleSoft, Oracle IDM & OAM, SAML, EDI, EAI

Build Management Tools: Maven, Apache Ant

Web Services: SOAP, REST

Predictive Modelling Tools: SAS Editor, SAS Enterprise Guide, SAS Miner, IBM Cognos.

Scheduling Tools: crontab, AutoSys, Control-M

Visualization Tools: Tableau, Arcadia Data.

PROFESSIONAL EXPERIENCE:

Confidential, Plano, TX

Hadoop/Spark Developer

Responsibilities:
  • Designed and deployed Hadoop clusters and various Big Data analytic tools, including Pig, Hive, HBase, Oozie, ZooKeeper, Sqoop, Flume, Spark, Impala and Cassandra, on the Hortonworks distribution. 
  • Installed Hadoop, MapReduce and HDFS on AWS, and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing. 
  • Assisted in upgrading, configuring and maintaining various Hadoop infrastructure components such as Pig, Hive and HBase. 
  • Used the Spark API over Hortonworks Hadoop YARN to perform analytics on data in Hive. 
  • Explored Spark for improving the performance and optimization of existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, pair RDDs and Spark on YARN. 
  • Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data. 
  • Imported data from different sources such as HDFS and HBase into Spark RDDs. 
  • Built a POC for single-member debugging on Hive/HBase and Spark. 
  • Configured, deployed and maintained multi-node Dev and Test Kafka clusters. 
  • Performed transformations, cleaning and filtering on imported data using Hive and MapReduce, and loaded the final data into HDFS. 
  • Loaded data into Spark RDDs and performed in-memory computation to generate output responses (a minimal Spark sketch follows this list). 
  • Loaded data into HBase using both bulk and non-bulk loads (a non-bulk load sketch follows the Environment line below). 
  • Used the Oozie workflow scheduler to manage Hadoop jobs as Directed Acyclic Graphs (DAGs) of actions with control flows. 
  • Expertise in data modeling and data warehouse design and development. 
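
A minimal sketch of the Spark-over-Hive pattern described above, written against the Java API; a Hive-enabled Spark build on YARN is assumed, and the database, table and column names are hypothetical.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public final class HiveAggregation {
        public static void main(String[] args) {
            // Assumes a Hive-enabled Spark build running on YARN.
            SparkSession spark = SparkSession.builder()
                    .appName("HiveAggregation")
                    .enableHiveSupport()
                    .getOrCreate();

            // Read a hypothetical Hive table and cache it in executor memory.
            Dataset<Row> events = spark.sql("SELECT event_type, amount FROM db.events");
            events.cache();

            // In-memory aggregation, reusable across multiple actions.
            Dataset<Row> totals = events.groupBy("event_type").sum("amount");
            totals.show();

            spark.stop();
        }
    }

Caching the source Dataset keeps the working set in memory so that repeated aggregations avoid re-reading from Hive.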

Environment: Hadoop, HDFS, MapReduce, Pig, Hive, Sqoop, Kafka, Solr, HBase, Oozie, Flume, Spark Streaming/SQL, Java, SQL Scripting, Linux Shell Scripting.
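
The non-bulk HBase load mentioned above could look like this minimal sketch against the standard HBase client API; the table name, row key, column family and values are hypothetical.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public final class HBaseLoader {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Table table = connection.getTable(TableName.valueOf("events"))) {
                // Write one processed record as a single-column Put.
                Put put = new Put(Bytes.toBytes("row-001"));
                put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("type"), Bytes.toBytes("click"));
                table.put(put);
            }
        }
    }

Bulk loads, by contrast, write HFiles directly (for example via HFileOutputFormat2) and hand them to the region servers, bypassing the normal write path.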

Confidential, Phoenix, AZ

Hadoop Developer

Responsibilities:
  • Installed and configured Hadoop Environment. 
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing. 
  • Installed and configured Pig and wrote Pig Latin scripts. 
  • Used Pig and MapReduce to analyze XML files and log files. 
  • Used Sqoop to load data from IBM DB2 into HDFS on a regular basis. 
  • Wrote Hive queries for data analysis to meet the business requirements. 
  • Created Hive tables and worked with them using HiveQL. 
  • Imported and exported data into HDFS and Hive using Sqoop from IBM DB2 and Netezza databases. 
  • Used Oozie workflows to coordinate Pig and Hive scripts. 
  • Used Impala for querying HDFS data to achieve better performance. 
  • Designed and implemented a MapReduce-based large-scale parallel relation-learning system. 
  • Set up and benchmarked Hadoop/HBase clusters for internal use. 
  • Developed UDFs in both Pig and Hive to pre-process the data and compute various metrics for reporting. 
  • Developed a MapReduce program to convert mainframe fixed-length data to delimited data (a mapper sketch follows this list). 
  • Ingested data from various IBM DB2 tables into HDFS using Sqoop. 
  • Automated Python scripts to pull and synchronize code in the GitHub environment. 
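
The fixed-length-to-delimited conversion above might look like this map-only sketch; the record layout (field widths) and the pipe delimiter are hypothetical.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Hypothetical mainframe layout: account (10), name (30), balance (12).
    public final class FixedWidthMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            if (line.length() < 52) {
                return; // skip short or corrupt records
            }
            String account = line.substring(0, 10).trim();
            String name    = line.substring(10, 40).trim();
            String balance = line.substring(40, 52).trim();
            context.write(NullWritable.get(), new Text(account + "|" + name + "|" + balance));
        }
    }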

Environment: Hadoop, CDH, MapReduce, HDFS, Pig, Hive, Oozie, Java, UNIX, Flume, Impala, HBase, Oracle, MapR, AutoSys, Mainframes, JCL, IBM DB2, NDM.

Confidential, Columbus, OH

Java/Hadoop Developer

Responsibilities:
  • Responsible for business logic using Java and JavaScript, with JDBC for querying the database. 
  • Involved in requirement analysis, design, coding and implementation. 
  • Worked in an Agile methodology and used JIRA to maintain project stories. 
  • Analyzed large data sets by running Hive queries. 
  • Designed and developed the Hive data model, loaded it with data and wrote Java UDFs for Hive. 
  • Handled importing and exporting data into HDFS, analyzed the data using MapReduce and Hive, and produced summary results from Hadoop for downstream systems. 
  • Used Sqoop to import and export data between the Hadoop Distributed File System (HDFS) and RDBMSs. 
  • Created Hive tables and loaded data from HDFS to Hive tables as per the requirement. 
  • Built custom MapReduce programs to analyze data and used HiveQL queries to clean out unwanted data. 
  • Created components such as Hive UDFs to cover missing functionality in Hive and to analyze and process large volumes of data. 
  • Worked on various performance optimizations, such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins (a distributed-cache join sketch follows this list). 
  • Involved in writing complex queries to perform join operations between multiple tables. 
  • Actively verified and tested data in HDFS and Hive tables while Sqooping data from Hive to RDBMS tables. 
  • Developed scripts and scheduled AutoSys jobs to filter the data. 
  • Monitored AutoSys file-watcher jobs, tested data for each transaction and verified whether each run completed properly. 
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts. 
  • Used Impala to pull data from Hive tables. 
  • Used Apache Maven 3.x to build and deploy the application to various environments. Installed the Oozie workflow engine to run multiple Hive jobs that execute independently based on time and data availability. 
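
A minimal sketch of the distributed-cache, map-side join named in the optimizations bullet above, assuming the driver added a small lookup file with job.addCacheFile(new URI("/lookup/codes.txt#codes.txt")); the file layout and field positions are hypothetical.

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public final class MapSideJoinMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {

        private final Map<String, String> codeToName = new HashMap<>();

        @Override
        protected void setup(Context context) throws IOException {
            // "codes.txt" is the symlink created from the cache-file fragment.
            try (BufferedReader reader = new BufferedReader(new FileReader("codes.txt"))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    String[] parts = line.split(",", 2);
                    if (parts.length == 2) {
                        codeToName.put(parts[0], parts[1]);
                    }
                }
            }
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Join each record against the in-memory lookup; no reduce phase needed.
            String[] fields = value.toString().split("\\|");
            String name = codeToName.getOrDefault(fields[0], "UNKNOWN");
            context.write(NullWritable.get(), new Text(value + "|" + name));
        }
    }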

Environment: HDFS, Hadoop, Pig, Hive, Sqoop, Flume, MapReduce, Oozie, MongoDB, Java 6/7, Oracle 10g, Subversion, Toad, UNIX Shell Scripting, SOAP, REST services, Agile Methodology, JIRA, AutoSys

Confidential, Houston, TX

Java Developer

Responsibilities:
  • Involved in requirements gathering and analysis: understanding the client's requirements, the flow of the application and the application framework. 
  • Involved in designing, developing and testing of J2EE components like Java Beans, Java, XML, Collection Framework, JSP, Servlets, JMS, JDBC, and deployments in WebLogic Server. 
  • Effectively developed Action classes, ActionForms, JSP, JSF and other configuration files such as struts-config.xml and web.xml. 
  • Used Eclipse as the Java IDE for creating various J2EE artifacts such as Servlets, JSPs and XML. 
  • Developed interactive and dynamic web pages using hand coded semantic HTML5, CSS3, JavaScript, Bootstrap. 
  • Designed dynamic client-side JavaScript to build web forms and simulate processes for the web application, including page navigation and form validation. 
  • Implemented back-end code using Spring MVC framework that handles application logic and makes calls to business objects. 
  • Developed REST web services using JAX-RS and Jersey to perform transactions from the front end to our back-end applications; responses are sent in JSON format based on the use case (a resource sketch follows this list). 
  • Used Spring with the Hibernate module as an object-relational mapping tool for back-end operations over a SQL database. Used Maven and Jenkins to build and deploy the application to the servers. 
  • Provided Hibernate mapping files for mapping java objects with database tables. 
  • Database development required creating new tables, PL/SQL stored procedures, functions, views, indexes, constraints and triggers, along with SQL tuning to reduce response times in the application. 
  • Created REST Web Services using Jersey to be consumed by other partner applications. 
  • Worked in a fast-paced Agile development environment while supporting requirements changes and clarifications; designed and built complex application solutions following the sprint deliverables schedule. 
  • Used Log4j to log various levels of information such as error, info and debug into the log files (a usage sketch follows the Environment line below). 
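
A minimal sketch of the kind of JAX-RS/Jersey resource described above, assuming a JSON provider such as Jackson is registered with the application; the path, resource class and payload are hypothetical.

    import javax.ws.rs.GET;
    import javax.ws.rs.Path;
    import javax.ws.rs.PathParam;
    import javax.ws.rs.Produces;
    import javax.ws.rs.core.MediaType;
    import javax.ws.rs.core.Response;

    @Path("/accounts")
    public class AccountResource {

        // GET /accounts/{id} returns the account as JSON.
        @GET
        @Path("/{id}")
        @Produces(MediaType.APPLICATION_JSON)
        public Response getAccount(@PathParam("id") String id) {
            // A real implementation would delegate to a business object or DAO.
            Account account = new Account(id, "ACTIVE");
            return Response.ok(account).build();
        }

        // Simple JSON-serializable payload.
        public static class Account {
            public String id;
            public String status;

            Account(String id, String status) {
                this.id = id;
                this.status = status;
            }
        }
    }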

Environment: Core Java, J2EE, Spring, Hibernate, Oracle, HTML, CSS, XML, JavaScript, jQuery, AJAX, AngularJS, Bootstrap, WebLogic, JUnit, RESTful Web Services, Agile Methodology, Maven, GIT, Eclipse
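
A minimal sketch of the Log4j level-based logging mentioned in the last bullet above; the service class and messages are hypothetical.

    import org.apache.log4j.Logger;

    public final class PaymentService {
        private static final Logger LOG = Logger.getLogger(PaymentService.class);

        public void process(String orderId) {
            LOG.debug("Processing order " + orderId); // fine-grained tracing
            try {
                // ... business logic ...
                LOG.info("Order " + orderId + " processed");
            } catch (RuntimeException e) {
                LOG.error("Failed to process order " + orderId, e);
            }
        }
    }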
