
Spark Developer Resume


MI

SUMMARY

  • Certified Hadoop and Spark developer with 3 years of experience in the IT industry, including around 2 years of experience in HDFS, Hadoop MapReduce, Spark, Scala, Spark SQL, Spark Streaming, YARN, Sqoop, Flume, Pig, Hive, HBase, ZooKeeper, Oozie, Impala, and Avro, and 1.2 years of experience in software development using Java, J2EE, and databases, spanning requirements gathering, analysis, design, development, implementation, and maintenance.
  • In-depth understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce model.
  • Very good understanding of static (manual) and dynamic partitioning and bucketing in Hive; designed both managed and external Hive tables to optimize performance.
  • Experience importing and exporting data in different formats between HDFS and relational database systems (RDBMS) using Sqoop.
  • Experience analyzing data using HiveQL, Pig Latin, and custom MapReduce programs written in Java.
  • Knowledge of job workflow scheduling with Oozie and cluster coordination with ZooKeeper.
  • Experience creating tables in the Hive metastore using the Avro file format and an external schema file.
  • Converted Hive/SQL queries into Spark transformations using Spark RDDs, Python, and Scala; good experience with the Spark shell, the PySpark shell, and Spark Streaming (a sketch of such a conversion follows this list).
  • Experience using avro-tools to extract an Avro schema from a set of data files.
  • Ingested real-time and near-real-time (NRT) streaming data into HDFS using Flume.
  • Experience installing, configuring, supporting, and managing the Cloudera Hadoop platform, including CDH5 clusters.
  • Experience in web development using the Chordiant and JSF frameworks.
  • Experience developing applications using Java technologies, including Core Java, J2EE, JavaServer Pages (JSP), Servlets, and JavaScript.
  • Strong understanding of RDBMS concepts and data modeling, with experience in relational databases such as MySQL.
  • Well versed in software development methodologies such as Agile and Waterfall.
  • Experience preparing and executing unit test plans and unit test cases using JUnit.
  • Highly dedicated, quick-starting, solution-driven programmer with excellent communication and interpersonal skills, able to work as part of a team or independently.
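
As an example of the Hive-to-Spark conversions mentioned above, here is a minimal sketch that runs in the spark-shell (where sc is predefined); the HDFS path and the dept,salary column layout are assumptions made for illustration:

    // HiveQL:  SELECT dept, AVG(salary) FROM emp GROUP BY dept
    // rewritten as Spark RDD transformations
    val emp = sc.textFile("hdfs:///data/emp.csv").map(_.split(","))
    val avgByDept = emp
      .map(f => (f(0), (f(1).toDouble, 1)))                       // dept -> (salary, 1)
      .reduceByKey { case ((s1, c1), (s2, c2)) => (s1 + s2, c1 + c2) }
      .mapValues { case (sum, cnt) => sum / cnt }                 // dept -> average salary
    avgByDept.collect().foreach(println)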

TECHNICAL SKILLS

Hadoop Ecosystem: HDFS, MapReduce, Pig, Hive, Sqoop, Flume, Impala, Oozie, Zookeeper, Avro, Cloudera

Spark Ecosystem: Spark SQL, Spark Streaming

Databases: HBase (NoSQL), MySQL

Programming Languages: Java, Scala, Pig Latin, Hive QL, Python, Shell Scripting

Framework: Chordiant, JSF

Web Technologies: HTML, JavaScript, CSS, JSP, Servlets, XML

IDE/Interfaces: Eclipse, Spark-Shell, PySpark, JUnit

Methodologies: Agile, Waterfall

Operating Systems: Windows, CentOS, Linux, OS X

PROFESSIONAL EXPERIENCE

Confidential, MI

Spark Developer

Responsibilities:

  • Developed Spark code using Scala and Spark SQL for faster testing and data processing.
  • Imported millions of rows of structured data from relational databases using Sqoop and stored them in HDFS in CSV format for processing with Spark.
  • Used Spark SQL to process large volumes of structured data.
  • Assigned a name to each column by mapping records onto a Scala case class (see the sketch after this list).
  • Explored Spark for improving performance and optimizing the existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Used the DataFrame API in Scala to organize distributed collections of data into named columns.
  • Registered the datasets as Hive tables.
  • Converted Hive/SQL queries into Spark transformations using Spark RDDs, Python, and Scala.
  • Developed solutions to pre-process large sets of structured data in different file formats (text files, Avro data files, sequence files, XML and JSON files, ORC, and Parquet).
  • Experienced with batch processing of data sources using Apache Spark.
  • Developed predictive analytics using the Apache Spark Scala API.
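
A minimal sketch of the case-class and Spark SQL steps above, written against the Spark 1.x API that ships with CDH 5.x; the HDFS path, table name, and id,customer,amount column layout are assumptions made for illustration:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    // Assumed layout of the Sqoop-landed CSV rows: id,customer,amount
    case class Txn(id: Int, customer: String, amount: Double)

    object TxnReport {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("txn-report"))
        val sqlContext = new SQLContext(sc)
        import sqlContext.implicits._

        // Name the columns by mapping each CSV record onto the case class
        val txns = sc.textFile("hdfs:///data/txns")
          .map(_.split(","))
          .map(f => Txn(f(0).toInt, f(1), f(2).toDouble))
          .toDF()

        txns.registerTempTable("txns")   // make the dataset queryable from Spark SQL
        sqlContext.sql("SELECT customer, SUM(amount) FROM txns GROUP BY customer").show()
      }
    }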

Environment: HDFS, YARN, Sqoop, Spark SQL, SQL Scripting, Cloudera CDH 5.x, Spark-shell, Hive

Confidential, MI

Hadoop Developer

Responsibilities:

  • Fetched data into the Hadoop Distributed File System and analyzed it with MapReduce, Pig, and Hive to find the top-rated links based on user comments, likes, etc.
  • Extracted the needed data from the server into HDFS and bulk-loaded the cleaned data into HBase.
  • Implemented MapReduce programs to handle semi-structured and unstructured data such as XML, JSON, and Avro data files, and sequence files for log files.
  • Pushed the output to HBase and then fed it into Pig, which splits the data into different parts.
  • Implemented a CDH3 Hadoop cluster on CentOS and assisted with performance tuning and monitoring.
  • Used Hive to analyze data ingested into HBase and to compute various metrics for reporting on the dashboard.
  • Integrated MapReduce with HBase to import bulk amounts of data into HBase using MapReduce programs.
  • Used MapReduce to convert semi-structured XML data into a structured format and to categorize the user rating as positive or negative for each of the thousands of links (see the sketch after this list).
  • Developed job flows in Oozie to automate the workflow for Pig and Hive jobs.
  • Used Hive queries to analyze the data further and pushed the output into a relational database (RDBMS) using Sqoop.
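
A minimal sketch of the rating-categorization mapper described above, written in Scala against the Hadoop MapReduce API; the tab-separated "linkId, rating" input layout is an assumption, and a standard summing reducer would aggregate the emitted counts:

    import org.apache.hadoop.io.{IntWritable, LongWritable, Text}
    import org.apache.hadoop.mapreduce.Mapper

    // Emits (linkId:positive, 1) or (linkId:negative, 1) for each record;
    // the "linkId<TAB>rating" input layout is hypothetical.
    class RatingSentimentMapper extends Mapper[LongWritable, Text, Text, IntWritable] {
      private val one = new IntWritable(1)

      override def map(key: LongWritable, value: Text,
                       ctx: Mapper[LongWritable, Text, Text, IntWritable]#Context): Unit = {
        val fields = value.toString.split("\t")
        if (fields.length == 2) {
          val label = if (fields(1).toInt >= 3) "positive" else "negative"
          ctx.write(new Text(s"${fields(0)}:$label"), one)
        }
      }
    }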

Environment: Hadoop, HDFS, Pig, Hive, MapReduce, Sqoop, Oozie, LINUX, Hue, HBase

Confidential

Hadoop Developer

Responsibilities:

  • Worked on analyzing the Hadoop cluster using different big data analytics tools, including Pig, Hive, and MapReduce.
  • Used compression techniques such as Snappy to save storage and optimize data transfer over the network.
  • Wrote Pig scripts that take log files as input, parse the logs, and structure them in a tabular format to facilitate effective querying of the log data.
  • Stored the data in tabular formats using Hive tables and Hive SerDes (see the sketch after this list).
  • Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Processed and analyzed the data from Hive tables using HiveQL.
  • Collected and aggregated large amounts of log data from multiple projects using Apache Flume and staged the data in HDFS for further analysis.
  • Processed structured data using Pig and Hive.
  • Gained experience managing and reviewing Hadoop log files.
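
A minimal sketch of the SerDe-backed Hive table described above; the column layout, regex, and HDFS location are assumptions, and the DDL is issued through a Spark HiveContext (listed elsewhere in this resume) simply to keep the example in Scala:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    object LogTableSetup {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("log-table-setup"))
        val hc = new HiveContext(sc)

        // External table over raw access logs, parsed at read time by Hive's
        // built-in RegexSerDe (which yields string columns only); the layout,
        // regex, and location are hypothetical.
        hc.sql("""CREATE EXTERNAL TABLE IF NOT EXISTS web_logs (
                 |  host STRING, ts STRING, url STRING, status STRING)
                 |ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
                 |WITH SERDEPROPERTIES (
                 |  "input.regex" = "(\\S+) (\\S+) (\\S+) (\\d+)")
                 |LOCATION '/data/logs/raw'""".stripMargin)

        // HiveQL analysis over the structured view of the logs
        hc.sql("SELECT status, COUNT(*) AS hits FROM web_logs GROUP BY status").show()
      }
    }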

Environment: Hadoop, HDFS, Pig, Hive, MapReduce, Sqoop, Oozie, LINUX, Hue

Confidential

Java Developer

Responsibilities:

  • Involved in the complete Software Development Life Cycle (SDLC) of the project, from analysis and design through programming, testing, and deployment of the application.
  • Used JavaServer Faces (JSF) and JavaServer Pages (JSP) to develop UI pages.
  • Developed HTML and JavaScript prototypes for the savings re-engineering screens.
  • Developed a web application using the Chordiant MVC framework.
  • Generated a web service client from a Web Services Description Language (WSDL) file.
  • Used JUnit to unit test code changes (see the sketch after this list).
  • Involved in testing the application at various levels, including unit and integration testing.
  • Provided support for SIT and UAT.
  • Used Rational ClearCase / ClearQuest for source control and defect management.
  • Submitted unit-tested code for review and rectified the concerns raised.
  • Interacted with business people, SMEs, onshore partners, and other supporting teams through e-mail, voice calls, video conferences, and instant messaging to ensure delivery of an optimized solution to the customer.
  • Fixed defects found in the SIT phase immediately and provided builds with minimal turnaround time.
  • Used Quality Center (QC) for requirements and bug tracking.
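
A minimal sketch of the kind of JUnit unit test described above; the SavingsCalculator class and its behavior are hypothetical, and the test is written in Scala to keep this resume's examples in a single language:

    import org.junit.Assert.assertEquals
    import org.junit.Test

    // Hypothetical class under test: applies a yearly rate in monthly steps
    class SavingsCalculator(val yearlyRatePercent: Double) {
      def monthEnd(balance: Double): Double =
        balance * (1 + yearlyRatePercent / 100 / 12)
    }

    class SavingsCalculatorTest {
      @Test def interestIsAppliedMonthly(): Unit = {
        val calc = new SavingsCalculator(yearlyRatePercent = 1.2)
        assertEquals(100.10, calc.monthEnd(100.00), 0.005)
      }
    }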

Environment: Core Java, Chordiant MVC, JSF, JSP, Servlets, Oracle SQL, Windows XP/Vista, RAD, TOAD, WebSphere Application Server

Confidential

Java Developer

Responsibilities:

  • Involved in developing an easy-to-use user interface and a step-by-step process that helps create eye-catching e-mail campaigns.
  • Designed and developed web-based software using the JavaServer Faces (JSF) framework.
  • Handled data access through JDBC connections.
  • Used Java beans to hold business logic as the model and servlets to control the application flow as the controller (see the sketch after this list).
  • Used Java servlets as the common gateway between the client and the server.
  • Developed responsive front-end components using HTML, CSS, JSP tags, and JavaScript.
  • Deployed code on IBM WebSphere Application Server.
  • Performed unit testing using JUnit.
  • Involved in code reviews, defect fixing, and UAT support.
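
A minimal sketch of the servlet-as-controller, bean-as-model flow described above, written in Scala for consistency with the rest of this resume's examples; the CampaignBean, parameter name, and JSP views are hypothetical:

    import javax.servlet.http.{HttpServlet, HttpServletRequest, HttpServletResponse}

    // Hypothetical model bean carrying the e-mail campaign data
    class CampaignBean {
      var subject: String = ""
      def isValid: Boolean = subject.nonEmpty
    }

    // Controller servlet: populates the model, then forwards to a JSP view
    class CampaignController extends HttpServlet {
      override def doPost(req: HttpServletRequest, resp: HttpServletResponse): Unit = {
        val bean = new CampaignBean
        bean.subject = Option(req.getParameter("subject")).getOrElse("")
        req.setAttribute("campaign", bean)
        val view = if (bean.isValid) "/preview.jsp" else "/edit.jsp"
        req.getRequestDispatcher(view).forward(req, resp)
      }
    }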

Environment: HTML, CSS, Servlets, JSF, JSP, JUnit, Oracle 11g, Eclipse, JavaScript, Core Java
