
Hadoop/Spark Developer Resume


San Diego, CA

SUMMARY:

  • IT professional with 8+ years of experience in Java, J2EE, and Big Data technologies, including 4+ years of strong experience with the Hadoop stack.
  • Good understanding of HDFS design, daemons, federation, and HDFS High Availability (HA).
  • Well versed in designing and implementing MapReduce jobs in Java (Eclipse) to solve real-world scaling problems.
  • Hands-on experience processing structured and unstructured data in various file formats (XML, JSON, SequenceFile) using MapReduce programs.
  • Extensive experience writing HiveQL queries to perform analytics on structured data.
  • Expertise in data load management, importing and exporting data using Sqoop and Flume.
  • Experience developing Spark applications using the Spark Core, Spark SQL, and Spark Streaming APIs.
  • Experience managing scalable Hadoop clusters, including cluster design, provisioning, custom configuration, monitoring, and maintenance, on the Cloudera CDH distribution.
  • Hands-on experience developing coding standards and cloud-computing standards for enterprise big data applications on Cloudera.
  • Good experience using Apache Spark, Storm, and Kafka.
  • Good knowledge of the Spark framework for both batch and real-time data processing.
  • Hands-on experience creating Hive UDFs to meet project requirements and to handle JSON and XML files (see the sketch following this summary).
  • Strong record of delivering full-SDLC projects using big data technologies such as Hadoop, Oozie, and NoSQL stores.
  • Extensive knowledge of cluster setup and operation, monitoring, data analytics, sentiment analysis, predictive analytics, and data presentation in the big data space.
  • Excellent understanding of NoSQL databases such as HBase.
  • Experience in Java programming, with skills in analysis, design, testing, and deployment across J2EE, JavaScript, data structures, JDBC, HTML, XML, JUnit, and jQuery.
  • Excellent interpersonal and communication skills; creative, research-minded, technically competent, and results-oriented, with strong problem-solving and leadership skills.
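
To illustrate the Hive UDF work noted above, here is a minimal sketch in Scala (the UDFs described in this resume were written in Java; Scala is used for consistency with the other sketches in this document). The class name, field-extraction logic, and regex shortcut are illustrative assumptions; real JSON/XML handling would use a proper parser.

    import org.apache.hadoop.hive.ql.exec.UDF
    import org.apache.hadoop.io.Text

    // Hypothetical UDF: pulls one top-level string field out of a JSON
    // document with a regex -- a simplified stand-in for real JSON parsing.
    class ExtractJsonField extends UDF {
      // Hive calls evaluate() once per row; returning null on bad input
      // is the conventional null-safe behavior.
      def evaluate(json: Text, field: Text): Text = {
        if (json == null || field == null) return null
        val pattern = ("\"" + field.toString + "\"\\s*:\\s*\"([^\"]*)\"").r
        pattern.findFirstMatchIn(json.toString)
          .map(m => new Text(m.group(1)))
          .orNull
      }
    }

Compiled into a JAR, such a function would be registered in Hive with "ADD JAR /path/to/udfs.jar;" and "CREATE TEMPORARY FUNCTION extract_json_field AS 'ExtractJsonField';", then invoked like any built-in function.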

TECHNICAL SKILLS:

Big Data Ecosystems: MapReduce, Hive, Sqoop, Spark, Kafka, Pig, Flume, HBase, Oozie

Streaming Technologies: Spark Streaming, Storm

Scripting Languages: Python, Bash, JavaScript, HTML5, CSS3

Programming Languages: Java, Scala, SQL, PL/SQL

Java/J2EE Technologies: Servlets, JSP, JSF, JUnit, Hibernate, Log4J, EJB, JDBC, JMS, JNDI

Databases: Oracle, MySQL (RDBMS); HBase (NoSQL)

IDEs / Tools: Eclipse, JUnit, Maven, Ant, MS Visual Studio, NetBeans

Methodologies: Agile, Waterfall

PROFESSIONAL EXPERIENCE:

Confidential - San Diego, CA

Hadoop/Spark Developer

Responsibilities:

  • Worked on migrating MapReduce programs into Spark transformations using Spark and Scala.
  • Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
  • Imported data from sources such as HDFS and HBase into Spark RDDs and developed a data pipeline using Kafka to store data in HDFS; performed real-time analysis on the incoming data.
  • Worked extensively with Sqoop for importing and exporting the data from HDFS to Relational Database systems like Oracle and MySQL.
  • Involved in converting Hive or SQL queries into Spark transformations using Python and Scala.
  • Built a Kafka REST API to collect events from the front end.
  • Built a real-time pipeline for streaming data using Kafka and Spark Streaming (see the sketch after this list).
  • Explored Spark to improve the performance and optimization of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, and pair RDDs.
  • Optimized performance on large datasets using partitioning, Spark broadcast variables, and efficient joins and transformations during the ingestion process (see the join-tuning sketch below).
  • Used Spark for interactive queries, processing of streaming data, and integration with the HBase database for high data volumes.
  • Stored the data in tabular formats using Hive tables and Hive SerDes.
  • Used Spark API over Hadoop YARN to perform analytics on data in Hive.
  • Developed custom Writable MapReduce Java programs to load web server logs into HBase using Flume.
  • Redesigned the HBase tables to improve the performance according to the query requirements.
  • Developed MapReduce jobs to convert data files into Parquet file format.
  • Developed Hive queries for data sampling and analysis for the analysts.
  • Executed Hive queries that helped in analysis of trends by comparing the new data with existing data warehouse reference tables and historical data.
  • Developed Hive user-defined functions in Java, compiled them into JARs, added them to HDFS, and executed them from Hive queries.
  • Worked on Sequence files, ORC files, Map side joins, bucketing, partitioning for Hive performance enhancement and storage improvement.
  • Used the Oozie engine to create workflow and coordinator jobs that schedule and execute various Hadoop jobs, such as MapReduce, Hive, Spark, and Sqoop operations.

  • Configured Oozie workflows to run multiple Hive jobs independently, triggered by time and data availability.
  • Optimized MapReduce code through performance tuning and analysis.
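
As a sketch of the Kafka-to-HDFS streaming pipeline described above, assuming the spark-streaming-kafka-0-10 integration (broker address, topic name, batch interval, and output path are illustrative, not taken from the project):

    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka010.KafkaUtils
    import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
    import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

    object EventPipeline {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("kafka-to-hdfs")
        val ssc  = new StreamingContext(conf, Seconds(10)) // 10-second micro-batches

        val kafkaParams = Map[String, Object](
          "bootstrap.servers" -> "broker1:9092", // assumed broker address
          "key.deserializer" -> classOf[StringDeserializer],
          "value.deserializer" -> classOf[StringDeserializer],
          "group.id" -> "event-pipeline")

        val stream = KafkaUtils.createDirectStream[String, String](
          ssc, PreferConsistent, Subscribe[String, String](Seq("events"), kafkaParams))

        // Land each micro-batch on HDFS; the real job also ran
        // analysis on the incoming stream.
        stream.map(_.value).foreachRDD { (rdd, time) =>
          if (!rdd.isEmpty)
            rdd.saveAsTextFile(s"hdfs:///data/raw/events/${time.milliseconds}")
        }

        ssc.start()
        ssc.awaitTermination()
      }
    }

createDirectStream maps Kafka partitions one-to-one onto RDD partitions, which keeps ingestion parallelism aligned with the topic layout.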

Environment: CDH5, Eclipse, CentOS Linux, HDFS, MapReduce, Kafka, Python, Scala, Parquet, Hive, Sqoop, Spark, Spark SQL, Spark Streaming, HBase, Oracle, Oozie, Red Hat Linux
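
The join-tuning sketch referenced above: a hedged Spark SQL example of a broadcast join plus repartitioning. Table names, the join key, and the output path are hypothetical, and the DataFrame Parquet write stands in for the MapReduce-based Parquet conversion the bullets describe.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.broadcast

    object JoinTuning {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder
          .appName("join-tuning")
          .enableHiveSupport()
          .getOrCreate()

        val events = spark.table("warehouse.events")     // large fact table (assumed name)
        val geo    = spark.table("warehouse.geo_lookup") // small dimension table (assumed name)

        // broadcast() ships the small side to every executor,
        // avoiding a shuffle of the large table.
        val joined = events.join(broadcast(geo), Seq("region_id"))

        // Repartitioning on the key keeps output files balanced during ingestion.
        joined.repartition(200, joined("region_id"))
          .write.mode("overwrite")
          .parquet("hdfs:///data/curated/events_by_region") // assumed path

        spark.stop()
      }
    }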

Confidential - Minneapolis, MN

Hadoop/Spark Developer

Responsibilities:

  • Involved in installation & configuration of a Hadoop cluster along with Hive.
  • Developed Spark scripts using the Scala shell per requirements.
  • Worked with the Spark Core, Spark Streaming, and Spark SQL modules.
  • Developed multiple POCs using Spark, deployed them on the YARN cluster, and compared the performance of Spark with Hive and SQL/Teradata.
  • Developed Kafka producers and consumers, Cassandra clients, and Spark components on top of HDFS and Hive (see the producer sketch below).
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python, and Scala (see the sketch after this list).
  • Developed the code for Importing and exporting data into HDFS and Hive using Sqoop.
  • Responsible for writing HiveQL queries to analyze data in the Hive warehouse.
  • Involved in defining Oozie job flows that schedule and manage Apache Hadoop jobs as directed acyclic graphs of actions.
  • Developed Hive user-defined functions in Java, compiled them into JARs, added them to HDFS, and executed them from Hive queries.
  • Involved in managing and reviewing Hadoop log files.
  • Tested and reported defects within an Agile methodology.
  • Installed Hadoop ecosystem components (Hive, Pig, Sqoop, HBase, Oozie) on top of the Hadoop cluster.
  • Involved in importing data from SQL databases into HDFS and Hive for analytical purposes.
  • Implemented the workflows using Oozie framework to automate tasks.
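
As a sketch of the Hive-to-Spark conversions in this role, the same aggregation is expressed once as HiveQL through spark.sql and once as a transformation chain. Database, table, and column names are assumptions, and since the original work targeted Spark RDDs, the DataFrame form shown here is a simplification.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{col, sum}

    object HiveToSpark {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder
          .appName("hive-to-spark")
          .enableHiveSupport()
          .getOrCreate()

        // The query as it might have run against the Hive warehouse:
        val viaSql = spark.sql(
          """SELECT customer_id, SUM(amount) AS total
            |FROM sales.transactions
            |WHERE year = 2016
            |GROUP BY customer_id""".stripMargin)

        // The same logic as a Spark transformation chain:
        val viaApi = spark.table("sales.transactions")
          .filter(col("year") === 2016)
          .groupBy("customer_id")
          .agg(sum("amount").as("total"))

        viaApi.show(20)
        spark.stop()
      }
    }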

Environment: Hadoop, HDFS, Spark, MapReduce, Hive, Oozie, Java, NoSQL, Cloudera, MySQL, SQL
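
The producer sketch referenced above: a minimal Kafka producer in Scala (the original clients were likely written in Java; broker address, topic, key, and payload are illustrative):

    import java.util.Properties
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

    object EventProducer {
      def main(args: Array[String]): Unit = {
        val props = new Properties()
        props.put("bootstrap.servers", "broker1:9092") // assumed broker
        props.put("key.serializer",
          "org.apache.kafka.common.serialization.StringSerializer")
        props.put("value.serializer",
          "org.apache.kafka.common.serialization.StringSerializer")

        val producer = new KafkaProducer[String, String](props)
        try {
          // Hypothetical topic, key, and payload.
          producer.send(new ProducerRecord[String, String](
            "events", "user-42", """{"action":"click"}"""))
        } finally {
          producer.close() // close() flushes any buffered records
        }
      }
    }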

Confidential - Houston, TX

Hadoop Developer

Responsibilities:

  • Developed parser and loader MapReduce application to retrieve data from HDFS and store to HBase and Hive.
  • Imported the unstructured data into the HDFS using Flume.
  • Used Oozie to orchestrate the MapReduce jobs that extract the data in a timely manner.
  • Wrote MapReduce java programs to analyze the log data for large-scale data sets.
  • Extensively used the HBase Java API in the Java application (see the sketch after this list).
  • Automated jobs that extract data from data sources such as MySQL and push the result sets to the Hadoop Distributed File System.
  • Implemented MapReduce jobs using the Java API as well as Pig Latin and HiveQL.
  • Participated in the setup and deployment of Hadoop cluster.
  • Involved in design & development of an application using Hive (UDF).
  • Handled importing and exporting data between MySQL/Oracle and Hive using Sqoop.
  • Designed and built many applications to deal with vast amounts of data flowing through multiple Hadoop clusters, using Pig Latin and Java-based MapReduce.
  • Specified the cluster size, allocated resource pools, and configured the Hadoop distribution by writing specifications in JSON format.
  • Responsible for defining the data flow within the Hadoop ecosystem and directing the team in implementing it.
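
A minimal sketch of the HBase client usage in this role, writing and reading back a parsed web-server log record, in Scala for consistency with the other sketches (the original application used the Java API directly; the table name, column family, and row-key scheme are assumptions, not the project's actual design):

    import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
    import org.apache.hadoop.hbase.client.{ConnectionFactory, Get, Put}
    import org.apache.hadoop.hbase.util.Bytes

    object LogStore {
      def main(args: Array[String]): Unit = {
        val conf = HBaseConfiguration.create() // reads hbase-site.xml from the classpath
        val conn = ConnectionFactory.createConnection(conf)
        val table = conn.getTable(TableName.valueOf("web_logs")) // assumed table
        try {
          // Composite row key: date # host # sequence (an assumed scheme).
          val rowKey = Bytes.toBytes("2014-06-01#host01#000042")
          val put = new Put(rowKey)
          put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("status"), Bytes.toBytes("200"))
          put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("url"), Bytes.toBytes("/index.html"))
          table.put(put)

          val result = table.get(new Get(rowKey))
          println(Bytes.toString(result.getValue(Bytes.toBytes("d"), Bytes.toBytes("status"))))
        } finally {
          table.close()
          conn.close()
        }
      }
    }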

Environment: Hadoop, Hive, Hue, ZooKeeper, MapReduce, Sqoop, Pig, HCatalog, UNIX, Java, Eclipse, Oracle, SQL Server, MySQL

Confidential - Houston, TX

Java Developer

Responsibilities:

  • Followed Agile Rational Unified Process throughout the lifecycle of the project.
  • Involved in requirements analysis and gathering and converting them into technical specifications using UML diagrams.
  • Implemented core framework components for executing workflows using Core Java, JDBC, and Servlets & JSPs.
  • Created responsive designs (mobile/tablet/desktop) using HTML5 & CSS3.
  • Applied object-oriented concepts (inheritance, composition, interfaces) and design patterns (Singleton, Strategy).
  • Applied the Spring IoC container to facilitate dependency injection (see the sketch after this list).
  • Used Spring AOP to implement security where cross-cutting concerns were identified.
  • Responsible for developing SOAP based Web Services consuming and packaging using Axis.
  • Involved in design and decision making for Hibernate ORM mapping.
  • Responsible for designing front end system using JSP, HTML, jQuery and JavaScript.
  • Involved in Managing Web Services and operations.
  • Used Maven as the build tool to manage the dependencies of the project.
  • Implemented Stored Procedures for the tables in the database.
  • Developed and performed mock testing and unit testing using JUnit and EasyMock.
  • Involved in implementing continuous integration using Jenkins.
  • Built the project using Apache Maven build scripts.
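
A minimal sketch of the Spring IoC wiring described above, using annotation-based configuration and shown in Scala for consistency with the other sketches (the original project was Java and may well have used XML bean definitions; all class names are hypothetical):

    import org.springframework.context.annotation.{AnnotationConfigApplicationContext, Bean, Configuration}

    // Hypothetical collaborators wired through the IoC container.
    trait Notifier { def send(msg: String): Unit }
    class EmailNotifier extends Notifier {
      def send(msg: String): Unit = println(s"email: $msg")
    }
    // The dependency arrives via the constructor; the service never
    // instantiates its collaborator itself.
    class OrderService(notifier: Notifier) {
      def place(id: String): Unit = notifier.send(s"order $id placed")
    }

    @Configuration
    class AppConfig {
      @Bean def notifier: Notifier = new EmailNotifier
      @Bean def orderService: OrderService = new OrderService(notifier)
    }

    object Main {
      def main(args: Array[String]): Unit = {
        val ctx = new AnnotationConfigApplicationContext(classOf[AppConfig])
        ctx.getBean(classOf[OrderService]).place("A-100")
        ctx.close()
      }
    }

The container constructs OrderService and injects the Notifier bean, so swapping the notification strategy requires no change to the service itself.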

Environment: Java 1.6/J2EE, Microsoft Visio, WebSphere Application Server, Spring MVC, IoC, Spring AOP, Apache Axis, SOAP, Hibernate, Web Services, Maven, jQuery, JUnit, EasyMock, Git, Jenkins

Confidential

Java Developer

Responsibilities:

  • Involved in various phases of Software Development Life Cycle (SDLC).
  • Used Rational Rose for the Use Case Diagrams, Object Diagrams, class Diagrams and Sequence diagrams to represent the detailed design phase.
  • Designed the front end using HTML, CSS, JSP, Servlets, Ajax, and Struts.
  • Involved in writing the exception and validation classes using Struts validation rules.
  • Used JavaScript for the web page validation.
  • Used SOAP for Web Services by exchanging XML data between applications over HTTP.
  • Created Servlets that route submittals to the appropriate Enterprise JavaBeans (EJB) components and render the retrieved information (see the sketch after this list).
  • Wrote Ant scripts for building application artifacts.
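
A minimal sketch of the servlet-routing pattern described above, in Scala for consistency with the other sketches (the original code was Java; the request parameter and the stubbed lookup are hypothetical, standing in for the EJB session-bean call):

    import javax.servlet.http.{HttpServlet, HttpServletRequest, HttpServletResponse}

    // Routes a form submittal to a backing component and renders the result.
    class SubmittalServlet extends HttpServlet {
      override def doPost(req: HttpServletRequest, resp: HttpServletResponse): Unit = {
        val accountId = req.getParameter("accountId") // hypothetical form field
        val status = lookupStatus(accountId)          // stub standing in for the EJB call
        resp.setContentType("text/html")
        resp.getWriter.println(s"<p>Status for $accountId: $status</p>")
      }

      private def lookupStatus(id: String): String = "ACTIVE"
    }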

Environment: Java, J2EE, Servlets, Struts, JSP, XML, DOM, HTML, JavaScript, JDBC, Web Services, Eclipse Plug-ins.
