Hadoop/Spark Developer Resume
San Diego, CA
SUMMARY:
- IT professional with 8+ years of experience in Java, J2EE, and Big Data technologies, including 4+ years of strong experience with the Big Data and Hadoop stack.
- Good understanding of HDFS design, daemons, federation, and HDFS High Availability (HA).
- Well versed in designing and implementing MapReduce jobs in Java (Eclipse) to solve real-world scaling problems.
- Hands-on experience processing structured and unstructured data in various file formats, such as XML, JSON, and sequence files, using MapReduce programs.
- Extensive experience writing HiveQL queries to perform analytics on structured data.
- Expertise in data load management, importing and exporting data using Sqoop and Flume.
- Experience developing Spark applications using the Spark Core, Spark SQL, and Spark Streaming APIs.
- Experience managing scalable Hadoop clusters, including cluster design, provisioning, custom configuration, monitoring, and maintenance, using the Cloudera CDH distribution.
- Hands-on experience developing coding standards for Cloudera and cloud computing standards for enterprise big data applications.
- Good experience using Apache Spark, Storm, and Kafka.
- Good knowledge of the Spark framework for both batch and real-time data processing.
- Hands-on experience creating Hive UDFs to meet requirements and to handle JSON and XML files.
- Strong record of delivering full-SDLC projects using big data technologies such as Hadoop, Oozie, and NoSQL databases.
- Extensive knowledge of setting up and running clusters, monitoring, data analytics, sentiment analysis, predictive analytics, and data presentation in the big data world.
- Excellent understanding of NoSQL databases like HBase.
- Experience in Java programming, with skills in analysis, design, testing, and deployment using technologies such as J2EE, JavaScript, data structures, JDBC, HTML, XML, JUnit, and jQuery.
- Excellent interpersonal and communication skills; creative, research-minded, technically competent, and results-oriented, with strong problem-solving and leadership skills.
TECHNICAL SKILLS:
Big Data Ecosystems: MapReduce, Hive, Sqoop, Spark, Kafka, Pig, Flume, HBase, Oozie
Streaming Technologies: Spark Streaming, Storm
Scripting Languages: Python, Bash, JavaScript, HTML5, CSS3
Programming Languages: Java, Scala, SQL, PL/SQL
Java/J2EE Technologies: Servlets, JSP, JSF, JUnit, Hibernate, Log4J, EJB, JDBC, JMS, JNDI
Databases: Oracle, MySQL (RDBMS); HBase, Cassandra (NoSQL)
IDEs / Tools: Eclipse, JUnit, Maven, Ant, MS Visual Studio, NetBeans
Methodologies: Agile, Waterfall
PROFESSIONAL EXPERIENCE:
Confidential - San Diego, CA
Hadoop/Spark Developer
Responsibilities:
- Worked on migrating MapReduce programs into Spark transformations using Spark and Scala.
- Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
- Imported data from sources such as HDFS and HBase into Spark RDDs, and developed a Kafka-based data pipeline to land data in HDFS; performed real-time analysis on the incoming data.
- Worked extensively with Sqoop for importing and exporting data between HDFS and relational database systems such as Oracle and MySQL.
- Involved in converting Hive or SQL queries into Spark transformations using Python and Scala.
- Built a Kafka REST API to collect events from the front end.
- Built a real-time pipeline for streaming data using Kafka and Spark Streaming (see the sketch after this list).
- Explored Spark and improved the performance and optimization of existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, and pair RDDs.
- Optimized performance on large datasets using partitioning, Spark broadcast variables, efficient joins, and transformations during the ingestion process.
- Used Spark for interactive queries, processing of streaming data, and integration with HBase for high-volume data.
- Stored the data in tabular formats using Hive tables and Hive SerDes.
- Used Spark API over Hadoop YARN to perform analytics on data in Hive.
- Developed MapReduce Java programs with custom Writables to load web server logs into HBase using Flume.
- Redesigned the HBase tables to improve the performance according to the query requirements.
- Developed MapReduce jobs to convert data files into the Parquet file format.
- Developed Hive queries for data sampling and analysis for the analysts.
- Executed Hive queries that helped in analysis of trends by comparing the new data with existing data warehouse reference tables and historical data.
- Developed Hive user-defined functions (UDFs) in Java, compiled them into JARs, added them to HDFS, and executed them from Hive queries.
- Worked on sequence files, ORC files, map-side joins, bucketing, and partitioning for Hive performance enhancement and storage improvement.
- Used the Oozie engine to create workflow and coordinator jobs that schedule and execute various Hadoop jobs, such as MapReduce, Hive, Spark, and Sqoop operations.
- Configured Oozie workflows to run multiple Hive jobs that run independently, triggered by time and data availability.
- Optimized MapReduce code and performed performance tuning and analysis.
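A minimal sketch of the Kafka-to-Spark-Streaming-to-HDFS pipeline described above, assuming the spark-streaming-kafka-0-10 integration; the broker address, topic name, and output path are illustrative placeholders, not values from the original project.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class KafkaToHdfsPipeline {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("KafkaToHdfsPipeline");
        // 10-second micro-batches for near-real-time processing.
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092"); // illustrative broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "event-pipeline");
        kafkaParams.put("auto.offset.reset", "latest");

        JavaInputDStream<ConsumerRecord<String, String>> stream =
            KafkaUtils.createDirectStream(
                jssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(
                    Collections.singletonList("events"), kafkaParams)); // illustrative topic

        // Land each micro-batch in HDFS for downstream Hive/Spark analysis;
        // filters and aggregations would be applied to the stream before the write.
        stream.map(ConsumerRecord::value)
              .foreachRDD((rdd, time) -> {
                  if (!rdd.isEmpty()) {
                      rdd.saveAsTextFile("hdfs:///data/events/" + time.milliseconds());
                  }
              });

        jssc.start();
        jssc.awaitTermination();
    }
}
```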
Environment: CDH5, Eclipse, CentOS Linux, HDFS, MapReduce, Kafka, Python, Scala, Parquet, Hive, Sqoop, Spark, Spark SQL, Spark Streaming, HBase, Oracle, Oozie, Red Hat Linux
Confidential - Minneapolis, MN
Hadoop/Spark Developer
Responsibilities:
- Involved in the installation and configuration of a Hadoop cluster along with Hive.
- Developed Spark scripts using the Scala shell as per requirements.
- Worked with the Spark Core, Spark Streaming, and Spark SQL modules of Spark.
- Developed multiple POCs using Spark, deployed them on the YARN cluster, and compared the performance of Spark with Hive and SQL/Teradata.
- Developed Kafka producers and consumers, Cassandra clients, and Spark jobs, along with components on HDFS and Hive.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python and Scala.
- Developed code for importing and exporting data into HDFS and Hive using Sqoop.
- Responsible for writing Hive queries to analyze data in the Hive warehouse using HiveQL.
- Involved in defining job flows using Oozie to schedule and manage Apache Hadoop jobs as directed acyclic graphs of actions.
- Developed Hive user-defined functions in Java, compiled them into JARs, added them to HDFS, and executed them from Hive queries (a minimal sketch follows this list).
- Involved in managing and reviewing Hadoop log files.
- Tested and reported defects following an Agile methodology.
- Installed Hadoop ecosystem components (Hive, Pig, Sqoop, HBase, Oozie) on top of the Hadoop cluster.
- Involved in importing data from SQL databases into HDFS and Hive for analytical purposes.
- Implemented the workflows using Oozie framework to automate tasks.
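A minimal sketch of a Hive UDF of the kind described above, written in Java against the classic org.apache.hadoop.hive.ql.exec.UDF API; the function name, class name, and JAR path are illustrative, not from the original project.

```java
import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical example UDF: normalizes a string to trimmed, lower-cased form.
@Description(name = "normalize_str",
             value = "_FUNC_(str) - returns str trimmed and lower-cased")
public final class NormalizeStr extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null; // preserve SQL NULL semantics
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}

// Register in Hive after packaging into a JAR (paths illustrative):
//   ADD JAR hdfs:///udfs/normalize-str.jar;
//   CREATE TEMPORARY FUNCTION normalize_str AS 'NormalizeStr';
//   SELECT normalize_str(customer_name) FROM customers;
```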
Environment: Hadoop, HDFS, Spark, MapReduce, Hive, Oozie, Java, NoSQL, Cloudera, MySQL, SQL
Confidential - Houston, TX
Hadoop Developer
Responsibilities:
- Developed parser and loader MapReduce applications to retrieve data from HDFS and store it in HBase and Hive.
- Imported the unstructured data into the HDFS using Flume.
- Used Oozie to orchestrate the MapReduce jobs that extract the data on a scheduled basis.
- Wrote MapReduce Java programs to analyze log data for large-scale data sets (see the sketch after this list).
- Extensively used the HBase Java API in Java applications.
- Automated jobs that extract data from different data sources, such as MySQL, and push the result sets to the Hadoop Distributed File System.
- Implemented MapReduce jobs using the Java API, Pig Latin, and HiveQL.
- Participated in the setup and deployment of Hadoop cluster.
- Involved in the design and development of an application using Hive UDFs.
- Handled importing and exporting data between MySQL/Oracle and Hive using Sqoop.
- Designed and built many applications to deal with vast amounts of data flowing through multiple Hadoop clusters, using Pig Latin and Java-based MapReduce.
- Specified cluster size, allocated resource pools, and configured the Hadoop distribution by writing the specifications in JSON format.
- Responsible for defining the data flow within the Hadoop ecosystem and directing the team in implementing it.
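A minimal sketch of the kind of log-analysis MapReduce Java program described above: it counts requests per HTTP status code, assuming a space-delimited access-log layout with the status code in the ninth field (an illustrative format, not the original one).

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class LogStatusCount {

    // Mapper: emits (HTTP status code, 1) for each log line.
    public static class StatusMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text status = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(" ");
            if (fields.length > 8) {
                status.set(fields[8]); // status code field in the assumed layout
                context.write(status, ONE);
            }
        }
    }

    // Reducer (also used as combiner): sums the counts per status code.
    public static class SumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "log status count");
        job.setJarByClass(LogStatusCount.class);
        job.setMapperClass(StatusMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```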
Environment: Hadoop, Hive, Hue, ZooKeeper, MapReduce, Sqoop, Pig, HCatalog, UNIX, Java, Eclipse, Oracle, SQL Server, MySQL
Confidential - Houston, TX
Java Developer
Responsibilities:
- Followed the Agile Rational Unified Process throughout the project lifecycle.
- Involved in requirements gathering and analysis, and in converting requirements into technical specifications using UML diagrams.
- Implemented core framework components for executing workflows using Core Java, JDBC, Servlets, and JSPs.
- Created responsive designs (mobile/tablet/desktop) using HTML5 and CSS3.
- Applied object-oriented concepts (inheritance, composition, interfaces) and design patterns (Singleton, Strategy).
- Applied the Spring IoC container to facilitate dependency injection (a minimal sketch follows this list).
- Used Spring AOP to implement security where cross-cutting concerns were identified.
- Responsible for developing, consuming, and packaging SOAP-based web services using Axis.
- Involved in design and decision making for Hibernate ORM mapping.
- Responsible for designing front end system using JSP, HTML, jQuery and JavaScript.
- Involved in Managing Web Services and operations.
- Used Maven as the build tool to manage the dependencies of the project.
- Implemented Stored Procedures for the tables in the database.
- Developed and performed mock testing and unit testing using JUnit and EasyMock.
- Involved in implementing continuous integration using Jenkins.
- Built the project using Apache Maven build scripts.
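A minimal sketch of dependency injection through the Spring IoC container as described above; the workflow classes and bean definitions are hypothetical stand-ins for the project's components.

```java
import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Hypothetical repository interface the workflow executor depends on.
interface WorkflowRepository {
    String findDefinition(String workflowId);
}

class JdbcWorkflowRepository implements WorkflowRepository {
    @Override
    public String findDefinition(String workflowId) {
        // In the real application this would query the database via JDBC.
        return "definition-for-" + workflowId;
    }
}

class WorkflowExecutor {
    private final WorkflowRepository repository;

    WorkflowExecutor(WorkflowRepository repository) {
        this.repository = repository; // injected by the container, not constructed here
    }

    void execute(String workflowId) {
        System.out.println("Executing " + repository.findDefinition(workflowId));
    }
}

@Configuration
class AppConfig {
    @Bean
    WorkflowRepository workflowRepository() {
        return new JdbcWorkflowRepository();
    }

    @Bean
    WorkflowExecutor workflowExecutor(WorkflowRepository repository) {
        return new WorkflowExecutor(repository); // container wires the dependency
    }
}

public class Main {
    public static void main(String[] args) {
        AnnotationConfigApplicationContext ctx =
                new AnnotationConfigApplicationContext(AppConfig.class);
        ctx.getBean(WorkflowExecutor.class).execute("order-intake");
        ctx.close();
    }
}
```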
Environment: Java 1.6/J2EE, Microsoft Visio, WebSphere Application Server, Spring MVC, Spring IoC, Spring AOP, Apache Axis, SOAP, Hibernate, web services, Maven, jQuery, JUnit, EasyMock, Git, Jenkins
Confidential
Java Developer
Responsibilities:
- Involved in various phases of Software Development Life Cycle (SDLC).
- Used Rational Rose to create the use case diagrams, object diagrams, class diagrams, and sequence diagrams representing the detailed design phase.
- Designed the front end using HTML, CSS, JSP, Servlets, Ajax, and Struts.
- Involved in writing the exception and validation classes using Struts validation rules.
- Used JavaScript for the web page validation.
- Used SOAP for web services, exchanging XML data between applications over HTTP.
- Created Servlets that route submittals to the appropriate Enterprise JavaBeans (EJB) components and render the retrieved information (see the sketch after this list).
- Wrote Ant scripts for building application artifacts.
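A minimal sketch of a Servlet routing a submittal to an EJB component and rendering the result, written in the Servlet 3.0/EJB 3 annotation style for brevity; the OrderService interface, URL mapping, and JSP path are hypothetical.

```java
import java.io.IOException;
import javax.ejb.EJB;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical business interface, assumed to be implemented by a deployed session bean.
interface OrderService {
    String submitOrder(String customerId, String item);
}

// Routes form submittals to the EJB component and renders the retrieved result.
@WebServlet("/orders")
public class OrderServlet extends HttpServlet {

    @EJB
    private OrderService orderService; // injected by the container

    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        String confirmation = orderService.submitOrder(
                req.getParameter("customerId"),
                req.getParameter("item"));
        req.setAttribute("confirmation", confirmation);
        // Render the retrieved information through a JSP view.
        req.getRequestDispatcher("/WEB-INF/confirmation.jsp").forward(req, resp);
    }
}
```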
Environment: Java, J2EE, Servlets, Struts, JSP, XML, DOM, HTML, JavaScript, JDBC, Web Services, Eclipse Plug-ins.