
Hadoop Developer Resume


San Jose, CA

SUMMARY

  • 8+ years of professional IT experience in analysis, development, integration, and maintenance of web-based and client/server applications using Java and Big Data technologies.
  • 5+ years of experience in Hadoop development and analysis, working with technologies such as Hive, Pig, Java MapReduce, UNIX, and HDFS.
  • Strong experience working with HDFS, MapReduce, Spark, Hive, Pig, Sqoop, Flume, Kafka, Yarn, Oozie and HBase.
  • 2+ years of experience in development, Linux administration, implementation, and maintenance of web servers and distributed enterprise applications.
  • Experience in all phases of software development life cycle (SDLC), which includes User Interaction, Business Analysis/Modelling, Design/Architecture, Development, Implementation, Integration, Documentation, Testing, and Deployment.
  • Experience in analyzing business requirements and creating Hive and Pig scripts to process and aggregate data.
  • Good understanding of real-time data processing using Spark.
  • Involved in preparation of Test Plans, Test Cases & Test Scripts based on business requirements, rules, data mapping requirements and system specifications.
  • Ingested data into Hadoop HDFS using Sqoop from RDBMS sources such as Oracle, MySQL, and Microsoft SQL Server.
  • Experience in implementation of open-source frameworks such as Spring, Hibernate, and Web Services.
  • Troubleshot configuration issues in Hadoop development and production environments.
  • Experience in Continuous Integration and Continuous Deployment using tools such as Jenkins.
  • Experience in processing streaming data on clusters using Kafka and Spark Streaming.
  • Experience with databases such as PostgreSQL and MySQL, including cluster setup and writing SQL queries, triggers, and stored procedures.
  • Experience in collecting, aggregating, and moving data from various sources using Apache Flume and Kafka.
  • Very good understanding and working knowledge of object-oriented programming (OOP), Python, and Scala.
  • Experienced in improving the performance and optimizing existing Hadoop algorithms using Spark, including SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Proficient in working with NoSQL databases such as MongoDB, Cassandra, and HBase (column-family store).
  • Good knowledge of Hadoop MRv1 and MRv2 (YARN) architecture.
  • Communicated effectively with diverse client teams, both offshore and onshore, with a dedication to client satisfaction and quality outcomes; extensive experience coordinating offshore development activities.
  • Highly organized and dedicated, with strong time management skills and the ability to handle multiple tasks with a positive attitude.
  • A team player with good interpersonal, communication and leadership skills.
  • Adaptable to changing work conditions, able to consistently deliver quality work, and eager to take on new technologies and challenges.

TECHNICAL SKILLS

Big Data Ecosystems: Hadoop, Teradata, Map Reduce, Spark, HDFS, HBase, Pig, Hive, Sqoop, Oozie, Storm, Scala, Kafka and Flume.

Programming Languages: Java (J2SE, J2EE), C, C#, PL/SQL, Swift, SQL+, ASP.NET, JDBC, Python.

Web Development: JavaScript, jQuery, HTML 5.0, CSS 3.0, AJAX, JSON

Development Tools: NetBeans 8.0.2, Visual Studio 2013, Eclipse Neon, Android Studio, SQL Developer

Testing Tools: JUnit, HP Unified Functional Testing, HP Performance Center, Selenium, WinRunner, LoadRunner, QTP

UNIX Tools: Apache, Yum, RPM

Operating Systems: Windows, Linux, Ubuntu, Mac OS, Red Hat Linux

Protocols: TCP/IP, HTTP and HTTPS

Web Servers: Apache Tomcat

Cluster Management Tools: Cloudera Manager, Hortonworks, Ambari

Methodologies: Agile, V-model, Waterfall model

Databases: HBase, MongoDB, Cassandra, Oracle 10g, MySQL, Couch, MS SQL server

PROFESSIONAL EXPERIENCE

Confidential, San Jose, CA

Hadoop Developer

Responsibilities:

  • The GVS-CS project comprises multiple teams; I was part of the Data Engineering team.
  • The team's focus is ingesting data from different vendors and processing that data using business rules.
  • After processing, the data is sent to the Eloqua tool.
  • Worked on the Hadoop security architecture, adding different users to the same YARN queue in the development and production clusters.
  • After adding the users, validated jobs to check whether the new users were allocated to the same YARN queue in the respective clusters.
  • Also involved in the security architecture for Google Cloud Platform, which is in the process of being applied to the Google Cloud projects.
  • The Google Cloud security architecture is based on two-step verification for anyone accessing the cloud projects.
  • In this project, we use Spark SQL and Hive to validate large data sets against the business rules (a minimal sketch follows this list).
  • Also involved in discussions on automating the Hadoop data pipelines in our Hadoop environment.
  • As part of the data pipeline automation, we plan to use Jenkins to automate builds triggered by Git commits on push.
  • A number of offers go live every week and every month, based on client requirements.
  • Involved in cleaning up the database as required, including Hive tables and Python scripts.
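A minimal sketch (in Java, using Spark's Dataset API) of the kind of Spark SQL validation described above; the table names, columns, and the single business rule shown are hypothetical placeholders, not the project's actual schema:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class OfferValidation {
        public static void main(String[] args) {
            // Hive-enabled session so staging/curated tables resolve against the metastore
            SparkSession spark = SparkSession.builder()
                    .appName("OfferValidation")
                    .enableHiveSupport()
                    .getOrCreate();

            // Hypothetical vendor feed table
            Dataset<Row> offers = spark.sql(
                    "SELECT offer_id, vendor, email, start_date FROM staging.vendor_offers");

            // Business-rule check: reject rows with a missing key or an implausible email
            Dataset<Row> invalid = offers.filter(
                    "offer_id IS NULL OR email IS NULL OR email NOT LIKE '%@%'");
            System.out.println("Rows failing validation: " + invalid.count());

            // Rows that pass are written to a curated table feeding the Eloqua export
            offers.except(invalid)
                  .write()
                  .mode("overwrite")
                  .saveAsTable("curated.vendor_offers_clean");

            spark.stop();
        }
    }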

Environment: Hadoop, HDFS, Hive, Python, Spark, SQL, Jenkins, UNIX Shell Scripting, Big Data, Map Reduce, Git, Eloqua.

Confidential, Plano, TX

Hadoop Developer

Responsibilities:

  • Implemented advanced procedures such as text analytics and processing using the in-memory computing capabilities of Apache Spark, written in Scala.
  • Used Flume, Sqoop, Hadoop, Spark, and Oozie for building data pipelines.
  • Provided cluster coordination services through ZooKeeper.
  • Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
  • Automated all the jobs, for pulling data from FTP server to load data into Hive tables, using Oozie workflows.
  • Experienced in managing and reviewing Hadoop log files.
  • Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
  • Solved performance issues in Hive and Pig scripts with an understanding of joins, grouping, and aggregation, and how they translate into MapReduce jobs.
  • Developed Oozie workflow for scheduling and orchestrating the ETL process. Designed & Implemented Java MapReduce programs to support distributed data processing.
  • Worked with highly unstructured and semi-structured data of 30TB in size (90TB with replication factor of 3).
  • Contributed towards developing a Data Pipeline to load data from different sources like Web, RDBMS, and NoSQL to Apache Kafka or Spark cluster.
  • Migrated data from Spark RDDs into HDFS and NoSQL stores such as Cassandra and HBase.
  • Implemented Pig Latin scripts to handle data preprocessing and normalization.
  • Worked on reading multiple data formats on HDFS using PySpark.
  • Developed Kafka producer and consumers, HBase clients, Spark and Hadoop MapReduce jobs along with components on HDFS, Hive.
  • Developed MapReduce programs by using Java.
  • Worked extensively on the Spark Core and Spark SQL modules.
  • Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive to optimize performance (sketched immediately below).
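As a rough illustration of the managed/external, partitioned, and bucketed table design mentioned in the last bullet, the sketch below issues hypothetical Hive DDL through a Hive-enabled SparkSession (Java); database names, columns, file formats, and the bucket count are placeholders rather than the project's actual schema:

    import org.apache.spark.sql.SparkSession;

    public class HiveTableSetup {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("HiveTableSetup")
                    .enableHiveSupport()
                    .getOrCreate();

            // External table: Hive tracks only metadata; data stays at the HDFS location
            spark.sql("CREATE EXTERNAL TABLE IF NOT EXISTS raw_db.events_ext ("
                    + " user_id STRING, event_type STRING, amount DOUBLE)"
                    + " PARTITIONED BY (event_date STRING)"
                    + " STORED AS PARQUET"
                    + " LOCATION '/data/raw/events'");

            // Managed table: partitioned for date pruning, bucketed on the join key
            spark.sql("CREATE TABLE IF NOT EXISTS curated_db.events ("
                    + " user_id STRING, event_type STRING, amount DOUBLE)"
                    + " PARTITIONED BY (event_date STRING)"
                    + " CLUSTERED BY (user_id) INTO 32 BUCKETS"
                    + " STORED AS ORC");

            spark.stop();
        }
    }

Partitioning on event_date keeps date-bounded scans small, while bucketing on the join key makes bucketed map-side joins possible.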

Environment: Hadoop, HDFS, Hive, Python, Scala, Spark, SQL, Teradata, UNIX Shell Scripting, Big Data, Map Reduce, Sqoop, Oozie, Pig, Flume, LINUX, Java, Eclipse.

Confidential, South Portland, ME

Hadoop Developer

Responsibilities:

  • Worked in Multi Clustered Hadoop Eco-System environment.
  • Created MapReduce programs using the Java API that filter out unnecessary records and find unique records based on different criteria (see the sketch after this list).
  • Used the Python unittest library for testing many Python programs and blocks of code.
  • Parsed JSON and XML data using Python.
  • Rewrote an existing Java application as a Python module to deliver data in the required format.
  • Loaded and transformed large sets of unstructured data from UNIX systems to HDFS.
  • Used Apache Sqoop to import user data into HDFS on a weekly basis.
  • Created production jobs using Oozie workflows that integrated different actions such as MapReduce, Sqoop, and Hive.
  • Involved in importing real-time data into Hadoop using Kafka and implemented a daily Oozie job.
  • Involved in developing Hive DDLs to create, alter and drop Hive tables.
  • Experienced in transferring data from different data sources into HDFS using Kafka producers.
  • Prepared ETL pipeline with the help of Sqoop for consumption.
  • Wrote Pig scripts to analyze Hadoop logs.
  • Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
  • Troubleshot and debugged Hadoop ecosystem run-time issues.
  • Participated in all phases of the SDLC, including requirement gathering, analysis, estimation, design, coding, testing, and documentation.
  • Developed SOAP web service as publisher/producer.
  • Developed different GUI screens as JSPs using HTML, JavaScript, and CSS.
  • Designed the user interface of the application using Angular JS, Bootstrap, HTML5, CSS3 and JavaScript.
  • Designed and developed front-end Graphic User Interface with JSP, HTML5, CSS3, JavaScript, and JQuery.
  • Developed entire front-end and back-end modules using Python on the Django web framework.
  • Developed tools using Python, shell scripting, XML, and Big Data technologies to automate routine tasks.
  • Served as the single point of technical contact for different application teams as well as Dev, QA, and line managers.
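A minimal sketch of the filter-and-deduplicate style MapReduce job described in the first bullet, written against the Hadoop Java API; the comma-delimited five-field record check and the input/output paths are illustrative assumptions:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class DedupJob {

        // Mapper drops malformed lines and emits the whole record as the key,
        // so duplicate records collapse onto the same reducer call
        public static class FilterMapper extends Mapper<Object, Text, Text, NullWritable> {
            @Override
            protected void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                String line = value.toString().trim();
                // Assumed layout: five comma-separated fields per record
                if (!line.isEmpty() && line.split(",").length == 5) {
                    context.write(new Text(line), NullWritable.get());
                }
            }
        }

        // Reducer writes each distinct record exactly once
        public static class DedupReducer extends Reducer<Text, NullWritable, Text, NullWritable> {
            @Override
            protected void reduce(Text key, Iterable<NullWritable> values, Context context)
                    throws IOException, InterruptedException {
                context.write(key, NullWritable.get());
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "dedup");
            job.setJarByClass(DedupJob.class);
            job.setMapperClass(FilterMapper.class);
            job.setReducerClass(DedupReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(NullWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

Such a job would typically be submitted with hadoop jar against an input and an output directory on HDFS.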

Environment: Hadoop MapReduce, HIVE, HDFS, Java, CSV files, Python, Django, Java, AWS, XML, Shell Scripting, MySQL, HTML, XHTML, Jenkins, Linux.

Confidential, Boston, MA

Hadoop Data Analyst

Responsibilities:

  • Used Hive queries and Pig scripts to analyze data.
  • Used Hive partitioning and bucketing on data from different kinds of sources to improve performance.
  • Followed Agile methodology (Scrum) during project development and tracked progress through daily stand-ups.
  • Used Oozie to automate the flow of jobs and Zookeeper for coordination.
  • Used Flume to move unstructured and semi-structured data.
  • Used Sqoop to move structured data.
  • Wrote shell scripts, run as cron jobs, to automate the data migration process from external servers and FTP sites.
  • Prepared ETL pipeline with the help of Sqoop, PIG, and HIVE to be able to frequently bring in data from the source and make it available for consumption.
  • Used Tableau for visualization and report generation for financial data consolidation, reconciliation, and segmentation.
  • Involved in loading data from UNIX file system to HDFS.
  • Created partitioned tables in Hive.
  • Developed MapReduce programs by using Java.
  • Developed various Hive UDFs to provide functionality needed by the Hive scripts (illustrated after this list).
  • Implemented Kafka messaging services to stream large data sets and insert them into the database.
  • Analyzed large data sets by writing Pig scripts.
  • Developed MapReduce programs that turn the files generated by Hive query processing into key/value pairs and load the data into the NoSQL database HBase.
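A small, hypothetical example of the kind of Hive UDF mentioned above, using the classic org.apache.hadoop.hive.ql.exec.UDF API; the normalization rule and the function's purpose are placeholders:

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Illustrative UDF that normalizes an identifier column before aggregation
    public final class NormalizeId extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            // Strip whitespace and dashes, then upper-case the identifier
            String cleaned = input.toString().replaceAll("[\\s-]", "").toUpperCase();
            return new Text(cleaned);
        }
    }

After packaging it into a JAR, it would typically be registered from a Hive script with ADD JAR and CREATE TEMPORARY FUNCTION before use.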

Environment: HDFS, Hive, MapReduce, Java, NoSQL, Unix, Linux, Jenkins, shell scripting, MySQL, Spreadsheet.

Confidential, Fort Worth, TX

Data Analyst

Responsibilities:

  • Communicated effectively, both verbally and in writing, with the client and the offshore team.
  • Completed documentation on all assigned systems and databases, including business rules, logic, and processes.
  • Created Test data and Test Cases documentation for regression and performance.
  • Designed, built and implemented relational databases.
  • Determined changes in physical database by studying project requirements.
  • Developed intermediate business knowledge of the functional area and its processes to understand how data is applied to support the business function.
  • Facilitated gathering moderately complex business requirements by defining the business problem.
  • Facilitated the monthly Opportunities for Improvement (OFI) meeting.
  • Identified Opportunities for Improvement (OFIs) and, as applicable, recommended and implemented process improvement plans in collaboration with the identified departments.
  • Identified and addressed outliers in an efficient and professional manner following a predetermined protocol.
  • Identified data requirements and isolated data elements.
  • Leveraged a basic understanding of multiple data structures and sources.
  • Maintained and assisted in the development of moderately complex business solutions, which included data, reporting, business intelligence/analytics.
  • Maintained data dictionary by revising and entering definitions.
  • Maintained direct, timely and appropriate communication with clients.
  • Supported data governance, integrity, quality and audit functions.
  • Supported the implementation of technical data solutions and standards.
  • Utilized and prepared analysis reports summarizing Opportunities for Improvements (OFIs).
  • Worked closely with other members of the database group.

Environment: Linux, Unix, Java, spreadsheet, QlikView, SQL, Excel, shell scripting, MySQL.

Confidential

Java Developer

Responsibilities:

  • Used Eclipse as an IDE for development of the application.
  • Developed Application in Jakarta Struts Framework using MVC architecture.
  • Implemented J2EE design patterns such as Session Facade and Singleton.
  • Created ActionForm and Action classes for the modules (a minimal sketch follows this list).
  • Customized all JSP pages with the same look and feel using Tiles and CSS.
  • Developed JSPs that validate information automatically using Ajax.
  • Created struts-config.xml and tiles-def.xml files.
  • Involved in coding for the presentation layer using Apache Struts, XML and JavaScript.
  • Used XSLT for UI to display XML Data.
  • Utilized JavaScript for client-side validation; participated in designing the user interface using HTML and connected it to the database using JDBC.
  • Created web pages based on the requirements and styled them using CSS.
  • Involved in writing client-side scripts using JavaScript and server-side scripts using JavaBeans, and used Servlets to handle the business logic.
  • Developed the Form Beans and Data Access Layer classes.
  • Involved in writing complex sub-queries and used Oracle for generating on-screen reports.
  • Worked on database interaction layer for insertions, updating and retrieval operations on data.
  • Involved in deploying the application in test environment using Apache Tomcat.
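A bare-bones sketch of the Struts 1 ActionForm/Action pairing referenced above; the form field, validation rule, and forward names are placeholders rather than the application's actual code:

    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    import org.apache.struts.action.Action;
    import org.apache.struts.action.ActionForm;
    import org.apache.struts.action.ActionForward;
    import org.apache.struts.action.ActionMapping;

    // Form bean capturing the submitted field (normally declared in its own file
    // and registered under <form-beans> in struts-config.xml)
    class LoginForm extends ActionForm {
        private String userName;

        public String getUserName() { return userName; }
        public void setUserName(String userName) { this.userName = userName; }
    }

    // Action class wired up in struts-config.xml; the "success"/"failure" forwards
    // would be defined on the corresponding <action> mapping
    public class LoginAction extends Action {
        @Override
        public ActionForward execute(ActionMapping mapping, ActionForm form,
                                     HttpServletRequest request,
                                     HttpServletResponse response) throws Exception {
            LoginForm loginForm = (LoginForm) form;
            boolean hasUser = loginForm.getUserName() != null
                    && !loginForm.getUserName().trim().isEmpty();
            return mapping.findForward(hasUser ? "success" : "failure");
        }
    }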

Environment: JSP, Core Java, Servlets, Struts, UML, AJAX, SQL, JUNIT, JavaScript, Eclipse, JIRA, HTML, CSS.
