Hadoop Developer Resume
San Jose, CA
SUMMARY
- 8+ years of professional IT experience in analysis, development, integration, and maintenance of web-based and client/server applications using Java and Big Data technologies.
- 5+ years of experience in Hadoop development and analysis, working with technologies such as Hive, Pig, Java MapReduce, UNIX, and HDFS.
- Strong experience working with HDFS, MapReduce, Spark, Hive, Pig, Sqoop, Flume, Kafka, Yarn, Oozie and HBase.
- 2+ years of experience in development, Linux administration, implementation, and maintenance of web servers and distributed enterprise applications.
- Experience in all phases of software development life cycle (SDLC), which includes User Interaction, Business Analysis/Modelling, Design/Architecture, Development, Implementation, Integration, Documentation, Testing, and Deployment.
- Experience in analyzing business requirements and creating Hive or Pig scripts to process and aggregate data.
- Good understanding of real-time data processing using Spark.
- Involved in preparation of Test Plans, Test Cases & Test Scripts based on business requirements, rules, data mapping requirements and system specifications.
- Ingested data into HDFS using Sqoop from RDBMSs such as Oracle, MySQL, and Microsoft SQL Server.
- Experience in implementing open-source frameworks such as Spring, Hibernate, and Web Services.
- Troubleshot configuration issues of Hadoop environments in development and operations.
- Experience in Continuous Integration and Continuous Deployment using tools such as Jenkins.
- Experience in processing streaming data on clusters through Kafka and Spark Streaming.
- Experience with databases such as PostgreSQL and MySQL, including cluster setup and writing SQL queries, triggers, and stored procedures.
- Experience in collecting, aggregating, and moving data from various sources using Apache Flume and Kafka.
- Very good understanding and working knowledge of object-oriented programming (OOP), Python, and Scala.
- Experienced in using Spark to improve the performance and optimize existing algorithms in Hadoop, using Spark Context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN (see the sketch after this list).
- Proficient in working with NoSQL databases such as MongoDB, Cassandra, and HBase (column-family store).
- Good knowledge of Hadoop MRv1 and MRv2 (YARN) architecture.
- Communicated with diverse client communities offshore and onshore, with a dedication to client satisfaction and quality outcomes; extensive experience coordinating offshore development activities.
- Highly organized and dedicated, with good time management and organizational skills and the ability to handle multiple tasks with a positive attitude.
- A team player with good interpersonal, communication and leadership skills.
- Easily adaptable to changing work conditions, able to consistently deliver quality work, and capable of adopting new technologies and facing new challenges.
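The following is a minimal PySpark sketch of the DataFrame/Spark SQL style of optimization referenced above; the database, table, and column names are illustrative placeholders, not from any specific engagement.

```python
# Illustrative PySpark sketch: replacing a hand-built pair-RDD aggregation with the
# DataFrame API so the Catalyst optimizer can plan the job. Names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("daily-order-totals")
         .enableHiveSupport()
         .getOrCreate())

# Read an existing Hive table as a DataFrame
orders = spark.table("sales.orders")

# Filter, group, and aggregate through the DataFrame API; the same logic could be
# expressed as SQL and submitted with spark.sql(...)
daily_totals = (orders
                .filter(F.col("status") == "COMPLETED")
                .groupBy("order_date")
                .agg(F.sum("amount").alias("total_amount"),
                     F.count("*").alias("order_count")))

# Persist the result back to Hive for downstream reporting
daily_totals.write.mode("overwrite").saveAsTable("sales.daily_order_totals")
```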
TECHNICAL SKILLS
Big Data Ecosystems: Hadoop, Teradata, MapReduce, Spark, HDFS, HBase, Pig, Hive, Sqoop, Oozie, Storm, Scala, Kafka, and Flume.
Programming Languages: Java (J2SE, J2EE), C, C#, PL/SQL, Swift, SQL+, ASP.NET, JDBC, Python.
Web Development: JavaScript, jQuery, HTML 5.0, CSS 3.0, AJAX, JSON
Development Tools: NetBeans 8.0.2, Visual Studio 2013, Eclipse Neon, Android Studio, SQL Developer
Testing Tools: JUnit, HP Unified Functional Testing, HP Performance Center, Selenium, WinRunner, LoadRunner, QTP
UNIX Tools: Apache, Yum, RPM
Operating Systems: Windows, Linux, Ubuntu, Mac OS, Red Hat Linux
Protocols: TCP/IP, HTTP and HTTPS
Web Servers: Apache Tomcat
Cluster Management Tools: Cloudera Manager, Hortonworks, Ambari
Methodologies: Agile, V-model, Waterfall model
Databases: HBase, MongoDB, Cassandra, Oracle 10g, MySQL, Couch, MS SQL Server
PROFESSIONAL EXPERIENCE
Confidential, San Jose, CA
Hadoop Developer
Responsibilities:
- The GVS-CS project has multiple teams; I worked on the Data Engineering team.
- The team's focus is ingesting data from different vendors and processing that data using business rules.
- After processing, the data is delivered to the Eloqua tool.
- Involved in the Hadoop security architecture, adding different users to the same YARN queue in the development and production clusters.
- After adding the users, validated sample jobs to confirm whether the new users were allocated to the same YARN queue in the respective clusters.
- Also involved in the security architecture for Google Cloud Platform, which is being rolled out to the Google Cloud projects.
- The security architecture for the Google platform is essentially two-step verification for anyone accessing the cloud projects.
- Used Spark SQL and Hive to validate large data sets against the business rules (see the sketch after this list).
- Also involved in discussions on automating the Hadoop data pipelines on our clusters.
- For Hadoop data pipeline automation, we plan to use Jenkins to trigger automation on Git commits when we push.
- A number of offers go live every week and every month, based on client requirements.
- Involved in cleaning up databases as required, including Hive tables, Python scripts, etc.
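A minimal sketch of the kind of Spark SQL/Hive validation described above; the staging/curated table names, columns, and business rules are hypothetical.

```python
# Hypothetical validation sketch: check a curated Hive table against simple business
# rules before the data is handed off to Eloqua. Table and column names are placeholders.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("offer-data-validation")
         .enableHiveSupport()
         .getOrCreate())

raw = spark.table("staging.vendor_offers")
curated = spark.table("curated.vendor_offers")

# Rule 1: no records should be lost between the staging and curated layers
raw_count, curated_count = raw.count(), curated.count()
if raw_count != curated_count:
    raise ValueError(f"Row count mismatch: staging={raw_count}, curated={curated_count}")

# Rule 2: mandatory fields must be populated in the curated layer
bad_rows = spark.sql("""
    SELECT COUNT(*) AS bad
    FROM curated.vendor_offers
    WHERE offer_id IS NULL OR customer_email IS NULL
""").first()["bad"]
if bad_rows > 0:
    raise ValueError(f"{bad_rows} curated rows violate the mandatory-field rule")

print("Validation passed")
```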
Environment: Hadoop, HDFS, Hive, Python, Spark, SQL, Jenkins, UNIX Shell Scripting, Big Data, MapReduce, Git, Eloqua.
Confidential, Plano, TX
Hadoop Developer
Responsibilities:
- Implemented advanced procedures such as text analytics and processing using the in-memory computing capabilities of Apache Spark, written in Scala.
- Used Flume, Sqoop, Hadoop, Spark, and Oozie for building data pipelines.
- Provided cluster coordination services through ZooKeeper.
- Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
- Automated all the jobs, for pulling data from FTP server to load data into Hive tables, using Oozie workflows.
- Experienced in managing and reviewing Hadoop log files.
- Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
- Solved performance issues in Hive and Pig scripts through an understanding of joins, grouping, and aggregation and how they translate to MapReduce jobs.
- Developed Oozie workflow for scheduling and orchestrating the ETL process. Designed & Implemented Java MapReduce programs to support distributed data processing.
- Worked with highly unstructured and semi-structured data of 30TB in size (90TB with replication factor of 3).
- Contributed towards developing a Data Pipeline to load data from different sources like Web, RDBMS, and NoSQL to Apache Kafka or Spark cluster.
- Migrated data from Spark RDDs into HDFS and NoSQL stores such as Cassandra and HBase.
- Implemented Pig Latin scripts to handle data preprocessing and normalization.
- Worked on reading multiple data formats on HDFS using PySpark.
- Developed Kafka producers and consumers, HBase clients, Spark jobs, and Hadoop MapReduce jobs, along with components on HDFS and Hive.
- Developed MapReduce programs using Java.
- Worked extensively on the Spark Core and Spark SQL modules.
- Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external Hive tables to optimize performance (see the sketch after this list).
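A minimal sketch of partitioned and bucketed managed/external Hive tables as mentioned in the last point; database names, columns, locations, and bucket counts are illustrative, and on clusters where Spark cannot create bucketed Hive tables the same DDL can be run from the Hive shell or Beeline instead.

```python
# Illustrative Hive DDL for a partitioned external table and a partitioned, bucketed
# managed table, submitted through Spark SQL. All names and paths are hypothetical.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-table-design")
         .enableHiveSupport()
         .getOrCreate())

# External table: Hive owns only the metadata; data stays at the external location
spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS raw_db.click_events (
        user_id STRING,
        url     STRING,
        ts      TIMESTAMP
    )
    PARTITIONED BY (event_date STRING)
    STORED AS PARQUET
    LOCATION '/data/raw/click_events'
""")

# Managed table: partitioned by date for pruning, bucketed by user_id to help joins
spark.sql("""
    CREATE TABLE IF NOT EXISTS analytics_db.click_events (
        user_id STRING,
        url     STRING,
        ts      TIMESTAMP
    )
    PARTITIONED BY (event_date STRING)
    CLUSTERED BY (user_id) INTO 32 BUCKETS
    STORED AS ORC
""")
```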
Environment: Hadoop, HDFS, Hive, Python, Scala, Spark, SQL, Teradata, UNIX Shell Scripting, Big Data, MapReduce, Sqoop, Oozie, Pig, Flume, Linux, Java, Eclipse.
Confidential, South Portland, ME
Hadoop Developer
Responsibilities:
- Worked in Multi Clustered Hadoop Eco-System environment.
- Created MapReduce programs using the Java API that filter out unnecessary records and find unique records based on different criteria.
- Used the Python unittest library for testing many Python programs and blocks of code.
- Parsed JSON and XML data using Python (see the sketch after this list).
- Rewrote an existing Java application as a Python module to deliver data in a specific format.
- Loaded and transformed large sets of unstructured data from UNIX systems into HDFS.
- Used Apache Sqoop to load user data into HDFS on a weekly basis.
- Created production jobs using Oozie workflows that integrated actions such as MapReduce, Sqoop, and Hive.
- Involved in importing real-time data into Hadoop using Kafka and implemented Oozie jobs for daily runs.
- Involved in developing Hive DDLs to create, alter and drop Hive tables.
- Experienced in transferring data from different data sources into HDFS using Kafka producers.
- Prepared ETL pipeline with the help of Sqoop for consumption.
- Wrote Pig scripts to analyze Hadoop logs.
- Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
- Troubleshot and debugged Hadoop ecosystem runtime issues.
- Participated in all phases of the SDLC, including requirement gathering, analysis, estimation, design, coding, testing, and documentation.
- Developed SOAP web service as publisher/producer.
- Developed different GUI screens as JSPs using HTML, JavaScript, and CSS.
- Designed the user interface of the application using Angular JS, Bootstrap, HTML5, CSS3 and JavaScript.
- Designed and developed front-end graphical user interfaces with JSP, HTML5, CSS3, JavaScript, and jQuery.
- Developed entire frontend and backend modules using Python on the Django web framework.
- Developed tools using Python, shell scripting, XML, and Big Data technologies to automate routine tasks.
- Served as the single point of technical contact for different application teams as well as Dev, QA, and line managers.
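A small Python sketch of the JSON/XML parsing mentioned above; the record layouts and field names are made up for illustration.

```python
# Illustrative Python parsing sketch; the JSON/XML record layouts are hypothetical.
import json
import xml.etree.ElementTree as ET

def parse_json_record(line):
    """Parse one JSON record and return the fields downstream jobs expect."""
    record = json.loads(line)
    return record.get("user_id"), record.get("event_type"), record.get("timestamp")

def parse_xml_feed(path):
    """Yield (id, amount) tuples from a simple XML feed of <transaction> elements."""
    tree = ET.parse(path)
    for txn in tree.getroot().iter("transaction"):
        yield txn.get("id"), float(txn.findtext("amount", default="0"))

if __name__ == "__main__":
    sample = '{"user_id": "u1", "event_type": "click", "timestamp": "2016-01-01T00:00:00"}'
    print(parse_json_record(sample))
```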
Environment: Hadoop MapReduce, Hive, HDFS, Java, CSV files, Python, Django, AWS, XML, Shell Scripting, MySQL, HTML, XHTML, Jenkins, Linux.
Confidential, Boston, MA
Hadoop Data Analyst
Responsibilities:
- Used Hive queries and Pig scripts to analyze data.
- Used Hive partitioning and bucketing on data from different kinds of sources to improve performance.
- Followed Agile methodology (Scrum) during development of the project and tracked progress through daily stand-ups.
- Used Oozie to automate the flow of jobs and Zookeeper for coordination.
- Used Flume to distribute unstructured and semi-structured data.
- Used Sqoop to distribute structured data.
- Wrote shell scripts, run as cron jobs, to automate the data migration process from external servers and FTP sites.
- Prepared ETL pipeline with the help of Sqoop, PIG, and HIVE to be able to frequently bring in data from the source and make it available for consumption.
- Used Tableau for visualization and to generate reports for financial data consolidation, reconciliation, and segmentation.
- Involved in loading data from UNIX file system to HDFS.
- Created partitioned tables in Hive.
- Developed MapReduce programs by using Java.
- Developed various Hive UDFs to add functionality needed by Hive scripts.
- Implemented Kafka messaging services to stream large volumes of data and insert them into the database (see the producer sketch after this list).
- Analyzed large data sets by writing Pig scripts.
- Developed MapReduce programs over the files generated by Hive query processing to produce key-value pairs and load the data into the NoSQL database HBase.
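A minimal sketch of a Kafka producer of the kind mentioned above, using the kafka-python client; the broker address, topic, and record fields are placeholders.

```python
# Minimal kafka-python producer sketch; broker, topic, and record fields are hypothetical.
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers=["localhost:9092"],
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# In the real pipeline this send would sit inside the ingestion loop
producer.send("transactions", {"txn_id": "t-1001", "amount": 42.50})
producer.flush()
producer.close()
```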
Environment: HDFS, Hive, MapReduce, Java, NoSQL, Unix, Linux, Jenkins, shell scripting, MySQL, Spreadsheet.
Confidential, Fort Worth, TX
Data Analyst
Responsibilities:
- Communicated effectively, both verbally and in writing, with the client and the offshore team.
- Completed documentation on all assigned systems and databases, including business rules, logic, and processes.
- Created Test data and Test Cases documentation for regression and performance.
- Designed, built and implemented relational databases.
- Determined changes in physical database by studying project requirements.
- Developed intermediate business knowledge of the functional area and its processes to understand how data is applied to support the business function.
- Facilitated gathering moderately complex business requirements by defining the business problem.
- Facilitated the monthly Opportunities for Improvement (OFI) meeting.
- Identified Opportunities for Improvement (OFIs) and recommended and implemented, as applicable, process improvement plans in collaboration with the identified departments.
- Identified and addressed outliers in an efficient and professional manner following a predetermined protocol.
- Identified data requirements and isolated data elements.
- Leveraged a basic understanding of multiple data structures and sources.
- Maintained and assisted in the development of moderately complex business solutions, which included data, reporting, business intelligence/analytics.
- Maintained data dictionary by revising and entering definitions.
- Maintained direct, timely and appropriate communication with clients.
- Supported data governance, integrity, quality and audit functions.
- Supported the implementation of technical data solutions and standards.
- Utilized and prepared analysis reports summarizing Opportunities for Improvements (OFIs).
- Worked closely with other members of the database group.
Environment: Linux, Unix, Java, spreadsheet, QlikView, SQL, Excel, shell scripting, MySQL.
Confidential
Java Developer
Responsibilities:
- Used Eclipse as an IDE for development of the application.
- Developed Application in Jakarta Struts Framework using MVC architecture.
- Implemented J2EE design patterns Session Facade pattern, Singleton Pattern.
- Created Action Forms and Action classes for the modules.
- Customized all the JSP pages with the same look and feel using Tiles and CSS.
- Developed JSPs to validate information automatically using Ajax.
- Created struts-config.xml and tiles-def.xml files.
- Involved in coding for the presentation layer using Apache Struts, XML and JavaScript.
- Used XSLT for UI to display XML Data.
- Utilized JavaScript for client-side validation. Participated in designing the user interface for the application using HTML and connected them to database using JDBC.
- Created web pages based on the requirements and styled them using CSS.
- Involved in writing client-side scripts using JavaScript and server-side scripts using JavaBeans, and used servlets to handle the business logic.
- Developed the Form Beans and Data Access Layer classes.
- Involved in writing complex sub-queries and used Oracle for generating on-screen reports.
- Worked on database interaction layer for insertions, updating and retrieval operations on data.
- Involved in deploying the application in test environment using Apache Tomcat.
Environment: JSP, Core Java, Servlets, Struts, UML, AJAX, SQL, JUNIT, JavaScript, Eclipse, JIRA, HTML, CSS.