
Sr. Hadoop Developer Resume


San Jose, CA

SUMMARY

  • 7+ years of overall IT experience across a variety of industries, including 4 years of hands-on experience in Big Data technologies and 3+ years of extensive experience as a Java Developer
  • In-depth understanding of Hadoop architecture and its components, including HDFS, Job Tracker, Task Tracker, Name Node, Data Node, and MapReduce concepts, with experience writing MapReduce programs on Apache Hadoop to analyze large data sets efficiently.
  • Hands-on experience working with ecosystem tools such as Hive, Pig, Sqoop, MapReduce, Flume, and Oozie. Strong knowledge of Pig and Hive analytical functions and of extending Hive and Pig core functionality by writing custom UDFs (a minimal UDF sketch follows this summary).
  • Experience importing and exporting terabytes of data between HDFS and relational database systems using Sqoop.
  • Experience in ETL performance tuning of sources, transformations, targets, mappings, worklets, workflows, and sessions.
  • Strong DWH and ETL experience with Informatica PowerCenter 9.6.1/9.1/8.6.1 design, development, process, and best practices as an ETL Analyst and Developer in the US.
  • Knowledge of job workflow scheduling and monitoring tools such as Oozie and ZooKeeper, of NoSQL databases such as HBase and Cassandra, and of administrative tasks such as installing Hadoop, commissioning and decommissioning nodes, and managing ecosystem components such as Flume, Oozie, Hive, and Pig.
  • Experience in the design, development, and testing of distributed, Internet/Intranet/E-Commerce, client/server, and database applications, mainly using Java, EJB, Servlets, JDBC, JSP, Struts, Hibernate, Spring, and JavaScript on WebLogic and Apache Tomcat web/application servers, with Oracle and SQL Server databases on UNIX and Windows NT platforms.
  • Extensive work experience in Object Oriented Analysis and Design, Java/J2EE technologies including HTML, XHTML, DHTML, JavaScript, JSTL, CSS, AJAX and Oracle for developing server side applications and user interfaces.
  • Implemented complex business rules in Informatica by creating Reusable transformations to reduce the development time and complexity of mappings.
  • Experience in developing Middle-tier components in distributed transaction management system using Java. Good understanding of XML methodologies (XML, XSL, XSD) including Web Services and SOAP.
  • Skillful experience in Python, developing software with libraries such as BeautifulSoup, NumPy, SciPy, PySide, python-twitter, matplotlib, Pickle, pandas DataFrames, urllib2, and MySQLdb (for database connectivity) to improve the software development process.
  • 2+ years' experience in Python creating scalable and robust applications along with other technologies (D3, Angular, and Node.js).
  • Solid understanding of design patterns, MVC, and Python algorithms and data structures.
  • Extensive experience working with relational and NoSQL databases such as Oracle, IBM DB2, SQL Server, and MySQL, and writing stored procedures, functions, joins, and triggers for different data models.
  • Experience with Agile Methodology, Scrum Methodology, software version control and release management
  • Handled several techno-functional responsibilities including estimates, identifying functional and technical gaps, requirements gathering, designing solutions, development, developing documentation, and production support
  • An individual with excellent interpersonal and communication skills, strong business acumen, creative problem solving skills, technical competency, team-player spirit, and leadership skills
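
A minimal sketch of the kind of custom Hive UDF referred to above, assuming the older org.apache.hadoop.hive.ql.exec.UDF API from that Hive generation; the class name and cleanup logic are purely illustrative, not taken from an actual project:

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF: normalizes free-text status codes before analysis.
public final class NormalizeStatus extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;                              // pass NULLs through unchanged
        }
        String cleaned = input.toString().trim().toUpperCase();
        return new Text(cleaned.isEmpty() ? "UNKNOWN" : cleaned);
    }
}
```

In Hive, such a function would be registered by adding the packaged jar to the session and issuing CREATE TEMPORARY FUNCTION before being used in a query.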

TECHNICAL SKILLS

Database: Teradata, DB2, MySQL, Oracle, MS SQL Server, IMS/DB, NoSQL

Languages: Java, Pig Latin, SQL, HiveQL, Scala, Shell Scripting, Python

API’s/Tools: Mahout, Eclipse, Log4j, Maven

Web Technologies: HTML, XML, JavaScript

BigData Ecosystem: HDFS, Pig, MapReduce, Hive, Sqoop, Flume, Oozie, HBase, MongoDB, AWS, Solr search, Impala, Cassandra, Storm, Spark

Operating System: Unix, Linux, Windows XP, IBM Z/OS

BI Tools: Tableau, Pentaho

PROFESSIONAL EXPERIENCE

Confidential, San Jose, CA

Sr. Hadoop Developer

Responsibilities:

  • Worked on RESTful client/server code to exchange data with Java API services for the application
  • Used the Spring MVC framework to develop an IoT web application.
  • Extensively involved in writing ETL Specifications for Development and conversion projects.
  • Loaded data from different servers into S3 buckets and set appropriate bucket permissions.
  • Developed simple to complex MapReduce jobs using Hive, Pig, Java MapReduce, and Spark with Scala
  • Used Apache Kafka to combine live streaming data with batch processing to generate reports
  • Worked on Apache Spark for data lake creation to build the RWI (Real World Intelligence) application (a minimal Spark sketch follows the Environment line below)
  • Used Oracle GoldenGate to ingest data into HDFS and Oracle ODI to schedule jobs for generating Oracle Opower meter reports
  • Performed Cassandra data modeling for storage and transformation in Spark using the DataStax connector.
  • Worked with XSD and XML files generation through ETL process.
  • Evaluated ETL performance for the full load cycle.
  • Involved in preparing ETL mapping specification documents and Transformation rules for the mapping
  • Installed Apache SolrCloud on the cluster and configured it with ZooKeeper
  • Worked on Apache SolrCloud to index documents using the hive-solr storage handler, importing different datasets including XML, CSV, and JSON
  • Used Python and JSON.
  • Created and maintained Technical documentation of Hadoop clusters
  • Used Kerberos for user authentication, SSL/TLS for data encryption, and Sentry for role-based database access on the CDH 5.4 Hadoop cluster

Environment: Hadoop, Amazon Web Services (EC2, S3, Glacier), HDFS, Spark, Pig, Hive, MapReduce, shell scripting, RESTful web services, Big Data, Python
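
A minimal sketch of the kind of Spark batch job described in this section, loading raw data from S3 into an HDFS data lake; the bucket, paths, and column names are hypothetical placeholders, not details from the actual RWI application:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

// Hypothetical sketch: copy raw JSON events from S3 into the data lake as Parquet.
public final class S3ToDataLake {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("s3-to-data-lake")               // job name is illustrative
                .getOrCreate();

        // Read raw events landed in S3 (bucket and prefix are placeholders).
        Dataset<Row> raw = spark.read().json("s3a://example-bucket/raw/events/");

        // Drop records with no event id before persisting (assumed cleanup rule).
        Dataset<Row> cleaned = raw.filter(raw.col("event_id").isNotNull());

        // Write partitioned Parquet into the HDFS data lake zone
        // (assumes the JSON carries an event_date column).
        cleaned.write()
                .mode("overwrite")
                .partitionBy("event_date")
                .parquet("hdfs:///data/lake/events/");

        spark.stop();
    }
}
```

Such a job would typically be packaged as a jar and launched with spark-submit against the cluster.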

Confidential, Nashville, TN

Big Data Consultant

Responsibilities:

  • Analyzed large data sets by running custom MapReduce jobs, Hive queries, and Pig scripts
  • Developed complex Pig UDFs for business transformations
  • Worked with the Data Science team, Teradata team, and business to gather requirements for various data sources like web scrapes, APIs
  • Involved in creating Hive/Impala tables, and loading and analyzing data using Hive queries
  • Ran Hadoop jobs to process millions of records and applied compression techniques
  • Developed multiple MapReduce jobs in Java for data cleaning and pre-processing (a mapper sketch follows the Environment line below)
  • Loaded data from the Linux file system into HDFS and wrote shell scripts to productionize the MAP (Member Analytics Platform) project, automated with the Cronacle scheduler
  • Loaded and transformed large sets of structured and semi-structured data.
  • Used ETL (Informatica) to load data from source to Data Warehouse.
  • Involved in writing shell scripts on Unix (AIX) for Informatica ETL tool to run the Sessions.
  • Loaded the Golden collection into Apache Solr using Morphline code for the business team
  • Assisted in exporting analyzed data to relational databases using Sqoop
  • Used Python-based frameworks such as Django and Flask.
  • Designed HBase data models for large transactional sales data
  • Built a proof of concept on Storm for streaming data from one of the sources
  • Built a proof of concept in Pentaho for Big Data
  • Implemented one of the data source transformations in Spark using Scala.
  • Designed a Cassandra data model to connect with Spark
  • Used Teradata FastExport and Parallel Transporter utilities along with Sqoop to extract data and load it into Hadoop
  • Worked in Agile methodology and used iceScrum for development and project tracking
  • Worked with GitHub repositories, including branching and merging
Environment: Hadoop, HDFS, Pig, Hive, Impala, Solr, Morphline, MapReduce, Sqoop, HBase, shell, Pentaho, Spark, Scala, Teradata (FastExport and Parallel Transporter utilities), GitHub, Storm, CDH 5.0, HDP, Big Data, Python
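
A minimal sketch of a data-cleaning mapper of the kind mentioned in this section; the delimiter, field count, and counter names are assumptions for illustration only:

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical data-cleaning mapper: drops malformed delimited records
// and emits the cleaned line unchanged.
public class CleanRecordsMapper
        extends Mapper<LongWritable, Text, NullWritable, Text> {

    private static final int EXPECTED_FIELDS = 12;    // assumed record layout

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString().trim();
        String[] fields = line.split("\\|", -1);      // pipe-delimited input (assumption)

        if (line.isEmpty() || fields.length != EXPECTED_FIELDS) {
            context.getCounter("cleaning", "dropped").increment(1);
            return;                                   // skip malformed records
        }
        context.write(NullWritable.get(), new Text(line));
    }
}
```

Configured as a map-only job (zero reducers), this drops malformed records while counting them for later review.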

Confidential, Sunnyvale, CA

Big Data Engineer

Responsibilities:

  • Worked on analyzing Hadoop cluster using different big data analytic tools including Hive, and MapReduce
  • Worked on debugging, performance tuning of Hive Jobs
  • Migrated tables from RCFile to ORC format and worked with data ingestion and other customized file formats.
  • Wrote Autosys jobs to schedule the reports
  • Implemented test scripts to support test driven development and continuous integration
  • Involved in loading data from LINUX file system to HDFS.
  • Extensively used ETL Informatica tool to extract data stored in MS SQL 2000, Excel, and Flat files and finally loaded into a single Data Warehouse.
  • Involved in preparing ETL mapping specification documents and Transformation rules for the mapping.
  • Supported MapReduce programs running on the cluster
  • Gained experience in managing and reviewing Hadoop log files
  • Involved in scheduling Oozie workflow engine to run multiple Hive jobs
  • Built a proof of concept on Spark for interaction source transformations
  • Used Apache Solr to search for specific products each cycle for the business
  • Worked on Kafka for live streaming of data (a consumer sketch follows the Environment line below)
  • Designed and documented project use cases, wrote test cases, led the offshore team, and interacted with the client.

Environment: Hadoop, HDFS, Hive, MapReduce, Oozie, Autosys, shell, Big Data, Storm, Kafka, Flume, HDP 2.x, Spark, Python
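
A minimal sketch of a Kafka consumer for the live-streaming work mentioned in this section, assuming a kafka-clients Java API version that supports poll(Duration); the broker address, topic, and group id are placeholders:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Hypothetical sketch of consuming a live event stream from Kafka.
public final class LiveStreamConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");          // placeholder broker
        props.put("group.id", "live-stream-poc");                 // placeholder group
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("live-events"));  // placeholder topic
            while (true) {
                ConsumerRecords<String, String> records =
                        consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // Downstream processing would go here; printing keeps the sketch minimal.
                    System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
                }
            }
        }
    }
}
```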

Confidential, Washington, DC

Sr. Java Developer

Responsibilities:

  • Developed web components using JSP, Servlets and JDBC
  • Designed tables and indexes
  • Created Design specification using UML Class Diagrams, Sequence & Activity Diagrams
  • Developed the web application using MVC architecture with Java, JSP, Servlets, and an Oracle database.
  • Developed various Java classes, SQL queries and procedures to retrieve and manipulate the data from backend Oracle database using JDBC
  • Extensively worked with JavaScript for front-end validations.
  • Designed, Implemented, Tested and Deployed Enterprise Java Beans both Session and Entity using WebLogic as Application Server
  • Developed stored procedures, packages, and database triggers to enforce data integrity. Performed data analysis and created Crystal Reports for user requirements
  • Provided quick turnaround, resolving issues within the SLA.
  • Implemented the presentation layer with HTML, XHTML and JavaScript
  • Used EJBs to develop business logic and coded reusable components in Java Beans
  • Developed database interaction code against the JDBC API, making extensive use of SQL query statements and advanced PreparedStatements (a JDBC sketch follows the Environment line below)
  • Used connection pooling for best optimization using JDBC interface
  • Used EJB entity and session beans to implement business logic and session handling and transactions. Developed user-interface using JSP, Servlets, and JavaScript
  • Wrote complex SQL queries and stored procedures
  • Actively involved in the system testing
  • Prepared the Installation, Customer guide and Configuration document which were delivered to the customer along with the product

Environment: Windows NT/2000/2003, XP, Windows 7/8, C, Java, UNIX, SQL using TOAD, Finacle Core Banking, CRM 10209, Microsoft Office Suite, Microsoft Project
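
A minimal sketch of the pooled JDBC/PreparedStatement pattern described in this section; the DAO, table, and column names are illustrative, and the sketch uses try-with-resources for brevity even though the original codebase may have predated it:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

// Hypothetical DAO: looks up a single value via a pooled connection.
public final class AccountDao {

    private final DataSource dataSource;   // connection pool configured in the app server

    public AccountDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public String findAccountName(long accountId) throws SQLException {
        String sql = "SELECT account_name FROM accounts WHERE account_id = ?";
        try (Connection conn = dataSource.getConnection();
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, accountId);                       // bind parameter safely
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString("account_name") : null;
            }
        }
    }
}
```

Borrowing connections from a DataSource-backed pool, rather than opening them directly, is what gives the optimization referred to in the bullet on connection pooling.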

Confidential

Java Developer

Responsibilities:

  • Involved in the design and development phases of Rational Unified Process (RUP).
  • Designed class diagrams, sequence diagrams, and object diagrams using IBM Rational Rose for modeling
  • Built the application on MVC architecture with JSP 1.2 as the presentation layer and Servlets as controllers; developed it using the Jakarta Struts 1.1 framework, including action classes and form beans
  • Used the Struts Validation Framework to validate front-end forms.
  • Extensively used XML Web Services for transferring/retrieving data between different providers.
  • Developed complete Business tier with Session beans and CMP Entity beans with EJB 2.0 standards using JMS Queue communication in authorization module.
  • Designed and implemented Business Delegate, Session Facade and DTO Design Patterns
  • Involved in implementing the DAO pattern
  • Used the JAXB API to bind XML Schema to Java classes (a JAXB sketch follows the Environment line below)
  • Worked on report generation in the database, written in PL/SQL
  • Used Maven for building the enterprise application modules
  • Used Log4J to monitor the error logs
  • Used JUnit for unit testing
  • Used SVN for Version control
  • Deployed the applications on WebLogic Application Server.

Environment: Struts 1.1, EJB 2.0, Servlets 2.3, JSP 1.2, SQL, XML, XSLT, Web Services, JAXB, SOAP, WSDL, JMS1.1, JavaScript, TDD, JDBC, Oracle 9i, PL/SQL, Log4J, JUnit, WebLogic, Eclipse, Rational XDE, SVN, Linux
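
A minimal sketch of the JAXB binding mentioned in this section; the Customer type and its fields are hypothetical, not taken from the original XML Schema:

```java
import java.io.File;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Unmarshaller;
import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlRootElement;

// Hypothetical schema-bound type; element names are illustrative.
@XmlRootElement(name = "customer")
@XmlAccessorType(XmlAccessType.FIELD)
class Customer {
    private String name;
    private String accountNumber;

    public String getName() { return name; }
    public String getAccountNumber() { return accountNumber; }
}

// Unmarshals a customer XML document into the bound Java class.
public final class CustomerXmlReader {
    public static Customer read(File xmlFile) throws JAXBException {
        JAXBContext context = JAXBContext.newInstance(Customer.class);
        Unmarshaller unmarshaller = context.createUnmarshaller();
        return (Customer) unmarshaller.unmarshal(xmlFile);
    }
}
```

In practice the bound classes would usually be generated from the XSD with the xjc tool rather than written by hand.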
