Hadoop Developer Resume
Columbus, OH
PROFESSIONAL SUMMARY:
- 7+ years of progressive experience across the complete software development life cycle, including requirement gathering, design, development, testing, implementation, and maintenance using SQL, Java/J2EE, and Big Data technologies.
- Around 3 years of experience as a Hadoop Developer with good knowledge of the Hadoop framework, the Hadoop Distributed File System (HDFS), parallel processing architecture, and data analytics.
- Hands-on experience with major Big Data components such as HDFS, MapReduce, Hive, Pig, Sqoop, Oozie, Flume, ZooKeeper, and Avro, and with NoSQL databases such as HBase and Cassandra.
- Hands-on experience importing/exporting data using the Hadoop data management tool Sqoop.
- Hands-on experience writing complex MapReduce programs that implement common analytic patterns, including joins, sampling, data organization, filtering, and summarization.
- Hands-on experience writing MapReduce programs in Java and Python.
- Hands-on experience creating Hive tables, working with them using HiveQL, and writing Hive queries for data analysis to meet business requirements.
- Experience writing custom UDFs, UDAFs, and UDTFs to extend Hive and Pig core functionality (see the UDF sketch at the end of this summary).
- Hands-on experience writing Pig Latin scripts to perform ETL operations.
- Hands-on experience performing real-time analytics on big data using HBase and Cassandra.
- Experience using Flume to stream data into HDFS.
- Experience with the Oozie workflow engine, running workflow jobs with actions that execute Hadoop MapReduce and Pig jobs.
- Good practical understanding of cloud infrastructure such as Amazon Web Services (AWS).
- Experienced in configuring and monitoring large clusters on distributions such as Cloudera and Hortonworks.
- Monitored multiple Hadoop cluster environments using Cloudera Manager and Ganglia.
- Experience in Software Development Life Cycle (Requirements Analysis, Design, Development, Testing, Deployment and Support).
- Extensive experience in middle-tier development using J2EE technologies such as JDBC, JNDI, JSP, Servlets, JSF, Struts, Spring, Hibernate, and EJB.
- Experienced working in SOA architectures, implementing SOAP/REST web services that integrate with multiple applications.
- Experience with web-based UI development using jQuery UI, jQuery, CSS, HTML, HTML5, XHTML, and JavaScript.
- Experience using IDEs such as Eclipse and NetBeans.
- Experience with build tools like Maven and Ant.
- Development experience with DBMSs such as Oracle, MS SQL Server, Teradata, and MySQL.
- Developed stored procedures and queries using PL/SQL.
- Expertise in RDBMSs such as Oracle, MS SQL Server, MySQL, and DB2.
- Supported development, testing, and operations teams during new system deployments.
- Evaluated and proposed new tools and technologies to meet the needs of the organization.
- An excellent team player and self-starter with good communication skills and a proven ability to finish tasks ahead of target deadlines.
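As a concrete illustration of the Hive UDF work above, here is a minimal sketch of a scalar UDF built on the classic org.apache.hadoop.hive.ql.exec.UDF API; the package, class name, and rating scale are hypothetical, not taken from any project in this resume.

```java
package com.example.hive.udf; // hypothetical package

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.FloatWritable;

// Normalizes a raw rating to a 0-1 scale; Hive finds evaluate() by reflection.
public final class NormalizeRating extends UDF {
    private static final float MAX_RATING = 5.0f; // assumed rating scale

    public FloatWritable evaluate(FloatWritable rawRating) {
        if (rawRating == null) {
            return null; // propagate SQL NULL
        }
        return new FloatWritable(rawRating.get() / MAX_RATING);
    }
}
```

After packaging the class into a JAR, it would be registered in a Hive session with ADD JAR followed by CREATE TEMPORARY FUNCTION normalize_rating AS 'com.example.hive.udf.NormalizeRating', and then used like any built-in function in HiveQL.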
WORK EXPERIENCE:
Confidential, Columbus, OH
Hadoop Developer
Responsibilities:
- Involved in cluster setup, monitoring, and administration tasks such as commissioning and decommissioning nodes and assigning quotas.
- Ran multi-job workflows using the Oozie client API and Java schedulers.
- Implemented a sentiment analytics tool that analyzes large XML files using MapReduce programs.
- Handled Avro and JSON data files using the Avro data serialization system.
- Optimized the sort & shuffle phase of the MapReduce framework.
- Implemented custom counters to save log information to an external system (see the counter sketch at the end of this section).
- Implemented map-side, reduce-side, and otherwise optimized joins.
- Involved in creating Hive tables and partitions, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Implemented an analytical platform that used HiveQL functions and different join strategies such as map joins and bucketed map joins.
- Developed Hive UDFs for rating aggregation.
- Optimized Hive queries and joins; hands-on experience with Hive performance tuning.
- Used Oozie for job scheduling.
- Developed an HBase Java client for CRUD operations.
- Wrote unit test cases using MRUnit, JUnit, and EasyMock.
- Imported and exported data between HDFS and Hive using Sqoop; involved in loading data from the UNIX file system into HDFS.
- Optimized MapReduce jobs to use HDFS efficiently via compression codecs such as LZO and Snappy.
- Imported data from various sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from relational databases into HDFS using Sqoop.
- Managed and reviewed log files using the web UI and Cloudera Manager.
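To make the custom-counter bullet concrete, here is a minimal sketch of a mapper that counts malformed records; the class name, counter names, and tab-delimited record layout are illustrative assumptions.

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class ParseMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    // Custom counter group; the names are hypothetical.
    public enum ParseCounters { VALID_RECORDS, MALFORMED_RECORDS }

    private static final IntWritable ONE = new IntWritable(1);
    private final Text outKey = new Text();

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        String[] fields = line.toString().split("\t");
        if (fields.length < 2) {
            // Count the bad record instead of failing the task.
            context.getCounter(ParseCounters.MALFORMED_RECORDS).increment(1);
            return;
        }
        context.getCounter(ParseCounters.VALID_RECORDS).increment(1);
        outKey.set(fields[0]);
        context.write(outKey, ONE);
    }
}
```

After job.waitForCompletion(true), the driver can read the totals via job.getCounters().findCounter(...).getValue() and forward them to an external log store.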
Environment: Hadoop, MapReduce, HDFS, Hive, HiveQL, Sqoop, Oozie, Avro data serialization, Cloudera, Java, MySQL, SQL, Unix, Eclipse, Maven, JUnit, Jenkins
Confidential, Concord, NH
Hadoop Developer
Responsibilities:
- Loaded data into HDFS and extracted data from MySQL into HDFS using Sqoop.
- Performed analytics on time-series data using HBase and its Java API.
- Supported MapReduce programs running on the cluster and wrote MapReduce jobs using the Java API.
- Implemented a proof of concept in HBase to settle tall-narrow versus flat-wide table design decisions.
- Accessed HBase through different client APIs, including the Thrift, Java, and REST APIs (see the CRUD sketch at the end of this section).
- Integrated HBase with MapReduce to move bulk data into HBase.
- Implemented Hive UDFs to validate data against business rules before moving it into Hive tables.
- Accessed Hive tables from Java applications via JDBC to perform analytics.
- Joined different data sets using Pig join operations and performed the queries in Pig scripts.
- Worked with Pig Latin operations and wrote Pig UDFs to perform analytics.
- Implemented Unix shell scripts to perform cluster admin operations.
- Exported the analyzed data to relational databases using Sqoop for visualization and report generation.
- Analyzed the data with Hive queries and Pig scripts to understand user behavior.
- Optimized MapReduce jobs to use HDFS efficiently by applying various compression mechanisms.
- Configured Flume to extract data from web server output files and load it into HDFS.
- Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
- Monitored and debugged the cluster using Ganglia.
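As a concrete companion to the HBase client work above, here is a minimal CRUD sketch against the HBase 1.x Connection/Table Java API (older deployments used HTable directly); the table name, column family, and row-key scheme are hypothetical.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseCrudExample {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        byte[] rowKey = Bytes.toBytes("sensor42#1400000000"); // sensor id + timestamp
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("metrics"))) {
            // Create/update a cell in column family "d".
            Put put = new Put(rowKey);
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("value"), Bytes.toBytes("73.5"));
            table.put(put);

            // Read the cell back.
            Result result = table.get(new Get(rowKey));
            byte[] value = result.getValue(Bytes.toBytes("d"), Bytes.toBytes("value"));
            System.out.println("value = " + Bytes.toString(value));

            // Delete the whole row.
            table.delete(new Delete(rowKey));
        }
    }
}
```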
Environment: Hortonworks, HBase, MapReduce, HDFS, Hive, Pig, Oozie, Sqoop, Flume, Ganglia, Oracle 10g
Confidential, San Francisco, CA
Java/Hadoop Developer
Responsibilities:
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing (see the mapper sketch at the end of this section).
- Migrated existing SQL queries to HiveQL queries to move to a big data analytics platform.
- Integrated the Cassandra file system with Hadoop using MapReduce to perform analytics on Cassandra data.
- Implemented real-time analytics on Cassandra data using the Thrift API.
- Responsible for managing data coming from different sources.
- Supported MapReduce programs running on the cluster.
- Involved in loading data from UNIX file system to HDFS.
- Worked on cluster installation, commissioning and decommissioning of data nodes, name node recovery, capacity planning, and slot configuration.
- Loaded and transformed large data sets into HDFS using hadoop fs commands.
- Helped set up and update configurations for implementing scripts with Pig and Sqoop.
- Designed the logical and physical data model and wrote DML scripts for an Oracle 9i database.
- Used the Hibernate ORM framework with the Spring framework for data persistence.
- Wrote JUnit test cases for unit testing of classes.
- Developed templates and screens in HTML and JavaScript.
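To illustrate the data-cleaning jobs mentioned above, here is a minimal mapper sketch; the plain-text record format and class name are assumptions for the example.

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Drops blank records and normalizes whitespace before downstream preprocessing.
public class CleaningMapper extends Mapper<LongWritable, Text, NullWritable, Text> {
    private final Text cleaned = new Text();

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        // Collapse repeated whitespace and trim the record.
        String normalized = line.toString().trim().replaceAll("\\s+", " ");
        if (normalized.isEmpty()) {
            return; // skip empty records entirely
        }
        cleaned.set(normalized);
        context.write(NullWritable.get(), cleaned);
    }
}
```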
Environment: Java, HDFS, Cassandra, MapReduce, Sqoop, JUnit, HTML, JavaScript, Hibernate, Spring, Pig.
Confidential, Boston, MA
SQL/Java developer
Responsibilities:
- Involved in complete requirement analysis, design, coding and testing phases of the project.
- Implemented the project according to the Software Development Life Cycle (SDLC).
- Developed JavaScript behavior code for user interaction.
- Developed the UI using HTML, JavaScript, and JSP.
- Used JDBC to manage connectivity for inserting, querying, and data management, including stored procedures and triggers (see the JDBC sketch at the end of this section).
- Designed the logical and physical data model, generated DDL scripts, and wrote DML scripts for the SQL Server database.
- Part of a team responsible for metadata maintenance and synchronization of data from the database.
- Involved in the design and coding of the data capture, presentation, and component templates.
- Developed an API to write XML documents from database.
- Used JavaScript to design the user interface and check validations.
- Developed JUnit test cases and validated user input using regular expressions in JavaScript as well as on the server side.
- Developed complex SQL stored procedures, functions and triggers.
- Mapped business objects to database using Hibernate.
- Wrote SQL queries, stored procedures and database triggers as required on the database objects.
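As a concrete example of the JDBC stored-procedure work above, here is a minimal sketch that calls a SQL Server stored procedure through CallableStatement; the connection string, credentials, and insert_order procedure are hypothetical.

```java
import java.math.BigDecimal;
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Types;

public class OrderDao {
    public static void main(String[] args) throws SQLException {
        String url = "jdbc:sqlserver://localhost:1433;databaseName=shop"; // assumed connection string
        try (Connection conn = DriverManager.getConnection(url, "app_user", "secret");
             CallableStatement call = conn.prepareCall("{call insert_order(?, ?, ?)}")) {
            call.setInt(1, 1001);                           // customer id (IN)
            call.setBigDecimal(2, new BigDecimal("49.95")); // order amount (IN)
            call.registerOutParameter(3, Types.INTEGER);    // new order id (OUT)
            call.execute();
            System.out.println("created order " + call.getInt(3));
        }
    }
}
```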
Environment: Java, Spring, XML, Hibernate, SQL Server, Junit, JSP, JavaScript.
Confidential
SQL Server Developer
Responsibilities:
- Actively involved in different stages of Project Life Cycle.
- Documented the data flow and the relationships between various entities.
- Actively participated in gathering user requirements and system specifications.
- Created a new logical and physical database design to fit the new business requirements and implemented it using SQL Server.
- Created clustered and non-clustered indexes for improved performance (see the index sketch at the end of this section).
- Created tables, views, and indexes on the database, created roles, and maintained database users.
- Followed and maintained standards and best practices in database development.
- Provided assistance to development teams on tuning data, indexes, and queries.
- Developed new Stored Procedures, Functions, and Triggers.
- Implemented Backup and Recovery of the databases.
- Actively participated in User Acceptance Testing, and Debugging of the system.
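To make the indexing bullet concrete, here is a minimal sketch of the T-SQL involved, issued over JDBC to keep the example in Java; the table, column, and index names are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class IndexSetup {
    public static void main(String[] args) throws SQLException {
        String url = "jdbc:sqlserver://localhost:1433;databaseName=sales"; // assumed
        try (Connection conn = DriverManager.getConnection(url, "dba_user", "secret");
             Statement stmt = conn.createStatement()) {
            // Clustered index: physically orders the table's rows by order date.
            stmt.execute("CREATE CLUSTERED INDEX ix_orders_date ON orders(order_date)");
            // Non-clustered index: secondary lookup path by customer id.
            stmt.execute("CREATE NONCLUSTERED INDEX ix_orders_customer ON orders(customer_id)");
        }
    }
}
```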
Environment: SQL, PL/SQL, Windows 2000/XP, MS SQL Server 2000, IIS.