Hadoop Developer/Admin Resume
Foster City, CA
SUMMARY
- 8+ years of professional IT experience in requirements gathering, design, development, testing, implementation, and maintenance, with progressive experience in all phases of the iterative Software Development Life Cycle (SDLC).
- Good knowledge of Hadoop cluster architecture and experience monitoring the cluster.
- In-depth understanding of data structures and algorithms.
- Experience in managing and reviewing Hadoop log files.
- Excellent understanding and knowledge of NoSQL databases such as HBase and Cassandra.
- Experience in setting up standards and processes for Hadoop-based application design and implementation.
- Experience in importing and exporting data with Sqoop between HDFS and relational database systems.
- Experience in Object-Oriented Analysis and Design (OOAD) and software development using UML methodology; good knowledge of J2EE and Core Java design patterns.
- Excellent understanding and knowledge of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm (see the sketch after this list).
- Hands-on experience in installing, configuring, and using Hadoop ecosystem components such as MapReduce, HDFS, HBase, Hive, Sqoop, Pig, ZooKeeper, and Flume.
- Good exposure to Apache Hadoop MapReduce programming, Pig scripting, distributed applications, and HDFS.
- Experience in coordinating .NET application development from onsite, with version control using TFS and code publish/build to QA and production environments using MS Visual Studio.
- Experience in managing Hadoop clusters using the Cloudera Manager tool.
- Very good experience in the complete project life cycle (design, development, testing, and implementation) of client-server and web applications.
- Experience in administering, installing, configuring, troubleshooting, securing, backing up, performance monitoring, and fine-tuning Red Hat Linux.
- Extensive experience working with Oracle, DB2, SQL Server, and MySQL databases.
- Hands-on experience with VPN, PuTTY, WinSCP, VNC Viewer, etc.
- Scripting to deploy monitors and checks and to automate critical system administration functions.
- Hands-on experience in application development using Java, RDBMS, and Linux shell scripting.
- Ability to adapt to evolving technology, strong sense of responsibility and accomplishment.
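As a minimal, illustrative sketch of the MapReduce programming paradigm referenced above: a word-count style mapper and reducer on the org.apache.hadoop.mapreduce API. The class names and tokenization rules are assumptions for illustration, not code from any of the engagements below.

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Illustrative word-count mapper: emits (token, 1) for every token in a line.
public class TokenCountMapper
        extends Mapper<LongWritable, Text, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        for (String token : value.toString().split("\\s+")) {
            if (!token.isEmpty()) {
                word.set(token);
                context.write(word, ONE);
            }
        }
    }
}

// Matching reducer: sums the counts emitted for each token.
class TokenCountReducer
        extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        context.write(key, new IntWritable(sum));
    }
}
```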
TECHNICAL SKILLS
Big Data Ecosystem: HDFS, HBase, Impala, Hadoop MapReduce, ZooKeeper, Hive, Pig, Sqoop
Databases: Oracle … (SQL & PL/SQL), Sybase ASE 12.5, DB2, MS SQL Server, MySQL
Programming Languages and Scripting: SQL, PL/SQL, C, C++, PHP, Python, Core Java, JavaScript, Shell Script
Web Technologies: HTML, XML, AJAX, SOAP, ODBC, JDBC, JavaBeans, EJB, MVC, JSP, Servlets, JavaMail, Struts, JUnit
Frameworks: MVC, Spring, Struts, Hibernate, .NET
Configuration Management Tools: TFS, CVS
IDE / Testing Tools: Eclipse
Data Warehousing and NoSQL Databases: Netezza, HBase
Methodologies: Agile, V-model
Operating System: Windows, UNIX, Linux
Software Products: PuTTY, Eclipse, Toad 9.1, DB Visualizer, Comptel's AMD 6.0.3 & 4.0.3, InterConnecT v7.1 & 6.0.7, MS Project 2003, HP Quality Center, MS Management Studio, MS SharePoint
PROFESSIONAL EXPERIENCE
Confidential, Foster City, CA
Hadoop Developer/Admin
Responsibilities:
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Developed workflows using custom MapReduce, Pig, Hive, and Sqoop.
- Tuned the cluster for optimal performance when processing large data sets.
- Built reusable Hive UDF libraries for business requirements, enabling users to apply these UDFs in Hive queries (see the first sketch after this list).
- Preprocessed logs and semi-structured content stored on HDFS using Pig, and imported the processed data into the Hive warehouse, enabling business analysts to write Hive queries.
- Configured big data workflows to run on top of Hadoop using Control-M; these workflows comprise heterogeneous jobs such as Pig, Hive, Sqoop, and MapReduce.
- Developed a suite of unit test cases for Mapper, Reducer, and Driver classes using the MRUnit testing library (see the second sketch after this list).
- Developed workflows in Control-M to automate loading data into HDFS and preprocessing it with Pig.
- Used Maven extensively to build JAR files of MapReduce programs and deploy them to the cluster.
- Bug fixing and 24x7 production support.
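A minimal sketch of what one of the reusable Hive UDFs could look like, written against the classic org.apache.hadoop.hive.ql.exec.UDF API that matches the Hive 0.7 era; the TrimAndLower name and its behavior are illustrative assumptions, not the actual business UDFs.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical example UDF: normalizes a string column by trimming
// whitespace and lower-casing it. Hive resolves the evaluate() method
// by reflection on the classic UDF API.
public class TrimAndLower extends UDF {

    private final Text result = new Text();

    public Text evaluate(Text input) {
        if (input == null) {
            return null; // preserve SQL NULL semantics
        }
        result.set(input.toString().trim().toLowerCase());
        return result;
    }
}
```

Once packaged into a JAR, a UDF like this is registered per session with `ADD JAR hive-udfs.jar;` followed by `CREATE TEMPORARY FUNCTION trim_lower AS 'TrimAndLower';`.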
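The MR testing library referred to above is presumably Apache MRUnit; below is a minimal sketch of a mapper-level test, written against the hypothetical TokenCountMapper from the summary sketch and assuming MRUnit's new-API MapDriver.

```java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Before;
import org.junit.Test;

// Unit test for the mapper alone: MRUnit runs the map() method in-process,
// with no cluster or MiniCluster required.
public class TokenCountMapperTest {

    private MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;

    @Before
    public void setUp() {
        mapDriver = MapDriver.newMapDriver(new TokenCountMapper());
    }

    @Test
    public void emitsOneCountPerToken() throws Exception {
        mapDriver.withInput(new LongWritable(0), new Text("alpha beta"))
                 .withOutput(new Text("alpha"), new IntWritable(1))
                 .withOutput(new Text("beta"), new IntWritable(1))
                 .runTest();
    }
}
```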
Environment: CDH3, Pig (0.8.1), Hive (0.7.1), Sqoop (v1), Oozie (v2.3.2), Core Java, Oracle 11g/10g, SQL Server 2008, HBase, Cloudera Hadoop Distribution, MapReduce, DataStax, IBM DataStage 8.1, PL/SQL, SQL*Plus, Toad 9.6, Windows NT, Linux, UNIX Shell Scripting.
Confidential, New York, NY
Hadoop Developer/Admin
Responsibilities:
- Involved in reviewing functional and non-functional requirements.
- Facilitated knowledge transfer sessions.
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Imported and exported data into HDFS and Hive using Sqoop.
- Defined job flows.
- Managed and reviewed Hadoop log files.
- Extracted files from RDBMS through Sqoop, placed them in HDFS, and processed them.
- Ran Hadoop Streaming jobs to process terabytes of XML-format data.
- Gained good experience with NoSQL databases.
- Supported MapReduce programs running on the cluster.
- Involved in loading data from the UNIX file system to HDFS.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Replaced Hive's default Derby metastore with MySQL.
- Executed queries using Hive and developed MapReduce jobs to analyze data.
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Developed Pig UDFs to preprocess the data for analysis (see the sketch after this list).
- Developed Hive queries for the analysts.
- Involved in loading data from Linux and UNIX file systems to HDFS.
- Supported setting up the QA environment and updating configurations for implementing Pig scripts.
- Developed a custom file system plugin for Hadoop so it can access files on the data platform. This plugin allows Hadoop MapReduce programs, HBase, Pig, and Hive to work unmodified and access files directly.
- Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
- Extracted feeds from social media sites such as Facebook and Twitter using Python scripts.
- Set up and benchmarked Hadoop/HBase clusters for internal use.
- Set up a Hadoop cluster on Amazon EC2 using Whirr for a POC.
- Wrote a recommendation engine using Mahout.
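A minimal sketch of the kind of Pig UDF used for preprocessing, written against Pig's Java EvalFunc API; the StripQueryString name and log-cleaning behavior are illustrative assumptions rather than the actual UDFs.

```java
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical preprocessing UDF: strips query strings from URLs in
// web server log records before analysis. Pig calls exec() once per
// input tuple.
public class StripQueryString extends EvalFunc<String> {

    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        String url = input.get(0).toString();
        int q = url.indexOf('?');
        return q >= 0 ? url.substring(0, q) : url;
    }
}
```

Such a UDF would be packaged into a JAR, registered in a Pig Latin script with `REGISTER`, and applied per field, e.g. `FOREACH logs GENERATE StripQueryString(url);`.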
Environment: Java (JDK 1.6), Eclipse, Subversion, Hadoop Distribution of Cloudera, Hive, HBase, Cassandra, MapReduce, HDFS, Pig, IBM DataStage 8.1, Oracle 11g/10g, PL/SQL, SQL*Plus, Toad 9.6, Windows NT, Linux, UNIX Shell Scripting.
Confidential, Albany, NY
Java Developer
Responsibilities:
- Involved in business requirements analysis.
- Built the application using the Struts framework with JSP as the view layer.
- Developed Dispatch Actions, Action Forms, and custom tag libraries in the Struts framework (see the sketch after this list).
- Designed JSP pages as views in Struts for front-end templates.
- Developed Session Beans to handle back-end business requirements.
- Used the RSD IDE for development and ClearCase for versioning.
- Involved in configuring resources and administering WebSphere Application Server 6.
- Built and deployed the application on WebSphere Application Server.
- Wrote stored procedures in DB2.
- Developed code to handle web requests involving Request Handlers, Business Objects, and Data Access Objects. Coded package structures based on the purpose and security concerns handled by each package, which assists developers in future enhancements or modifications of the code.
- Implemented client-side validation with JavaScript.
- Involved in code reviews, system integration, and testing. Developed unit test cases using the JUnit framework.
- Involved in deploying the application on UNIX boxes (DEV, QA, and Prod environments).
- Used the change management tool Service Center to promote the WAR file from one environment to another.
- Involved in user acceptance testing, bug fixing, and production support.
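A minimal sketch of the DispatchAction pattern mentioned above, assuming Struts 1.x; the class, method, and forward names are illustrative assumptions, not the project's actual actions.

```java
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.struts.action.ActionForm;
import org.apache.struts.action.ActionForward;
import org.apache.struts.action.ActionMapping;
import org.apache.struts.actions.DispatchAction;

// DispatchAction routes to a method named by a request parameter
// (configured via the "parameter" attribute in struts-config.xml),
// so related operations share one action class.
public class AccountDispatchAction extends DispatchAction {

    public ActionForward view(ActionMapping mapping, ActionForm form,
                              HttpServletRequest request,
                              HttpServletResponse response) throws Exception {
        // Hypothetical: load account data into request scope here.
        return mapping.findForward("view");
    }

    public ActionForward save(ActionMapping mapping, ActionForm form,
                              HttpServletRequest request,
                              HttpServletResponse response) throws Exception {
        // Hypothetical: persist the submitted ActionForm here.
        return mapping.findForward("success");
    }
}
```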
Environment: Java, J2EE, Apache Struts, WebSphere 5 & 6, JNDI, JDBC, JSP, UNIX, Windows NT, DB2, SQL Server.
Confidential, Shelton, CT
Java Developer
Responsibilities:
- Involved in Analysis, Design, Implementation, and Testing of the project.
- Implemented the presentation layer with HTML, XHTML, JavaScript, and CSS.
- Developed web components using JSP, Servlets and JDBC.
- Implemented the database using SQL Server.
- Designed tables and indexes.
- Wrote complex T-SQL and stored procedures.
- Involved in fixing defects and unit testing with test cases using JUnit.
- Developed user and technical documentation.
- Created database tables and wrote queries and stored procedures.
- Coded Java, JSP, and Servlets using the extended Contata Struts framework.
- Used JNI to call libraries and other functionality implemented in C (see the JNI sketch after this list).
- Involved in writing programs for XA transaction management across the application's multiple databases.
- Wrote stored procedures and functions (T-SQL, equivalent to PL/SQL) in the SQL Server database.
- Used the StAX API / JAXP to read and manipulate XML properties files (see the StAX sketch after this list).
- Code review and deployment.
- JUnit testing.
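A minimal JNI sketch of the Java side of the C integration mentioned above; the library and method names are illustrative assumptions.

```java
// Hypothetical JNI binding: declares a native method implemented in a
// C shared library and loads that library at class-initialization time.
public class NativePricing {

    static {
        // Resolves libpricing.so / pricing.dll on java.library.path;
        // "pricing" is an assumed library name for illustration.
        System.loadLibrary("pricing");
    }

    // Implemented in C; the matching header is generated from this class.
    public static native double computeRate(double principal, int termMonths);
}
```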
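And a minimal sketch of cursor-style StAX parsing as mentioned above; the `<properties><property name="...">value</property></properties>` file layout is an assumption for illustration.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Cursor-style StAX parsing: pull events one at a time instead of
// building a whole DOM tree, which keeps memory use flat.
public class XmlPropertiesReader {

    public static Map<String, String> read(String path) throws Exception {
        Map<String, String> props = new LinkedHashMap<String, String>();
        XMLInputFactory factory = XMLInputFactory.newInstance();
        InputStream in = new FileInputStream(path);
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(in);
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "property".equals(reader.getLocalName())) {
                    String name = reader.getAttributeValue(null, "name");
                    props.put(name, reader.getElementText());
                }
            }
            reader.close();
        } finally {
            in.close();
        }
        return props;
    }
}
```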
Environment: Java, JSP, Servlets, JDBC, JavaScript, CSS, MySQL, JUnit, Eclipse, JBoss.