Hadoop Developer Resume
Oakland, CA
SUMMARY
- Six Sigma (Black Belt) and PMP certified IT professional with over 10 years of experience and an extensive background in the software development life cycle: analysis, design, development, debugging, and deployment of software applications. More than 3 years of hands-on experience with Big Data and the Hadoop ecosystem, including MapReduce, Pig, Hive, and Sqoop, plus 7+ years of working experience with Java/J2EE and C# technologies.
- 10+ years of professional experience in the IT industry, with 3+ years in Hadoop ecosystem implementation, maintenance, ETL/Informatica, and Big Data analysis operations.
- Excellent understanding of Hadoop architecture and underlying framework including storage management.
- Experience in using various Hadoop ecosystem components such as MapReduce, Pig, Hive, ZooKeeper, HBase, Sqoop, Oozie, Flume, and Solr for data storage and analysis.
- Experience in developing custom UDFs for Pig and Hive to incorporate Python/Java methods and functionality into Pig Latin and HiveQL (a minimal Java sketch follows this summary).
- Experience with the Oozie scheduler in setting up workflows of MapReduce and Pig jobs.
- Knowledge of the architecture and functionality of NoSQL databases such as HBase, Cassandra, and MongoDB.
- Experience in managing Hadoop clusters and services using Cloudera Manager.
- Experience in troubleshooting errors in HBase Shell/API, Pig, Hive and MapReduce.
- Experience in importing and exporting data between HDFS and Relational Database Management systems using Sqoop.
- Collected log data from various sources and integrated it into HDFS using Flume.
- Assisted Deployment team in setting up Hadoop cluster and services.
- Hands-on experience in setting up Apache Hadoop and Cloudera CDH clusters on Ubuntu, Fedora and Windows (Cygwin) environments.
- In-depth knowledge of the modifications required to the interfaces (static IP), hosts, and .bashrc files, setting up passwordless SSH, and Hadoop configuration for cluster setup and maintenance.
- Excellent understanding of virtualization, with experience setting up a proof-of-concept multi-node virtual cluster leveraging underlying bridge networking and NAT technologies.
- Experience in loading data into HDFS from Linux (Ubuntu, Fedora, CentOS) file systems.
- Knowledge of project life cycle (design, development, testing and implementation) of Client Server and Web applications.
- Experience in writing shell scripts on Ubuntu/Linux to automate sequential command execution.
- Knowledge of hardware, software, networking, and productivity tools such as Excel and Access, with experience applying them as needed to enhance productivity and ensure accuracy.
- Well versed in object-oriented programming and the software development life cycle, from project definition to post-deployment.
- Experience applying Six Sigma tools and techniques for process improvement during different phases of Six Sigma-driven projects.
- An individual with excellent interpersonal and communication skills, strong business acumen, creative problem solving skills, technical competency, team-player spirit and leadership skills.
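For illustration, a minimal sketch of the kind of custom Hive UDF mentioned above, written against the classic org.apache.hadoop.hive.ql.exec.UDF API; the package, class, and function names are hypothetical, not taken from any actual project.

```java
package com.example.hive.udf; // illustrative package name

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

/**
 * Hypothetical Hive UDF that trims and lower-cases a string column.
 * Hive discovers the evaluate() method by reflection.
 */
public final class NormalizeText extends UDF {
    public Text evaluate(final Text input) {
        if (input == null) {
            return null; // propagate SQL NULL
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}
```

Once packaged in a JAR, such a function would typically be registered in HiveQL with ADD JAR followed by CREATE TEMPORARY FUNCTION normalize_text AS 'com.example.hive.udf.NormalizeText', then used like any built-in function in a SELECT.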
TECHNICAL SKILLS
Big Data/Hadoop Framework: HDFS, MapReduce, Pig, Hive, Sqoop, Oozie, Flume and HBase
Databases: Oracle 9i/10g, Microsoft SQL Server, MySQL
Languages: Java/J2EE, C#, SQL, Python, Linux shell scripting
Office Tools: Microsoft Office Suite
Operating Systems: Windows XP/8, CentOS, Ubuntu
Development Tools: Eclipse, NetBeans, Visual Studio
Development Methodologies: Six Sigma, Agile/Scrum, Waterfall
PROFESSIONAL EXPERIENCE
Confidential, Oakland, CA
Hadoop Developer
Responsibilities:
- Good understanding of and hands-on experience with Hadoop stack internals, Hive, Pig, and MapReduce.
- Deep understanding of schedulers, workload management, availability, scalability and distributed data platforms.
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Python for data cleaning and pre-processing.
- Involved in loading data from LINUX file system to HDFS.
- Wrote MapReduce jobs to discover trends in data usage by users (see the sketch after this section).
- Involved in managing and reviewing Hadoop log files.
- Involved in running Hadoop streaming jobs to process terabytes of text data.
- Loaded and transformed large sets of structured, semi structured and unstructured data.
- Wrote Pig UDFs.
- Developed Hive queries for the analysts.
- Implemented partitioning, dynamic partitions, and bucketing in Hive.
- Exported result sets from Hive to MySQL using shell scripts.
- Used ZooKeeper for various types of centralized configuration.
- Involved in maintaining various LINUX Shell scripts.
- Implemented the Fair Scheduler on the JobTracker to share cluster resources among users' MapReduce jobs.
- Automated all jobs, from pulling data from sources such as MySQL to pushing result sets into HDFS, using Sqoop.
- Involved in streaming data using Spark.
- Used AWS for cluster management and helped the team grow the cluster from 25 to 40 nodes.
- Maintained system integrity of all sub-components (primarily HDFS, MapReduce, HBase, and Flume).
- Monitored system health and logs and responded to warning or failure conditions.
Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Sqoop, Spark, ETL/Informatica, Python, Linux shell scripting.
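A minimal sketch of the kind of usage-trend MapReduce job described above, written against the org.apache.hadoop.mapreduce API (Hadoop 2.x style); the tab-delimited input layout (userId, action, timestamp) and all class names are assumptions for illustration.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/** Hypothetical job: counts log events per user to surface usage trends. */
public class UsageTrendJob {

    public static class UsageMapper
            extends Mapper<LongWritable, Text, Text, LongWritable> {
        private static final LongWritable ONE = new LongWritable(1);
        private final Text userId = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Assumed layout: userId<TAB>action<TAB>timestamp
            String[] fields = value.toString().split("\t");
            if (fields.length >= 1 && !fields[0].isEmpty()) {
                userId.set(fields[0]);
                context.write(userId, ONE); // emit (userId, 1) per event
            }
        }
    }

    public static class SumReducer
            extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text key, Iterable<LongWritable> values, Context context)
                throws IOException, InterruptedException {
            long total = 0;
            for (LongWritable v : values) {
                total += v.get(); // total events for this user
            }
            context.write(key, new LongWritable(total));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "usage-trend");
        job.setJarByClass(UsageTrendJob.class);
        job.setMapperClass(UsageMapper.class);
        job.setCombinerClass(SumReducer.class); // sums are associative, so reuse reducer
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```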
Confidential, Plymouth, MN
Hadoop Developer
Responsibilities:
- Handled importing of data from various sources, performed transformations using Hive and Pig, and loaded data into HDFS.
- Imported and exported data between HDFS/Hive and relational databases using Sqoop.
- Loaded and transformed large sets of structured, semi structured and unstructured data.
- Responsible for managing data coming from different sources.
- Gained good experience with NoSQL databases such as Cassandra and MongoDB.
- Involved in creating Hive tables, loading them with data, and writing Hive queries, which run internally as MapReduce jobs.
- Involved in creating tables and implementing partitioning and bucketing (see the sketch after this section).
Environment: Core Java, MS Excel 2007, Oracle, Apache Hadoop, Pig, Hive, MapReduce, Sqoop, Java/J2EE, Windows.
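A minimal sketch of the Hive partitioning and bucketing referenced above, driven from Java over the HiveServer2 JDBC driver; the connection URL, credentials, table, and column names are illustrative assumptions, not taken from the project.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

/** Hypothetical demo: partitioned, bucketed Hive table via HiveServer2 JDBC. */
public class HivePartitionedTableDemo {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:hive2://hive-server:10000/default", "hive", "");
             Statement stmt = conn.createStatement()) {

            // Partition by load date; bucket by user_id for sampling and joins.
            stmt.execute(
                "CREATE TABLE IF NOT EXISTS events (" +
                "  user_id STRING, action STRING, ts BIGINT)" +
                " PARTITIONED BY (load_date STRING)" +
                " CLUSTERED BY (user_id) INTO 16 BUCKETS");

            // Enable dynamic partitioning so INSERT routes rows by value;
            // staging_events is an assumed source table.
            stmt.execute("SET hive.exec.dynamic.partition=true");
            stmt.execute("SET hive.exec.dynamic.partition.mode=nonstrict");
            stmt.execute("SET hive.enforce.bucketing=true");
            stmt.execute(
                "INSERT INTO TABLE events PARTITION (load_date)" +
                " SELECT user_id, action, ts, load_date FROM staging_events");
        }
    }
}
```

The dynamic partition column (load_date) must come last in the SELECT list so Hive can map it onto the partition spec.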
Confidential, Louisville, KY
Big Data Hadoop Developer
Responsibilities:
- Responsible for writing MapReduce programs.
- Implemented application logic for interacting with HBase (see the sketch after this section).
- Used a fast scripted alternative to Sqoop to automate data transfer from Oracle to HBase.
- Performed data analysis using Hive and Pig.
- Loaded log data into HDFS using Flume.
- Assisted with the addition of Hadoop processing to the IT infrastructure.
- Gained knowledge of creating strategies for risky transactions.
- Successfully loaded files to Hive and HDFS from MongoDB.
- Knowledgeable about reading data from and writing data to Cassandra.
- Responsible for modification of API packages.
- Accessed massive volumes of client data.
- Prepared Multi-cluster test harness to exercise the system for performance and failover.
- Developed a high-performance cache, making the site stable and improving its performance.
- Worked on Cloudera to analyze data present on top of HDFS.
Environment: Hadoop, MapReduce, HDFS, Hive, Java (JDK 1.6), Hadoop distributions from Hortonworks, Cloudera, and MapR, IBM DataStage 8.1 (Designer, Director, Administrator), MongoDB, flat files, Oracle 11g/10g, PL/SQL, SQL*Plus, Toad 9.6, Windows NT, UNIX shell scripting and Pig scripts.
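A minimal sketch of the HBase interaction described above, using the HBase 1.x Connection/Table client API; the table name, column family, and row-key scheme are illustrative assumptions.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

/** Hypothetical demo: one write and one read against an assumed table. */
public class HBaseClientDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("transactions"))) {

            // Write one cell: row key = transaction id, family "d", qualifier "amount".
            Put put = new Put(Bytes.toBytes("txn-0001"));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("amount"),
                          Bytes.toBytes("42.50"));
            table.put(put);

            // Read the same cell back by row key.
            Get get = new Get(Bytes.toBytes("txn-0001"));
            Result result = table.get(get);
            byte[] amount = result.getValue(Bytes.toBytes("d"), Bytes.toBytes("amount"));
            System.out.println("amount = " + Bytes.toString(amount));
        }
    }
}
```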
Confidential, Dallas, TX
Java Developer
Responsibilities:
- Developed the system by following the agile methodology.
- Involved in the implementation of design using vital phases of the Software development life cycle (SDLC) that includes Development, Testing, Implementation and Maintenance Support.
- Applied OOAD principles for the analysis and design of the system.
- Used WebSphere Application Server to deploy the build.
- Developed front-end screens using JSP, HTML, jQuery, JavaScript and CSS.
- Used Spring Framework for developing business objects.
- Used Eclipse for the Development, Testing and Debugging of the application.
- Used the Log4j framework for logging debug, info, and error data.
- Used Oracle 10g Database for data persistence.
- SQL Developer was used as a database client.
- Used WinSCP to transfer files between local and remote systems.
- Performed Test Driven Development (TDD) using JUnit.
- Used Ant script for build automation.
- Used Rational ClearQuest for defect logging and issue tracking.
- Prepared design documents using object oriented technologies.
- Involved in analyzing the requirements, drafted use cases and created UML class and sequence diagrams.
- Used APIs of Java SQL and Swing packages extensively.
- Involved in developing front-end dynamic screens using Swing, JSP, JavaScript, HTML, DHTML and CSS.
- Used Java and JDBC APIs for database access and wrote SQL queries.
- Involved in the development of Servlets.
- Developed and deployed Servlets and JSPs on Tomcat web server.
- Configured web.xml to map incoming requests to servlets (see the sketch after this section).
- Developed an N-tier email message center application with features to filter and read emails, archive them in the message center, and compose new emails.
- Worked on an interface for the accounts department to track publishers' and advertisers' statistics and to receive and make payments to and from clients.
Environment: Windows XP, Unix, Java 5.0, Design Patterns, WebSphere, Apache Ant, J2EE (Servlets, JSP), HTML, JSON, JavaScript, CSS, Eclipse, SQL Developer, JUnit.
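A minimal sketch of a servlet of the kind mapped in web.xml above, answering a GET request with a JDBC lookup; the servlet name, JDBC URL, credentials, and table are illustrative assumptions. In web.xml it would be wired with a matching servlet and servlet-mapping pair.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

/** Hypothetical servlet: looks up an account name by id over JDBC. */
public class AccountLookupServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        String accountId = req.getParameter("id");
        resp.setContentType("text/plain");
        PrintWriter out = resp.getWriter();
        try {
            Class.forName("oracle.jdbc.OracleDriver"); // pre-JDBC4 drivers load explicitly
            try (Connection conn = DriverManager.getConnection(
                     "jdbc:oracle:thin:@dbhost:1521:orcl", "user", "pass");
                 PreparedStatement ps = conn.prepareStatement(
                     "SELECT name FROM accounts WHERE id = ?")) {
                ps.setString(1, accountId); // bind parameter to avoid SQL injection
                try (ResultSet rs = ps.executeQuery()) {
                    out.println(rs.next() ? rs.getString("name") : "not found");
                }
            }
        } catch (Exception e) {
            throw new ServletException(e);
        }
    }
}
```

In practice the connection would come from a container-managed DataSource rather than DriverManager; the direct URL here just keeps the sketch self-contained.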
Confidential, Newton, MA
Java Developer
Responsibilities:
- Participated in requirement analysis and design.
- Extracted the use cases based on business requirements and was involved in creating class diagrams, object interaction diagrams (sequence and process), and activity diagrams.
- Involved in Development, Debugging, and Unit Testing from end-to-end of Product.
- Developed Unit Test Cases and performed unit testing to verify the functionalities.
- Wrote JSP pages and JavaScript validations.
- Developed the UI using XSLT, XML, and JavaScript (see the sketch after this section).
- Worked in different releases and patch versions of the Product.
- Developed action classes and JSP pages.
- Maintained the code repository and versioning using SVN.
- Logged errors and fixed bugs at different stages of development process.
- Actively involved in System Integration Testing (SIT) and User Acceptance Testing (UAT).
Environment: Java/J2EE components, Web services, JavaScript, Oracle 9i, DAO, XML, XSLT, IBM WebSphere 6.1, Ant, SVN, Rational Rose, ClearQuest.
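A minimal sketch of the XSLT-based UI rendering mentioned above, using the standard javax.xml.transform API; the stylesheet and file names are illustrative assumptions.

```java
import java.io.File;

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical demo: render an XML data document to HTML with a stylesheet. */
public class XsltRenderDemo {
    public static void main(String[] args) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();
        // Compile the stylesheet once; a Transformer can be reused per document.
        Transformer transformer =
            factory.newTransformer(new StreamSource(new File("page.xsl")));
        // Transform the data document into HTML for the browser.
        transformer.transform(
            new StreamSource(new File("data.xml")),
            new StreamResult(new File("page.html")));
    }
}
```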
Confidential - Parsippany, NJ
Java Developer
Responsibilities:
- Utilized Agile Methodologies to manage full life-cycle development of the project.
- Implemented MVC design pattern using Struts Framework.
- Created Struts Framework classes to implement the routing logic and call different services (see the sketch after this section).
- Created tile definitions, Struts-config files, validation files and resource bundles for all modules using Struts framework.
- Developed the web application using JSP custom tag libraries and Struts Action classes.
- Designed Java Servlets and Objects using J2EE standards.
- Used JSP for the presentation layer; developed high-performance object/relational persistence and query services for the entire application using Hibernate.
- Developed the XML Schema and Web services for the data maintenance and structures.
- Used WebSphere Application Server to develop and deploy the application.
- Worked with Cascading Style Sheets (CSS).
- Involved in coding for JUnit Test cases.
Environment: Java/J2EE, Oracle 11g, SQL, JSP, Struts 1.2, Hibernate 3, WebLogic 10.0, HTML, AJAX, JavaScript, JDBC, XML, JMS, UML, JUnit, Log4j, WebSphere, MyEclipse
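A minimal sketch of a Struts 1.x Action of the kind described above: routing logic that delegates to a service and forwards by outcome. The class, forward, and service names are illustrative assumptions, and "success" would have to match a forward defined in struts-config.xml.

```java
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.struts.action.Action;
import org.apache.struts.action.ActionForm;
import org.apache.struts.action.ActionForward;
import org.apache.struts.action.ActionMapping;

/** Hypothetical Struts 1.x Action: fetch a record and forward to the view. */
public class LookupAction extends Action {
    @Override
    public ActionForward execute(ActionMapping mapping, ActionForm form,
                                 HttpServletRequest request,
                                 HttpServletResponse response) throws Exception {
        String id = request.getParameter("id");
        // Delegate to a placeholder service layer; stash the result for the JSP.
        request.setAttribute("record", new LookupService().findById(id));
        // "success" must match a <forward> entry in struts-config.xml.
        return mapping.findForward("success");
    }

    /** Placeholder service; a real application would call the business layer. */
    static class LookupService {
        String findById(String id) {
            return "record-" + id;
        }
    }
}
```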