Hadoop Consultant Resume
Bellevue, WA
SUMMARY:
- IT professional with over 7 years of diversified experience and an extensive background in the software development lifecycle: analysis, design, development, debugging, and deployment of various software applications. More than 3 years of hands-on experience with Big Data and the Hadoop ecosystem in ingestion, storage, querying, processing, and analysis using MapReduce, Pig, Hive, Sqoop, etc., plus 3 years of working experience with Java/J2EE technologies.
- Over 3 years of professional work experience in Hadoop (Cloudera distributions CDH3, CDH4, and CDH5; Hortonworks HDP; MapR)
- Worked with clusters holding over 120 TB of data.
- Extensive experience with both MapReduce v1 (MRv1) and MapReduce v2 (YARN)
- Extensive experience in writing MapReduce programs in Java.
- Extensive experience in HDFS, Pig, Hive, Sqoop, Flume, Oozie, ZooKeeper and HBase.
- Experience with Cloudera CDH3, CDH4 and CDH5 distributions
- Extensive experience with ETL and analyzing Big Data using Pig Latin and HiveQL
- Hands-on experience with Big Data ingestion tools such as Flume and Sqoop.
- Experience with SequenceFile, Avro, and HAR file formats and compression.
- Experience in tuning and troubleshooting performance issues in Hadoop cluster.
- Strong experience in setting up and working with environments on Amazon AWS EC2 instances.
- Hands-on NoSQL database experience with HBase and Cassandra.
- Good working knowledge of Cloudera Impala.
- Experience in designing, sizing and configuring Hadoop environments
- Worked with application teams to install operating system, Hadoop updates, patches and version upgrades as required.
- Expertise with managing and reviewing Hadoop log files.
- Background with traditional databases such as Oracle, SQL Server, MySQL.
- Good understanding of ETL processes and Data warehousing.
- Experience in Data Mapping, Data Modeling and Data Normalization
- Hands-on experience with IDEs such as Eclipse, NetBeans, and Visual Studio.
- Experience in designing and coding web applications using Core Java and J2EE technologies like JSP, Hibernate, Spring, Struts, Java Beans, Servlets, EJB, RMI, XML and JDBC
- Experience in developing front-end code using HTML, CSS, JavaScript and Ajax frameworks.
- Experience in MVC (Model View Controller) architecture, using Spring, Struts and Hibernate frameworks with various Java/J2EE design patterns.
- Experience in writing ANT and Maven scripts to build and deploy Java applications.
- Experience in collecting business requirements, writing functional requirements and test case documents, and creating technical design documents with UML: use case, class, sequence, and collaboration diagrams.
- Implemented unit testing using JUnit and load testing using LoadRunner during projects.
- Well versed in Object Oriented Programming and Software Development Life Cycle from project definition to post-deployment
- Experience in applying Six Sigma Tools and Techniques for process improvement during different phases of Six Sigma driven projects.
- Up to date on evaluating new analytical tools and projects such as Apache Spark, Apache Shark, Datameer, Platfora, etc.
- Deployed, configured and managed Linux Servers in AWS EC2.
- Experience in writing shell scripts (Bash, sh, Perl)
- Experience in version control systems (SVN, GitHub, etc.) including branching and merging strategies.
- Conversant with web application servers: Tomcat, WebSphere, WebLogic and JBoss.
- Well-organized and efficient leader with demonstrated ability in project management, client management and strategic planning.
- Refined planning and organizational skills that balance work, team support and ad-hoc responsibilities in a timely and professional manner.
- An individual with excellent interpersonal and communication skills, strong business acumen, creative problem solving skills, technical competency, team-player spirit and leadership skills.
- Ability to effectively communicate with all levels of organization such as technical, management and customers.
TECHNICAL SKILLS:
Big Data/Hadoop Framework: HDFS, MapReduce, Pig, Hive, Sqoop, Oozie, ZooKeeper, Flume, HBase and Cassandra
Databases: Oracle 9i/10g, Microsoft SQL Server, MySQL
Languages: Java/J2EE, C#, C++, SQL, Pig Latin, Perl
OpenSource Java Framework: Spring, Struts, Hibernate
Office Tools: Microsoft Office Suite
Operating Systems: Windows XP/7, CentOS, Ubuntu
Web Technologies: JSP, Servlets, JavaBeans, JDBC, XML
Cloud/Virtualization: Amazon AWS, VMware
FrontEnd: HTML/HTML5, CSS3, JavaScript/jQuery, Ajax, Bootstrap
Development Tools: Eclipse, NetBeans, Visual Studio
Development Methodologies: Six Sigma, Agile/Scrum, Waterfall
PROFESSIONAL EXPERIENCE:
Confidential, Bellevue, WA
Hadoop Consultant
Responsibilities:
- Worked on a live 65-node Hadoop cluster running CDH 4.4
- Worked with highly unstructured and semi-structured data of 70 TB in size (210 TB with a replication factor of 3)
- Wrote Pig scripts to transform raw data from several data sources into baseline data.
- Developed Hive scripts for end-user/analyst ad-hoc analysis requirements
- Applied Hive partitioning and bucketing concepts and designed both managed and external tables in Hive for optimized performance
- Solved performance issues in Hive and Pig scripts by understanding joins, grouping, and aggregation and how they translate to MapReduce jobs.
- Worked on tuning Hive and Pig scripts to improve performance.
- Developed UDFs in Java as needed for use in Pig and Hive queries (a minimal sketch follows this list)
- Used SequenceFile, Avro, and HAR file formats.
- Extracted the data from Teradata into HDFS using Sqoop.
- Created Sqoop job with incremental load to populate Hive External tables.
- Developed Oozie workflow for scheduling and orchestrating the ETL process
- Very good experience with both MapReduce 1 (JobTracker) and MapReduce 2 (YARN)
- Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
- Good working knowledge of HBase.
- Used Avro serialization to serialize data.
- Wrote MapReduce jobs in Java and used Pig, Hive, Sqoop, and Flume according to need.
- Involved in data integration in an Informatica ETL environment.
- Monitored and scheduled the Hadoop jobs.
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, manage and review data backups, manage and review Hadoop log files.
- Extracted feeds from social media sites such as Twitter.
- Configured Hadoop system files to accommodate new sources of data and updated the existing Hadoop cluster configuration
- Involved in loading data from UNIX file system to HDFS.
- Involved in gathering business requirements and prepared detailed specifications that follow project guidelines required to develop written programs.
- Actively participated in the code reviews, meetings and solving any technical issues.
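For illustration, below is a minimal sketch of the kind of Java UDF described above, written against the classic org.apache.hadoop.hive.ql.exec.UDF API of this CDH generation; the class name and normalization logic are hypothetical:

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical Hive UDF: trims and lower-cases a string column so that
// raw feeds from different sources can be joined consistently.
public final class NormalizeText extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null; // preserve SQL NULL semantics
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}
```

After packaging the class into a JAR, it would be registered in a Hive session with ADD JAR and CREATE TEMPORARY FUNCTION and then called like any built-in function.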
Environment: Java 7, Eclipse, Oracle 10g, Hadoop, MapReduce, HDFS, Hive, HBase, Oozie, Linux, CDH, SQL, Toad 9.6.
Confidential, Summit, NJ
Hadoop Consultant
Responsibilities:
- Worked on a live 35-node Hadoop production cluster running CDH3
- Worked with highly unstructured and semi-structured data of 25 TB in size
- Good experience in benchmarking Hadoop clusters.
- Implemented Flume (multiplexing) to stream data from upstream pipes into HDFS
- Used Sqoop to import data from a DB2 system into HDFS
- Worked closely with the Enterprise Data Warehouse.
- Developed custom MapReduce programs in Java (see the sketch after this list)
- Designed and developed PIG data transformation scripts to work against unstructured data from various data points and created a base line.
- Worked on creating and optimizing Hive scripts for data analysts based on the requirements.
- Created Hive UDFs to encapsulate complex and reusable logic for the end users.
- Very good experience in working with SequenceFiles and compressed file formats.
- Investigated performance issues in Pig and Hive scripts and tuned them.
- Exported result sets from Hive to MySQL using shell scripts.
- Good experience in troubleshooting performance issues and tuning Hadoop cluster.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Good experience in setting up and configuring clusters in AWS
- Worked with the infrastructure and the admin teams to set up monitoring probes to track the health of the nodes
- Created and maintained Technical documentation for launching Hadoop Clusters and for executing Hive queries and Pig Scripts.
- Wrote Hive Queries and UDFs.
- Worked efficiently with components of the Hadoop ecosystem such as Hive, Pig, Impala, Oozie, Sqoop and ZooKeeper.
- Monitored system health and logs and responded accordingly to any warnings or failure conditions.
- Maintained System integrity of all sub-components (primarily HDFS, MR, HBase and Flume)
- Used SVN for version controlling and configuration management.
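As a sketch of the kind of custom MapReduce program mentioned above, written against the org.apache.hadoop.mapreduce API available in this Hadoop generation; the job name, classes, and field semantics are hypothetical:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical job: counts occurrences of the first field of each
// tab-delimited record, the shape of a typical baseline aggregation.
public class FieldCountJob {

    public static class FieldMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text outKey = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t", -1);
            if (fields.length > 0 && !fields[0].isEmpty()) {
                outKey.set(fields[0]);
                context.write(outKey, ONE);
            }
        }
    }

    public static class SumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "field-count");
        job.setJarByClass(FieldCountJob.class);
        job.setMapperClass(FieldMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Using the reducer as a combiner, as here, is the standard way to cut shuffle traffic when the aggregation is associative and commutative.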
Environment: Java 7, Eclipse, Oracle 10g, Hadoop, MapReduce, HDFS, Hive, HBase, Oozie, Linux, CDH, SQL, Toad 9.6, Teradata, EC2, Spark, UNIX Shell Scripting.
Confidential, North Brook, IL
Java/Hadoop Developer
Responsibilities:
- Worked on a live 50-node Hadoop production cluster running CDH3
- Worked with highly unstructured and semi-structured data of 40 TB in size
- Analyzed Hadoop clusters and other Big Data analytical tools, including Hive, Pig, and databases such as HBase.
- Used Sqoop extensively to ingest data from various source systems into HDFS.
- Wrote Hive queries for data analysis to meet the business requirements
- Created Hive tables and worked on them using HiveQL.
- Installed the cluster and worked on commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration
- Assisted in managing and reviewing Hadoop log files
- Assisted in loading large sets of data (structured, semi-structured, and unstructured)
- Implemented Hadoop cluster on Ubuntu Linux.
- Installed and configured Flume, Sqoop, Pig, Hive, HBase on Hadoop clusters.
- Managed Hadoop clusters, including adding and removing cluster nodes for maintenance and capacity needs.
- Wrote test cases in JUnit for unit testing of classes (a minimal sketch follows this list).
- Developed the application in Eclipse.
- Wrote MapReduce jobs using Java.
- Involved in developing templates and screens in HTML and JavaScript.
- Involved in JMS connection pooling and the implementation of publish and subscribe using Spring JMS.
- Used JmsTemplate to publish and Message Driven POJOs (MDPs) to subscribe from the JMS provider.
- Used Hibernate, an object/relational mapping (ORM) solution, to map the data representation from the MVC model to an Oracle relational data model with a SQL-based schema.
- Developed ANT Scripts for the build process.
- Developed SQL queries and stored procedures using PL/SQL to retrieve data from and insert data into multiple database schemas.
- Performed unit testing using JUnit and load testing using LoadRunner.
- Implemented Log4J to trace logs and to track information.
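A minimal sketch of the JUnit 4 style of unit test referenced above; RecordParser is an illustrative stand-in for a class under test:

```java
import static org.junit.Assert.assertEquals;

import org.junit.Test;

// Hypothetical test case: verifies that a tab-delimited record is split
// into the expected fields.
public class RecordParserTest {

    @Test
    public void parsesTabDelimitedRecord() {
        RecordParser parser = new RecordParser();
        String[] fields = parser.parse("a\tb\tc");
        assertEquals(3, fields.length);
        assertEquals("a", fields[0]);
    }

    // Minimal stand-in for the class under test, so the sketch compiles.
    static class RecordParser {
        String[] parse(String line) {
            return line.split("\t", -1);
        }
    }
}
```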
Environment: Java 7, Eclipse, Oracle 10g, Hadoop, MapReduce, HDFS, Hive, HBase, Flume, Sqoop, Linux, CDH, SQL, Toad 9.6.
Confidential, Charlotte, NC
Java Developer
Responsibilities:
- Involved in complete software development life cycle with Object Oriented approach.
- Interacted with various IT departments (Internal Users) for gathering requirements.
- Responsible for analysis and design, creating use case diagrams, sequence diagrams, and preliminary class diagrams for the system.
- Responsible for modifying the functionalities of some major modules of the application.
- Involved in the development of the controller component using Servlets and the view component using JSP, HTML and JavaScript for client-side validation.
- Developed the login servlet responsible for initial authentication of users.
- Involved in the enhancement of the existing front end GUI.
- Wrote various Struts Action Forms, Struts Action classes, and exception handling.
- Used Hibernate as the ORM to map Java classes to database tables (a minimal sketch follows this list).
- Created use case diagrams, activity diagrams, sequence diagrams and class diagrams
- Responsible for writing JavaScript for client-side validation.
- Used Struts framework to implement the MVC architecture.
- Worked from requirements to translate business rules into business component modules
- Involved in coding of JSP pages for the presentation of data on the View layer in MVC architecture.
- Developed EJBs in WebLogic for handling business processes, database access and asynchronous messaging.
- Wrote Stored Procedures and Triggers using PL/SQL.
- Involved in building and parsing XML documents using SAX parser after retrieving member history data from the database.
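As an illustration of the Hibernate mapping work described above, a minimal annotation-mapped entity; the entity, table, and column names are hypothetical, and the same mapping could equally be expressed in an hbm.xml file:

```java
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;

// Hypothetical entity: maps the Member class to the MEMBER table.
@Entity
@Table(name = "MEMBER")
public class Member {

    @Id
    @Column(name = "MEMBER_ID")
    private Long id;

    @Column(name = "FULL_NAME")
    private String fullName;

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }

    public String getFullName() { return fullName; }
    public void setFullName(String fullName) { this.fullName = fullName; }
}
```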
Environment: Java, JSP, JSTL, EJB, JMS, JavaScript, JSF, XML, JBoss, WebSphere, WebLogic, Hibernate, Spring, SQL, PL/SQL, CSS, Log4j, JUnit, SVN, Eclipse, Oracle 11g, LoadRunner, ANT
Confidential, Paramus, NJ
Java Developer
Responsibilities:
- Involved in complete SDLC with Object Oriented approach.
- Utilized Agile Methodologies to manage full life-cycle development of the project.
- Implemented MVC design pattern using Struts Framework.
- Used Struts Form classes to write the routing logic and call different services.
- Created Tiles definitions, Struts configuration files, validation files and resource bundles for all modules using the Struts framework.
- Developed the web application using JSP custom tag libraries and Struts Action classes (see the sketch after this list).
- Designed Java Servlets and Objects using J2EE standards.
- Used JSP for the presentation layer and developed a high-performance object/relational persistence and query service for the entire application utilizing Hibernate.
- Developed the XML Schema and Web services for the data maintenance and structures.
- Used the Confidential internal framework to parse the data per the XML definition and transform the data into XML format.
- Used the router provided by the framework to route the data to external systems based on the BRE's direction
- Created XML Definition of different kinds of scan message types.
- Used WebSphere Application Server to develop and deploy the application.
- Worked with various style sheets, such as Cascading Style Sheets (CSS).
- Involved in coding for JUnit Test cases and integration testing.
- Designed and developed test plans and test cases.
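For illustration, a minimal Struts 1.x Action and ActionForm of the kind described above; the class names, fields, and forward names are hypothetical, and the real routing logic lived in services:

```java
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.struts.action.Action;
import org.apache.struts.action.ActionForm;
import org.apache.struts.action.ActionForward;
import org.apache.struts.action.ActionMapping;

// Hypothetical Struts 1.x Action: routes the request to a forward
// defined in struts-config.xml based on a simple check.
public class LoginAction extends Action {
    @Override
    public ActionForward execute(ActionMapping mapping, ActionForm form,
                                 HttpServletRequest request,
                                 HttpServletResponse response) {
        LoginForm loginForm = (LoginForm) form;
        boolean ok = loginForm.getUsername() != null
                && !loginForm.getUsername().isEmpty();
        return mapping.findForward(ok ? "success" : "failure");
    }
}

// Matching ActionForm carrying the submitted fields (illustrative).
class LoginForm extends ActionForm {
    private String username;
    private String password;

    public String getUsername() { return username; }
    public void setUsername(String username) { this.username = username; }
    public String getPassword() { return password; }
    public void setPassword(String password) { this.password = password; }
}
```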
Environment: EJB, Struts, XML, UML, DB2, WebLogic, Eclipse, Ant, JUnit, Log4j, JavaScript, PVCS, BRE (Business Rule Engine - IBM)
Confidential, Pleasanton, CA
Java/J2EE Developer
Responsibilities:
- Responsible for gathering all required information and requirements for the project.
- Experienced in Agile programming and accomplishing tasks to meet deadlines.
- Used Ajax and JavaScript to handle asynchronous requests and CSS to handle the look and feel of the application.
- Involved in the design of class diagrams, sequence diagrams and event diagrams as part of the documentation.
- Developed the presentation layer using CSS and HTML taken from Bootstrap to support multiple browsers, including mobile and tablet browsers.
- Extended standard action classes provided by the Struts framework for appropriately handling client requests.
- Monitored and scheduled the UNIX scripting jobs.
- Designed, developed and maintained data integration programs in an RDBMS environment with traditional source systems.
- Experienced working in an ETL/data warehousing environment (Informatica PowerCenter)
- Configured Struts Tiles for reusing view components, an application of the J2EE Composite View pattern.
- Involved in the integration of Struts and Spring 2.0 for implementing Dependency Injection (DI/IoC). Developed code for obtaining beans in the Spring IoC framework.
- Developed the application in Eclipse.
- Involved in the implementation of beans in the application.
- Migrated Informatica ETL code using team-based versioning.
- Hands-on experience in web services, distributed computing, multi-threading, JMS, etc.
- Implemented cross-cutting concerns as aspects at the service layer using Spring AOP.
- Involved in the implementation of DAO objects using Spring ORM.
- Involved in JMS connection pooling and the implementation of publish and subscribe using Spring JMS. Used JmsTemplate to publish and Message Driven POJOs (MDPs) to subscribe from the JMS provider (a minimal sketch follows this list).
- Involved in creating the Hibernate POJOs and developed Hibernate mapping files.
- Used Hibernate, an object/relational mapping (ORM) solution, to map the data representation from the MVC model to an Oracle relational data model with a SQL-based schema.
- Developed SQL queries and stored procedures using PL/SQL to retrieve data from and insert data into multiple database schemas.
- Developed Ant Scripts for the build process.
- Version control was mandated through SVN (Subversion).
- Performed unit testing using JUnit and load testing using LoadRunner.
- Implemented Log4j to trace logs and to track information.
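As a sketch of the Message Driven POJO pattern described above: a plain listener class that Spring's DefaultMessageListenerContainer would invoke for each incoming message; the class name and payload handling are hypothetical:

```java
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.TextMessage;

// Hypothetical listener: receives text messages from a queue and hands
// the payload to the service layer.
public class OrderMessageListener implements MessageListener {

    @Override
    public void onMessage(Message message) {
        try {
            if (message instanceof TextMessage) {
                String body = ((TextMessage) message).getText();
                // Delegate to the service layer here (illustrative).
                System.out.println("Received: " + body);
            }
        } catch (JMSException e) {
            throw new RuntimeException("Failed to read JMS message", e);
        }
    }
}
```

On the publishing side, JmsTemplate's convertAndSend(destination, payload) would put the message on the queue, with the listener container and template wired up in the Spring application context.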
Environment: Java, Struts, JSP, JSTL, JSON, JavaScript, JSF, POJOs, Hibernate, Spring, Teradata, PL/SQL, CSS, Log4j, JUnit, Subversion, Informatica, Eclipse, Netezza, Jenkins, Git, Oracle 11g, LoadRunner, ANT.
Confidential, Richmond, VA
Software Developer
Responsibilities:
- Created a design document by using architecture diagrams, flow diagrams, class diagrams and sequence diagrams.
- Performed requirement analysis and improvement using JAD sessions
- Designed and developed a WSDL along with the service endpoint in accordance with the requirements
- Developed Stateless Session Beans (EJBs) for exposing the methods as web services.
- Used WebLogic Web Services to generate web service stubs and clients
- Used JAXB for marshalling and unmarshalling the request and response XMLs from web services for handling faults and login authentication (a minimal sketch follows this list).
- Used SoapUI and the WebLogic Test Client for sending request XMLs and receiving response XMLs.
- Coordinated with DBA for developing SQL queries for the application
- Developed build.xml using Apache Ant to automate the software build process
- Developed the detailed design using the design patterns Session Facade, Business Delegate, DAO and DTO.
- Used Spring Framework for dependency injection
- Used Oracle for data storage.
- Developed MVC web interface using Spring MVC
- Used Hibernate for data persistence
- Developed complex HQL queries
- Created HTML files using XSLT
- Integrated LDAP with the web services and web module.
- Developed background processes suitable for a clustered server farm to provide email and FTP services
- Developed JSP for the web interface
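For illustration, a minimal sketch of the JAXB round trip described above; LoginRequest is a hypothetical payload type:

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.bind.JAXBContext;
import javax.xml.bind.annotation.XmlRootElement;

// Hypothetical payload type: marshalled to XML for the request and
// unmarshalled back from the response.
@XmlRootElement(name = "loginRequest")
public class LoginRequest {
    private String username;

    public String getUsername() { return username; }
    public void setUsername(String username) { this.username = username; }

    public static void main(String[] args) throws Exception {
        JAXBContext context = JAXBContext.newInstance(LoginRequest.class);

        // Marshal a request object to XML.
        LoginRequest request = new LoginRequest();
        request.setUsername("jdoe");
        StringWriter xml = new StringWriter();
        context.createMarshaller().marshal(request, xml);

        // Unmarshal the XML back into an object.
        LoginRequest parsed = (LoginRequest) context.createUnmarshaller()
                .unmarshal(new StringReader(xml.toString()));
        System.out.println(parsed.getUsername());
    }
}
```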
Environment: Java, J2EE, JDK 1.6, RUP, UML, JAD, SQL Developer, Weblogic Application Server 11.1, Spring, EJB, Struts, Hibernate, Oracle 11g, PL/SQL, XML, XML Schemas, SVN, JSP, XML SPY, ANT, XSLT, Windows 7.