
Hadoop Administrator Resume


New York, New York

PROFESSIONAL SUMMARY:

  • Over 9 years of IT experience as a developer, designer, and quality tester, with cross-platform integration experience spanning the Hadoop ecosystem, Java, and software functional testing.
  • Hands-on experience installing, configuring, and using Hadoop ecosystem components - HDFS, MapReduce, Pig, Hive, Oozie, Flume, HBase, Spark, and Sqoop.
  • Hands-on experience using the Cloudera and Hortonworks Hadoop distributions.
  • Strong understanding of various Hadoop services, MapReduce and YARN architecture.
  • Responsible for writing MapReduce programs.
  • Experienced in importing and exporting data to and from HDFS using Sqoop.
  • Experience loading data to Hive partitions and creating buckets in Hive.
  • Developed MapReduce jobs to automate data transfer from HBase.
  • Expertise in analysis using Pig, Hive, and MapReduce.
  • Experienced in developing UDFs for Hive and Pig using Java.
  • Strong understanding of NoSQL databases like HBase, MongoDB & Cassandra.
  • Scheduled Hadoop, Hive, Sqoop, and HBase jobs using Oozie.
  • Experience setting up clusters on Amazon EC2 and S3, including automating cluster setup and scaling in the AWS cloud.
  • Good understanding of Scrum methodologies, Test Driven Development and continuous integration. 
  • Major strengths: familiarity with multiple software systems; the ability to learn new technologies quickly and adapt to new environments; and excellent interpersonal, technical, and communication skills as a self-motivated, focused team player and quick learner.
  • Experience in defining detailed application software test plans, including organization, participant, schedule, test and application coverage scope.
  • Experience in gathering and defining functional and user interface requirements for software applications.
  • Experience in real-time analytics with Apache Spark (RDD, DataFrames, and Streaming API).
  • Used the Spark DataFrames API on the Cloudera platform to perform analytics on Hive data (see the sketch at the end of this summary).
  • Experience integrating Hadoop with Kafka, including loading clickstream data from Kafka into HDFS.
  • Expert in using Kafka as a publish-subscribe messaging system.
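
As an illustration of the Spark DataFrame work above, here is a minimal Scala sketch of reading a Hive table, aggregating it, and writing the result back as a partitioned Hive table. The table and column names are hypothetical, and it assumes a Spark 2.x build with Hive support and hive-site.xml on the classpath:

    import org.apache.spark.sql.SparkSession

    object ClickstreamReport {
      def main(args: Array[String]): Unit = {
        // Hive-enabled session; assumes hive-site.xml is available.
        val spark = SparkSession.builder()
          .appName("ClickstreamReport")
          .enableHiveSupport()
          .getOrCreate()

        // Read an existing Hive table as a DataFrame (names are illustrative).
        val clicks = spark.table("web.clickstream")

        // Simple analytic: daily event counts per page.
        val daily = clicks.groupBy("event_date", "page").count()

        // Write the result to a Hive table partitioned by date for reporting.
        daily.write.mode("overwrite")
          .partitionBy("event_date")
          .saveAsTable("web.clickstream_daily")

        spark.stop()
      }
    }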

TECHNICAL SKILLS:

Hadoop/Big Data: Hadoop, MapReduce, HDFS, ZooKeeper, Kafka, Hive, Pig, Sqoop, Oozie, Flume, YARN, HBase, Spark with Scala.

NoSQL Databases: HBase, Cassandra, MongoDB

Languages: C, C++, Java, J2EE, PL/SQL, Pig Latin, HiveQL, UNIX shell scripts

Java/J2EE Technologies: Applets, Swing, JDBC, JNDI, JSON, JSTL

Frameworks: MVC, Struts, Spring, Hibernate

Operating Systems: Red Hat Linux, Ubuntu Linux and Windows XP/Vista/7/8

Web Technologies: HTML, DHTML, XML

Web/Application servers: Apache Tomcat, WebLogic, JBoss

Databases: Oracle 9i/10g/11g, DB2, SQL Server, MySQL, Teradata

Tools and IDEs: Eclipse, IntelliJ.

PROFESSIONAL EXPERIENCE:

Confidential, New York, New York

Hadoop Administrator 

Responsibilities:

  • Worked on analyzing the Hadoop cluster and different big data analytic tools, including Pig, Hive, and Sqoop.
  • Created a POC on Hortonworks and suggested best practices for the HDP and HDF platforms.
  • Set up Hortonworks infrastructure, from configuring clusters down to individual nodes.
  • Installed the Ambari server in the cloud.
  • Set up security using Kerberos and Active Directory on Hortonworks and Cloudera CDH clusters.
  • Assigned access to users and managed multi-user logins.
  • Installed and configured the CDH cluster, using Cloudera Manager for easy management of the existing Hadoop cluster.
  • Installed, configured, and maintained Apache Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, ZooKeeper, and Sqoop.
  • Extensively used Cloudera Manager for managing multiple clusters with petabytes of data.
  • Knowledgeable in documenting processes, drawing server diagrams, and preparing server requisition documents.
  • Set up machines with network controls, static IPs, disabled firewalls, and swap memory.
  • Managed cluster configuration to meet the needs of analysis workloads, whether I/O-bound or CPU-bound.
  • Worked on setting up high availability for the major production cluster. Performed Hadoop version updates using automation tools.
  • Worked on setting up a 100-node production cluster and a 40-node backup cluster at two different data centers.
  • Automated operations, installation, and monitoring of the Hadoop framework, specifically HDFS, MapReduce, YARN, and HBase.
  • Automated the setup of Hadoop clusters and the creation of nodes.
  • Monitored CPU utilization and maintained the improvements.
  • Performance-tuned and managed growth of the OS, disk usage, and network traffic.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Involved in loading data from the Linux file system to HDFS.
  • Performed architecture design, data modeling, and implementation of the Big Data platform and analytic applications for consumer products.
  • Analyzed the latest Big Data analytic technologies and their innovative applications in both business intelligence analysis and new service offerings.
  • Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration.
  • Implemented test scripts to support test driven development and continuous integration.
  • Worked on tuning the performance of MapReduce jobs.
  • Responsible for managing data coming from different sources.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Experience in managing and reviewing Hadoop log files.
  • Job management using Fair scheduler.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Used Pig built-in functions to convert fixed-width files to delimited files.
  • Worked on tuning Hive and Pig scripts to improve performance and resolve performance-related issues, with a good understanding of joins, grouping, and aggregation and how they translate into MapReduce jobs.
  • Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, manage and review data backups, manage and review Hadoop log files.
  • Created Oozie workflows to run multiple MapReduce, Hive, and Pig jobs.
  • Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
  • Developed Spark code using Scala and Spark SQL for faster testing and data processing.
  • Involved in developing a Spark Streaming application for one of the data sources using Scala and Spark, applying the required transformations (see the sketch at the end of this section).
  • Imported data from different sources such as HDFS and MySQL into Spark RDDs.
  • Experienced with SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
  • Involved in requirements gathering, design, development, and testing in Scala.
  • Developed traits, case classes, etc., in Scala.
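
A minimal sketch of the kind of Spark Streaming application described above, written in Scala. The socket source, the filter predicate, and the HDFS path are hypothetical stand-ins for the actual feed and transformations:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object LogStream {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("LogStream")
        // 30-second micro-batches (interval is illustrative).
        val ssc = new StreamingContext(conf, Seconds(30))

        // A socket source stands in for the real data source.
        val lines = ssc.socketTextStream("localhost", 9999)

        // Transformations: keep error lines and count them per batch.
        val errors = lines.filter(_.contains("ERROR"))
        errors.count().print()

        // Persist each batch to HDFS as text (path is hypothetical).
        errors.saveAsTextFiles("hdfs:///data/streams/errors")

        ssc.start()
        ssc.awaitTermination()
      }
    }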

Environment: Hadoop, HDFS, Pig, Sqoop, Shell Scripting, Ubuntu, Red Hat Linux, Spark, Scala, Hortonworks, Cloudera Manager.

Confidential, New York, New York

Hadoop Administrator

Responsibilities:

  • Launched Amazon EC2 cloud instances using Amazon Web Services (Linux/Ubuntu/RHEL) and configured the launched instances for specific applications.
  • Installed applications on AWS EC2 instances and configured storage in S3 buckets.
  • Created S3 buckets and bucket policies, worked on IAM role-based policies, and customized the JSON templates.
  • Implemented and maintained monitoring and alerting of production and corporate servers/storage using AWS CloudWatch.
  • Managed server instances on the Amazon Web Services (AWS) platform using Puppet and Chef configuration management.
  • Developed PIG scripts to transform the raw data into intelligent data as specified by business users.
  • Worked in AWS environment for development and deployment of Custom Hadoop Applications.
  • Worked closely with the data modelers to model the new incoming data sets.
  • Involved in the end-to-end process of Hadoop jobs that used various technologies such as Sqoop, Pig, Hive, MapReduce, Spark, and shell scripts (for scheduling a few jobs).
  • Expertise in designing and deploying Hadoop clusters and different big data analytic tools, including Pig, Hive, Oozie, ZooKeeper, Sqoop, Flume, Spark, Impala, and Cassandra, with the Hortonworks distribution.
  • Installed Hadoop, MapReduce, and HDFS on AWS, and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
  • Involved in creating Hive and Pig tables, loading data, and writing Hive queries and Pig scripts.
  • Assisted in upgrading, configuring, and maintaining various Hadoop infrastructure components such as Pig, Hive, and HBase.
  • Explored Spark for improving the performance and optimization of existing algorithms in Hadoop, using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data. Configured, deployed, and maintained multi-node Dev and Test Kafka clusters.
  • Performed transformations, cleaning and filtering on imported data using Hive, Map Reduce, and loaded final data into HDFS.
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output.
  • Experience with the Oozie workflow scheduler, managing Hadoop jobs as directed acyclic graphs (DAGs) of actions with control flows.
  • Worked on tuning Hive and Pig scripts to improve performance and resolve performance-related issues, with a good understanding of joins, grouping, and aggregation and how they translate into MapReduce jobs.
  • Imported data from different sources such as HDFS and HBase into Spark RDDs.
  • Developed a data pipeline using Kafka and Storm to store data into HDFS.
  • Performed real time analysis on the incoming data.
  • Used Spark Streaming to divide streaming data into batches as an input to Spark engine for batch processing. 
  • Implemented Spark jobs using Scala and Spark SQL for faster testing and processing of data.
  • Used Apache Kafka for importing real-time network log data into HDFS (see the Kafka-to-HDFS sketch at the end of this section).
  • Developed business-specific custom UDFs in Hive and Pig.
  • Configured Oozie workflow to run multiple Hive and Pig jobs which run independently with time and data availability.
  • Optimized Map Reduce code by writing Pig Latin scripts.
  • Imported data from external tables into Hive using the LOAD command.
  • Created tables in Hive and used static and dynamic partitions as a data-slicing mechanism (see the partitioning sketch below).
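
The static and dynamic partitioning mentioned above can be sketched as follows. The original work used Hive directly; to keep all examples here in one language, this sketch drives the same HiveQL through Spark's Hive support, and the table and column names (including the staging_sales source table) are hypothetical:

    import org.apache.spark.sql.SparkSession

    object PartitionedLoad {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("PartitionedLoad")
          .enableHiveSupport()
          .getOrCreate()

        // Dynamic partitioning must be enabled before inserting.
        spark.sql("SET hive.exec.dynamic.partition = true")
        spark.sql("SET hive.exec.dynamic.partition.mode = nonstrict")

        spark.sql("""
          CREATE TABLE IF NOT EXISTS sales_by_day (
            order_id BIGINT, amount DOUBLE
          ) PARTITIONED BY (order_date STRING)
        """)

        // Dynamic partition insert: Hive routes each row to its
        // order_date partition (staging_sales is assumed to exist).
        spark.sql("""
          INSERT OVERWRITE TABLE sales_by_day PARTITION (order_date)
          SELECT order_id, amount, order_date FROM staging_sales
        """)

        spark.stop()
      }
    }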

Environment: Apache Hadoop, HDFS, MapReduce, Sqoop, Flume, Pig, Hive, HBase, Oozie, Scala, Spark, Spark Streaming, Kafka, Linux.
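
A sketch of the Kafka-to-HDFS loading mentioned in this section, using Spark Streaming's Kafka 0-10 direct stream (the production pipeline also involved Storm; this shows only a Spark-side path). The broker, topic, group id, and target path are hypothetical:

    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka010.KafkaUtils
    import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
    import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

    object KafkaToHdfs {
      def main(args: Array[String]): Unit = {
        val ssc = new StreamingContext(
          new SparkConf().setAppName("KafkaToHdfs"), Seconds(60))

        // Kafka connection details are illustrative.
        val kafkaParams = Map[String, Object](
          "bootstrap.servers" -> "broker1:9092",
          "key.deserializer" -> classOf[StringDeserializer],
          "value.deserializer" -> classOf[StringDeserializer],
          "group.id" -> "netlog-loader",
          "auto.offset.reset" -> "latest"
        )

        val stream = KafkaUtils.createDirectStream[String, String](
          ssc, PreferConsistent,
          Subscribe[String, String](Seq("network-logs"), kafkaParams))

        // Land each micro-batch of raw log lines in HDFS (path is hypothetical).
        stream.map(_.value).saveAsTextFiles("hdfs:///data/raw/network-logs")

        ssc.start()
        ssc.awaitTermination()
      }
    }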

Confidential, New York, New York

Hadoop Developer/ Administrator

Responsibilities:

  • Responsible for the implementation and ongoing administration of Hadoop infrastructure, including its initial setup.
  • Cluster maintenance, as well as creation and removal of nodes.
  • Evaluated Hadoop infrastructure requirements and designed/deployed solutions (high availability, big data clusters).
  • Cluster monitoring and troubleshooting of Hadoop issues.
  • Managed and reviewed Hadoop log files.
  • Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
  • Created NRF documents explaining the flow of the architecture and covering performance, security, memory usage, and dependencies.
  • Setting up Linux users, setting up Kerberos principals and testing HDFS, Hive, Pig and MapReduce access for the new users.
  • Helped maintain and troubleshoot the UNIX and Linux environments.
  • Experience analyzing and evaluating system security threats and safeguards.
  • Experience in Importing and exporting data into HDFS and Hive using Sqoop.
  • Developed Pig programs for loading and filtering the streaming data into HDFS using Flume.
  • Experienced in handling data from different datasets, joining and preprocessing them using Pig join operations.
  • Developed MapReduce programs to clean and aggregate the data.
  • Developed an HBase data model on top of HDFS data to perform real-time analytics using the Java API.
  • Developed different kinds of custom filters and handled predefined filters on HBase data using the API.
  • Imported and exported data from Teradata to HDFS and vice-versa.
  • Strong understanding of the Hadoop ecosystem, including HDFS, MapReduce, HBase, ZooKeeper, Pig, Hadoop streaming, Sqoop, Oozie, and Hive.
  • Implemented counters on HBase data to count total records in different tables (see the row-count sketch at the end of this section).
  • Experienced in handling Avro data files by passing the schema into HDFS using Avro tools and MapReduce.
  • Worked on custom Pig Loaders and Storage classes to work with a variety of data formats such as JSON, Compressed CSV, etc.
  • Used Amazon Web Services to perform big data analytics.
  • Implemented secondary sorting to sort reducer output globally in MapReduce.
  • Implemented a data pipeline by chaining multiple mappers using ChainMapper.
  • Created Hive dynamic partitions to load time-series data.
  • Experienced in handling different types of joins in Hive, such as map joins, bucket map joins, and sorted bucket map joins.
  • Created tables, partitions, and buckets, and performed analytics using Hive ad-hoc queries.
  • Experienced in importing/exporting data between HDFS/Hive and relational databases and Teradata using Sqoop.
  • Handled continuous streaming data coming from different sources using Flume, with HDFS set as the destination.
  • Integrated Spring schedulers with the Oozie client as beans to handle cron jobs.
  • Experience with CDH distribution and Cloudera Manager to manage and monitor Hadoop clusters
  • Actively participated in software development lifecycle (scope, design, implement, deploy, test), including design and code reviews.
  • Involved in story-driven agile development methodology and actively participated in daily scrum meetings.
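
The HBase record counting mentioned above (see the bullet on counters) can be sketched with the HBase client API in Scala. The table name is hypothetical; for very large tables the bundled RowCounter MapReduce job is the usual tool, so this is only a small-table sketch:

    import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
    import org.apache.hadoop.hbase.client.{ConnectionFactory, Scan}
    import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter
    import scala.collection.JavaConverters._

    object HBaseRowCount {
      def main(args: Array[String]): Unit = {
        val conf = HBaseConfiguration.create() // picks up hbase-site.xml
        val connection = ConnectionFactory.createConnection(conf)
        try {
          // Table name is illustrative.
          val table = connection.getTable(TableName.valueOf("user_events"))

          // FirstKeyOnlyFilter returns one cell per row, enough to count rows.
          val scan = new Scan()
          scan.setFilter(new FirstKeyOnlyFilter())

          val scanner = table.getScanner(scan)
          val total = scanner.iterator().asScala.size
          scanner.close()
          println(s"user_events rows: $total")
        } finally {
          connection.close()
        }
      }
    }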

Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Sqoop, RDBMS/DB, flat files, Teradata, MySQL, CSV, Avro data files.

Confidential, New York, New York

SDET

Responsibilities:

  • Involved in almost all the phases of SDLC.
  • Executed test cases manually and logged defects using ClearQuest.
  • Automated the functionality and interface testing of the application using Quick Test Professional (QTP)
  • Designed, developed, and maintained an automation framework (hybrid framework).
  • Analyzed the requirements and prepared automation script scenarios.
  • Developed test data for regression testing using QTP.
  • Wrote test cases in IBM Rational Manual Tester.
  • Conducted cross-browser testing on different platforms.
  • Client application testing, and web-based application performance, stress, volume, and load testing of the system using LoadRunner 9.5.
  • Analyzed performance of the application program itself under various test loads of many simultaneous Vusers.
  • Analyzed the impact on server performance CPU usage, server memory usage for the applications of varied numbers of multiple, simultaneous users.
  • Inserted Transactions and Rendezvous points into Web Vusers
  • Created Vuser Scripts using VuGen and used Controller to generate and executed Load Runner Scenarios
  • Fully involved in requirement analysis and documentation of the requirement specification.
  • Prepared use-case diagrams, class diagrams and sequence diagrams as part of requirement specification documentation.
  • Involved in design of the core implementation logic using MVC architecture.
  • Used Apache Maven to build and configure the application.
  • Developed JAX-WS web services to provide services to the other systems.
  • Developed JAX-WS clients to utilize a few of the services provided by the other systems.
  • Involved in developing EJB 3.0 stateless session beans in the business tier to expose business services to the services component as well as the web tier.
  • Implemented Hibernate at the DAO layer by configuring the Hibernate configuration file for different databases.
  • Developed business services to utilize Hibernate service classes that connect to the database and perform the required action.
  • Developed JavaScript validations to validate form fields.
  • Performed unit testing of the developed code using JUnit (see the sketch at the end of this section).
  • Developed design documents for the code developed.
  • Used SVN repository for version control of the developed code.
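
The JUnit work above was against Java code; as a minimal sketch, and in Scala to match the other examples in this document, a JUnit 4 test of a hypothetical form-field validator might look like this:

    import org.junit.Assert.{assertFalse, assertTrue}
    import org.junit.Test

    // Hypothetical validator standing in for the real form-field logic.
    object FieldValidator {
      def isValidZip(zip: String): Boolean = zip.matches("""\d{5}(-\d{4})?""")
    }

    class FieldValidatorTest {
      @Test def acceptsFiveDigitZip(): Unit =
        assertTrue(FieldValidator.isValidZip("10001"))

      @Test def rejectsMalformedZip(): Unit =
        assertFalse(FieldValidator.isValidZip("1000A"))
    }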

Environment: SQL, Oracle 10g, Apache Tomcat, HP LoadRunner, IBM Rational Robot, ClearQuest, Java, J2EE, HTML, DHTML, XML, JavaScript, Eclipse, WebLogic, PL/SQL, and Oracle.

Confidential, New York, New York

SDET

Responsibilities:

  • Attended requirement meetings with business analysts and business users.
  • Analyzed requirements and use cases, and performed ambiguity reviews of business requirements and functional specification documents.
  • Automated the functionality and interface testing of the application using Quick Test Professional (QTP)
  • Performed database testing using ODBC by Automation Test scripts
  • Created a driver script using VBScript to launch QTP automatically and run a number of automated scripts simultaneously.
  • Developed a functional library for the DORS application.
  • Designed, developed, and maintained an automation framework (hybrid framework).
  • Analyzed the requirements and prepared automation script scenarios.
  • Prepared the test plan for DORS (Distribution Online Request System) for manual testing over different releases, covering GUI testing, functional testing, integration testing, regression testing, interface testing, end-to-end testing, and user acceptance testing.
  • Created an automation test plan/strategy.
  • Developed MapReduce programs to perform data filtering for unstructured data.
  • Designed the application by implementing the Struts framework based on MVC architecture.
  • Designed and developed the front end using JSP, HTML, JavaScript, and jQuery.
  • Developed a framework for data processing using design patterns, Java, and XML.
  • Implemented J2EE standards and MVC2 architecture using the Struts framework.
  • Implemented servlets, JSP, and Ajax to design the user interface.
  • Used Spring IoC for dependency injection with the Hibernate and Spring frameworks.
  • Developed EJB components deployed on WebLogic Application Server.
  • Wrote unit tests using the JUnit framework; logging was done with the Log4j framework.
  • Used HTML, CSS, JavaScript, and jQuery to develop front-end pages.
  • Designed and developed various configuration files for Hibernate mappings.
  • Designed and Developed SQL queries and Stored Procedures.
  • Used XML, XSLT, and XPath to extract data from the web services' output XML.

Environment: HP LoadRunner, HP Quality Center, HP QuickTest Professional, Maven, Windows XP, J2EE, JSP, JDBC, Hibernate, Spring, HTML, XML, CSS, JavaScript, and jQuery.

Confidential, Chevy Chase, Maryland

Software Automation Test Engineer

Responsibilities:

  • Executed test cases manually and logged defects using HP Quality Center.
  • Responsibilities included Manual GUI Testing, Functional Testing, Integration Testing, Regression Testing, Interface Testing, End-to-End Testing, Database Testing and User Acceptance Testing.
  • Client Application Testing, Web based Application Performance, Stress, Volume and Load testing of the system.
  • Analyzed performance of the application program itself under various test loads of many simultaneous Vusers.
  • Analyzed the impact on server performance CPU usage, server memory usage for the applications of varied numbers of multiple, simultaneous users.
  • Inserted Transactions and Rendezvous points into Web Vusers.
  • Created Vuser Scripts using VuGen and used Controller to generate and executed Load Runner Scenarios.
  • Connected multiple load generators with the Controller to support additional Vusers.
  • Created scripts to enable the Controller to measure the performance of Web server under various load conditions.
  • Automated the functionality and interface testing of the application using Quick Test Professional (QTP).
  • Inserted object data verification checkpoints using the Quick Test Professional (QTP) automation testing tool.
  • Verified back-end data using ODBC after interacting with front-end automation test scripts.

Environment: Windows Server 2003, 2005, Java, JavaScript, HTML, UNIX, SQL, Oracle 10g, TOAD, IIS, HP LoadRunner, HP Quality Center, QTP.

Confidential, Chevy Chase, Maryland

Software Test Analyst

Responsibilities:

  • Involved in preparing Test Plan and Test cases.
  • Developed test cases for automation team for regression testing.
  • Formulated methods to perform positive and negative testing against requirements.
  • Performed backend testing using SQL queries.
  • Reported bugs found during test using Quality Center.
  • Conducted functional, regression, black box and system testing.
  • Reviewed functional design for internal product documentation.
  • Used Quality Center for requirements management, planning, scheduling, running tests, defect tracking, and managing defects.
  • Analyzed, tested, and certified application-specific software and performed ambiguity reviews of business requirements and functional specification documents.
  • Developed Manual Test cases and test scripts to test the functionality of the application.
  • Provided test results, graphs, and analysis of application performance data by email or phone during testing to the application developer and manager.
  • Implemented automated-testing methodologies such as data-driven testing and keyword-driven testing.
  • Created and executed regression scripts using Quick Test Professional.
  • Inserted various check points, parameterized the test scripts, and performed regular expression on scripts.
  • Documented tests bug in Quality Center.

Environment: Java, JavaScript, HTML, UNIX, SQL, TOAD, Oracle, WebLogic, Quick Test Professional, LoadRunner, Quality Center.

Confidential, New York, New York

Manual Tester

Responsibilities:

  • Analyzed the business requirements and involved in the review discussions.
  • Participated in high level design sessions.
  • Participated in the QA activities for various releases of the Project.
  • Performed System and Integration Testing.
  • Drafted test cases based on Functional Specifications and System Specifications.
  • Prepared the test plan and analyzed integration system impacts.
  • Involved in manual testing of the application for negative and positive scenarios.
  • Trained team members on the new business functionality in the BRD.
  • Performed regression testing to ensure that bugs had been fixed and the application was running properly.
  • Extensively involved in executing, analyzing and verifying test results and worked with developers to resolve issues.
  • Communicated project business issues to appropriate business leadership groups.
  • Responsible for the Object Repository; maintained it in the central repository and made changes as new updates were developed.
  • Wrote SQL statements to extract Data and verified the output Data of the reports.
  • Prepared Requirement Traceability Matrix (RTM) to establish traceability between requirements and test cases.
  • Modified and maintained test cases due to changes in the requirements.
  • Detected, reported, and classified bugs in TestDirector.
  • Used TestDirector for managing test execution and defect tracking of all issues.
  • Conducted internal and external reviews as well as formal walkthroughs, and participated in status meetings.

Environment: Windows, SQL Server, Oracle, TOAD, Visual Basic, Win Runner, and Test Director.
