Hadoop Administrator Resume
SUMMARY:
- 7 years of overall experience in Systems Administration and Application Administration across diverse industries.
- 3+ years of comprehensive experience as a Hadoop Administrator.
- Good understanding of Hadoop architecture and its components, including HDFS, MapReduce, YARN, JobTracker, TaskTracker, NameNode, and DataNode, as well as MapReduce programming.
- Hands-on experience installing, configuring, and using ecosystem components such as MapReduce, Hive, Sqoop, Pig, HDFS, HBase, ZooKeeper, Oozie, and Flume.
- Experience managing scalable Hadoop clusters, including cluster design, provisioning, custom configuration, monitoring, and maintenance across multiple distributions: Cloudera CDH, Apache Hadoop, and Hortonworks.
- Well versed in installing, upgrading, and managing Apache Hadoop and Cloudera (CDH4) distributions.
- Applied Hadoop patches and performed major and minor version upgrades on Apache, Cloudera, and Hortonworks distributions.
- Monitored workload and job performance and collected cluster metrics as needed using Ganglia, Nagios, and Cloudera Manager.
- Advanced knowledge of troubleshooting operational issues and finding root causes in Hadoop cluster management.
- Analyzed large data sets using Hive queries and Pig Scripts.
- Experience importing and exporting data between HDFS and relational database systems using Sqoop.
- Experienced in extending Hive and Pig core functionality by writing custom UDFs using Java.
- Experience in Data Warehousing and ETL processes.
- Experienced with job workflow scheduling using Oozie and cluster coordination using ZooKeeper.
- Performed tuning at multiple levels, optimizing MapReduce programs that are written once and executed repeatedly over large datasets.
- Experience designing and building highly available (HA), fault-tolerant (FT) distributed systems.
- Experience in developing ETL process using Hive, Pig, Sqoop and Map-Reduce Framework.
- Experience in troubleshooting, root-cause analysis, debugging, and automating solutions for operational issues in production environments.
- Ability to learn and adapt quickly and to correctly apply new tools and technology.
- Coordinated with software development teams on project implementation, analysis, technical support, data conversion, and deployment.
- Worked extensively on Hardware capacity management and procurement process.
- Skilled technologist with deep expertise in aligning solutions against strategic roadmaps to support business goals.
- Expert communicator, influencer, and motivator who speaks the language of business and technology, resulting in excellent rapport with fellow employees, peers and executive management.
- Quick learner with strong problem-solving, communication, and presentation skills; effective both individually and as a team player.
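For illustration, the Sqoop import/export work noted above might be invoked along these lines. Host, credentials, table, and paths are hypothetical placeholders, and the command is echoed here as a dry run rather than executed:

```shell
#!/bin/sh
# Sketch of a Sqoop import from a relational database into HDFS.
# Connection URL, table, and target directory are illustrative placeholders.
JDBC_URL="jdbc:mysql://db-host:3306/sales"
SRC_TABLE="orders"
TARGET_DIR="/user/etl/orders"

# Echoed as a dry run; remove the echo to execute against a live cluster.
echo sqoop import \
  --connect "$JDBC_URL" \
  --username etl_user -P \
  --table "$SRC_TABLE" \
  --target-dir "$TARGET_DIR" \
  --num-mappers 4
```

The matching export back to an RDBMS uses `sqoop export` with `--export-dir` in place of `--target-dir`.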
WORK EXPERIENCE:
Hadoop Administrator
Confidential
Responsibilities:
- Implemented NameNode High Availability on the Hadoop cluster to eliminate the single point of failure.
- Installed Cloudera Manager on an already existing Hadoop cluster.
- Collected and aggregated large volumes of streaming log data into the Hadoop cluster using Apache Flume.
- Responsible for ongoing maintenance, expansion, and improvement of a cross-regional ESX infrastructure supporting over 400 virtual servers, as well as offshore desktop systems.
- Worked closely with the storage team to analyze performance data and used it to plan and deploy methods for maximizing performance and reliability.
- Analyzed data stored in HDFS using Hive to study user behavior and usage patterns.
- Translated existing SQL queries into HiveQL.
- Responsible for designing and implementing an ETL process to load data from different sources, perform data mining, and analyze data using visualization/reporting tools to leverage the performance of OpenStack.
- Exported the analysis results mined from large volumes of data to MySQL using Sqoop.
- Developed custom MapReduce programs and custom User Defined Functions (UDFs) in Hive to transform large volumes of data per business requirements.
- Involved in installing and configuring Kerberos to implement security to the Hadoop cluster and providing authentication for users.
- Worked on installation of DataStax Cassandra cluster.
- Installing and monitoring the Hadoop cluster resources using Ganglia and Nagios.
- Worked with Big Data analysts, designers, and scientists to troubleshoot MapReduce job failures and issues with Hive, Pig, Flume, Apache Spark, and Sentry.
- Managed extracting, loading, and transforming data in and out of Hadoop using Hive and Impala.
- Rebuilt the existing Nagios system to integrate tightly with Puppet for automatic monitoring configuration.
- Collaborated closely with engineers to track down issues causing segfaults and performance degradations in our environment.
- Instituted a change management process using Git and Puppet; added validation and automated problem reporting.
- Administered Hue, the open-source web-based interface for interacting with Hadoop services.
- Utilized Apache Spark for interactive data mining and data processing.
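A rollup of the kind used for the user-behavior analysis above might look like the following sketch. The table and column names (`web_logs`, `user_id`, `dt`) are hypothetical, and the query is printed as a dry run:

```shell
#!/bin/sh
# Sketch of a HiveQL user-behavior rollup over logs stored in HDFS.
# Table and column names (web_logs, user_id, dt) are illustrative placeholders.
HQL='
SELECT user_id,
       COUNT(*)           AS events,
       COUNT(DISTINCT dt) AS active_days
FROM web_logs
GROUP BY user_id
ORDER BY events DESC
LIMIT 100;
'
# Printed as a dry run; on a live cluster run: hive -e "$HQL"
echo "$HQL"
```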
Environment: Hadoop, Map Reduce, Hive, Oozie, Sqoop, Flume, Cloudera Manager, Shell Script, Hortonworks.
Confidential, Los Angeles, CA
Hadoop Administrator
Responsibilities:
- Hands-on experience with Apache Hadoop ecosystem components such as Sqoop, HBase, and MapReduce.
- Installation, monitoring, managing, troubleshooting, applying patches in different environments such as Development Cluster, Test Cluster and Production environments.
- Demonstrated an understanding of the concepts, best practices, and functions needed to implement a Big Data solution in a corporate environment.
- Helped design scalable Big Data clusters and solutions.
- Implemented the Fair Scheduler to share cluster resources among users' MapReduce jobs.
- Wrote automation scripts and set up cron jobs to keep the cluster stable and healthy.
- Monitoring and controlling local file system disk space usage, local log files, cleaning log files with automated scripts.
- As Hadoop admin, monitored cluster health on a daily basis, tuned performance-related configuration parameters, and backed up configuration XML files.
- Worked with Hadoop developers and designers to troubleshoot MapReduce job failures and other issues.
- Worked with network and Linux system engineers/admins to define optimal network configurations, server hardware, and operating system settings.
- Evaluate and propose new tools and technologies to meet the needs of the organization.
- Production support responsibilities included cluster maintenance and on-call support on a weekly 24/7 rotation.
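The cron-driven maintenance described above can be sketched as a small cleanup script. The log directory and the 7-day retention window are assumptions for illustration, not actual site values:

```shell
#!/bin/sh
# Sketch of a cron-driven cleanup of rotated local log files.
# LOG_DIR and the retention window are illustrative assumptions.
LOG_DIR="${LOG_DIR:-/var/log/hadoop}"
RETENTION_DAYS=7

# Delete rotated log files older than the retention window.
if [ -d "$LOG_DIR" ]; then
  find "$LOG_DIR" -type f -name '*.log.*' -mtime +"$RETENTION_DAYS" -print -delete
fi

# A crontab entry to run this nightly at 02:00 might look like:
#   0 2 * * * /opt/admin/clean_logs.sh >> /var/log/clean_logs.out 2>&1
```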
Environment: RHEL, CentOS, Ubuntu, CDH3, Cloudera Manager, HDFS, MapReduce, HBase, MySQL, Shell Scripts, Ganglia, Nagios.
Confidential, Dallas, TX
Oracle DBA / Hadoop Administrator
Responsibilities:
- Installed NameNode, Secondary NameNode, YARN (ResourceManager, NodeManager, ApplicationMaster), and DataNode services.
- Installed and configured HDP 2.2.
- Responsible for implementation and ongoing administration of Hadoop infrastructure.
- Monitored an existing, already configured cluster of 40 nodes.
- Installed and Configured Optumera platform cluster of 70 nodes.
- Installed and configured Hadoop components Hdfs, Hive, HBase.
- Communicating with the development teams and attending daily meetings.
- Addressing and Troubleshooting issues on a daily basis.
- Deployed the R statistical tool for statistical computing and graphics.
- Worked with data delivery teams to set up new Hadoop users, including setting up Linux users, setting up Kerberos principals, and testing HDFS and Hive access.
- Cluster maintenance as well as creation and removal of nodes.
- Monitor Hadoop cluster connectivity and security.
- Manage and review Hadoop log files.
- File system management and monitoring.
- HDFS support and maintenance.
- Diligently teaming with the infrastructure, network, database, application and business intelligence teams to guarantee high data quality and availability.
- Taking Backups on scheduled time and Monitoring Scheduled Backups.
- Creating and dropping of users, granting and revoking rights as and when required.
- Checking Oracle files for abnormal behavior.
- Day to day Performance Monitoring and tuning of database.
- Diagnose and Troubleshoot Database Problems and failures.
- Involved in the design (logical and physical) of databases.
- Preparing Backup and Recovery Plan.
- Created Oracle databases (single instance) on configured servers using both DBCA and database creation scripts.
- Monitor database systems and provide technical support for database system issues and application performance enhancement/improvement using both Oracle 11g Grid Control & SQL*Plus, as well as UNIX shell scripts.
- Worked closely with application developers to design and create new Oracle databases, schemas, and database objects, and promoted database changes from DEV through TST to PROD per development team requests.
- Backup/Recovery of Oracle 9i/10g databases using Oracle RMAN.
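An RMAN full-backup run of the kind referenced above might be scripted roughly as follows. The channel layout and obsolete-backup cleanup are a generic sketch, not the site's actual policy, and the script is printed rather than executed:

```shell
#!/bin/sh
# Sketch of an RMAN full backup plus archivelog handling.
# The run block is generic; channels and retention handling are assumptions.
RMAN_SCRIPT='
run {
  allocate channel c1 device type disk;
  backup database plus archivelog;
  delete noprompt obsolete;
  release channel c1;
}'

# Printed as a dry run; on a database host, feed it to RMAN, e.g.:
#   rman target / @backup_full.rman
echo "$RMAN_SCRIPT"
```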
Environment: Hadoop 2.X, Hdfs, Hive, Hbase, Oracle 11g, RMAN, Datapump, SQL TRACE, TKPROF, EXPLAIN PLAN
Confidential, Atlanta, GA
Oracle Database Administrator
Responsibilities:
- Performed Installation and configuration of Oracle database on HP-UX platform.
- Successfully performed migrations from Oracle 9i/10g to 10gR2 RAC (Real Application Clusters) databases.
- Implemented Logical and Physical Standby Databases for RAC cluster on Sun Solaris platform using Data Guard feature of Oracle 10g R2.
- Scheduled physical backups (hot and cold) via crontab using the RMAN utility and monitored the scheduled jobs.
- Responsible for creating users, groups, roles, and profiles, assigning users to groups, and granting the necessary privileges to the relevant groups.
- Constantly monitored database performance (V$ dynamic performance views at peak load) and reviewed alert log files and trace files.
- Ran scripts to check database status, such as growing table sizes, extent allocation, free space, used space, and fragmentation.
- Performed space management, capacity planning, disaster recovery and overall maintenance of the databases.
- Used SQL TRACE, TKPROF, EXPLAIN PLAN utilities for optimizing and tuning SQL queries.
- Maintained the data integrity and security using integrity constraints and database triggers.
- Provided 24X7 support for all the production and development databases.
- Successfully performed data replication using Materialized views and Oracle Streams in Oracle 11gR2.
- Cloned/Migrated databases using RMAN and traditional Datapump export/import utilities in Oracle 11gR2.
- Implemented recovery strategies whenever required and successfully recovered databases from crashes and media/disk failures using RMAN.
Environment: Oracle11g, HP-UX 11, Sun Solaris 11.0, RAC, Oracle Streams, RMAN, Datapump, SQL TRACE, TKPROF, EXPLAIN PLAN
Confidential, CA
Oracle DB Administrator
RESPONSIBILITIES:
- Installation, configuration and upgrading of Oracle software and related products.
- Worked as a team member providing 24/7 support.
- Managed more than 200 instances (Development, QA, and Production), all on 11g.
- The largest database is 30+ TB.
- Managed multiple Data Guard / RAC setups.
- Monitoring the databases, estimating the space requirement, and allocating the resources for the databases and capacity planning.
- Establish and maintain sound backup and recovery policies and procedures.
- Upgrade / Migration of databases.
- Installing and configuring Grid agent for multiple nodes.
- Implement and maintain database security (create and maintain users and roles, assign privileges).
- Perform database tuning and performance monitoring.
- Developing shell scripts for monitoring database servers.
- Dataguard Switch over/ Failover testing at Database level.
- Performed technical troubleshooting and provided support to the development team.
- Engaged Oracle Corporation technical support when required.
- Setup and maintain documentation and standards for database administration.
- Preparation of DR policy & implementation of Oracle Data Guard.
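The monitoring shell scripts mentioned above can be sketched along these lines; this example checks mount-point usage against a threshold. The 90% threshold and the mount point are illustrative assumptions:

```shell
#!/bin/sh
# Sketch of a server-monitoring check: warn when a mount point
# crosses a disk-usage threshold. Threshold and mount are assumptions.
THRESHOLD=90
MOUNT="${MOUNT:-/}"

# Portable df output: on the second line, the fifth column is "NN%" capacity.
usage=$(df -P "$MOUNT" | awk 'NR==2 { gsub(/%/, ""); print $5 }')

if [ "$usage" -ge "$THRESHOLD" ]; then
  echo "WARN: $MOUNT at ${usage}% (threshold ${THRESHOLD}%)"
else
  echo "OK: $MOUNT at ${usage}%"
fi
```

In practice such a check would mail or page the on-call DBA on WARN rather than just printing.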
ENVIRONMENT: Oracle 11g / 10g / 9i / 8i / 7.x, Grid Control, TOAD, Unix HP, Linux, Unix Sun Solaris 8 & 9 (Microsoft Windows 2000, XP).
Confidential
Jr. Oracle DB Administrator
Responsibilities:
- Responsible for maintenance of production databases.
- Created tables, views, materialized views
- Database administration tasks included performing regular hot and cold backups and job scheduling.
- Managing Database Structure, Storage Allocation, Table/Index segments
- Cloned databases from PRODUCTION to create TEST, DEVELOPMENT and UAT
- Configured and managed OEM to keep the Oracle database environment healthy and stable.
- Monitored growth of tables and performed necessary database reorganization as and when required.
- Troubleshot performance issues using Explain Plan and TKPROF.
- Wrote numerous Unix Korn shell scripts supporting DBA activities.
- Maintenance & modification of Shell Scripts.
- Implemented Oracle advanced features such as partitioned tables, partitioned indexes, and function-based indexes.
- Develop, document and enforce database standards and procedures.
- Generated scripts for daily maintenance activity like analyzing the tables/indexes, and monitoring their growth.
- Troubleshooting and fixing database listener issues.
Environment: Oracle 10g, RedHat Linux, Sun Solaris, Windows, Shell Scripts, RMAN, OEM/GridControl, SQL*Loader.