
Senior Hadoop Admin Resume


Atlanta, GA

SUMMARY:

  • 7.5 years of professional experience, including 4 years of Hadoop administration and 3+ years as a Linux administrator.
  • Experience in architecting, designing, installing, configuring, and managing Apache Hadoop clusters and the Cloudera Hadoop distribution.
  • Strong technical, administration, and mentoring knowledge in Linux and Big Data/Hadoop.
  • Experience in designing, installing, and configuring the complete Hadoop ecosystem (Pig, Hive, Oozie, HBase, Flume, ZooKeeper).
  • Expertise in managing the Hadoop infrastructure with Cloudera Manager.
  • Provided security for Hadoop clusters using Kerberos, Active Directory/LDAP, and TLS/SSL, with dynamic tuning to keep the clusters available and efficient.
  • Involved in customer interactions, business user meetings, vendor calls, and technical team discussions to make the right design and implementation choices and to provide best practices for the organization.
  • Experience in managing cluster resources by implementing the Fair Scheduler and Capacity Scheduler (a minimal allocation-file sketch follows this list).
  • Experience in developing and scheduling ETL workflows in Hadoop using Oozie.
  • Experience with tools like Puppet to automate Hadoop installation, configuration, and monitoring.
  • Installed, patched, upgraded, tuned, configured, and troubleshot Linux-based operating systems (Red Hat and CentOS) and virtualization across a large fleet of servers.
  • Experience in installing and configuring web hosting and administration services: HTTP, FTP, NFS, and SSH.
  • Worked on firewall implementation and load balancing between various Windows servers.
  • Efficient in installing, configuring, and implementing LVM and RAID technologies using tools such as Veritas Volume Manager.
  • Experience installing VMware ESX Server, creating VMs, and installing different guest operating systems.
  • Experience with system integration, capacity planning, performance tuning, system monitoring, system security, operating system hardening and load balancing.
  • Extensively worked on configuring and administering YUM, RPMs, NFS, DNS, DHCP, and mail servers.
  • Experience managing network-related services such as TCP/IP, NFS, DNS, DHCP, and SMTP.
  • Proficient in OS upgrades and patch loading as required.
  • Experience in integration of various data sources like Oracle, DB2, SQL server.
  • Good knowledge of reading data from and writing data to Cassandra.
  • Good understanding of NoSQL databases such as HBase and MongoDB.
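
A minimal sketch of the Fair Scheduler setup referenced above, assuming YARN's Fair Scheduler is already enabled; the queue names, weights, and file path are illustrative:

```bash
# Define two Fair Scheduler queues in the allocations file that
# yarn.scheduler.fair.allocation.file points to (path is illustrative).
cat > /etc/hadoop/conf/fair-scheduler.xml <<'EOF'
<?xml version="1.0"?>
<allocations>
  <queue name="etl">
    <weight>2.0</weight>
    <minResources>10000 mb,10vcores</minResources>
  </queue>
  <queue name="adhoc">
    <weight>1.0</weight>
  </queue>
</allocations>
EOF
# The Fair Scheduler re-reads this file periodically, so no restart is needed.
```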

TECHNICAL SKILLS:

Big Data: Apache Hadoop, Cloudera, HBase, Hive, MapReduce, ZooKeeper, Oozie, Sqoop, Flume, Pig

Languages: C, Java, Shell scripting, JavaScript

Operating Systems: RedHat Linux, CentOS, Windows

Databases: IBM DB2, Oracle, SQL Server 2000/2005/2008, MySQL

Networking and Protocols: TCP/IP, HTTP, FTP, SNMP, LDAP, DNS

Application Servers: Apache HTTP Server, WebSphere, WebLogic

Tools: Puppet, Nagios, Ganglia

PROFESSIONAL EXPERIENCE:

Senior Hadoop Admin

Confidential, Atlanta, GA

Responsibilities:

  • Planned, installed, configured, maintained, and monitored Hadoop clusters using the Apache and Cloudera (CDH3, CDH4, CDH5) distributions.
  • Hands-on experience with major components of the Hadoop ecosystem, including HDFS, YARN, Hive, Impala, Flume, ZooKeeper, Oozie, and other ecosystem products.
  • Experience in Cloudera Hadoop upgrades and patches and in installing ecosystem products through Cloudera Manager, along with Cloudera Manager upgrades.
  • Capacity planning, hardware recommendations, performance tuning and benchmarking.
  • Cluster balancing and performance tuning of Hadoop components such as HDFS, Hive, Impala, MapReduce, and Oozie workflows.
  • Took backups of metadata and databases before upgrading the BDA cluster and deploying patches.
  • Implemented NameNode metadata backup over NFS for high availability.
  • Added and decommissioned Hadoop cluster nodes, including balancing HDFS block data.
  • Implemented the Fair Scheduler on the JobTracker to share cluster resources among users' MapReduce jobs.
  • Configured Kerberos and AD/LDAP for the Hadoop cluster.
  • Worked with the systems engineering team to propose and deploy new hardware and software environments required for Hadoop and to expand existing environments.
  • Implemented Kerberos security across the cluster
  • Worked with data delivery teams to set up new Hadoop users, including setting up Linux users, creating Kerberos principals, and testing HDFS, Hive, Pig, and MapReduce/YARN access for the new users (see the onboarding sketch after this list).
  • Experience setting up data ingestion tools such as Flume, Sqoop, and SFTP.
  • Installed and set up HBase and Impala.
  • Set up quotas on HDFS, implemented rack topology scripts, and configured Sqoop to import and export data between HDFS and RDBMSs (see the quota/Sqoop sketch after this list).
  • Handled data exchange between HDFS, web applications, and databases using Flume and Sqoop.
  • Used Hive, created Hive tables, and was involved in data loading.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Set up automated processes to analyze the System and Hadoop log files for predefined errors and send alerts to appropriate groups.
  • Set up automated processes to archive/clean unwanted data on the cluster, in particular on the NameNode and Secondary NameNode.
  • Supported technical team members in management and review of Hadoop log files and data backups.
  • Participated in development and execution of system and disaster recovery processes.
  • Monitored the Hadoop cluster through Cloudera Manager, implemented alerts based on error messages, provided cluster usage metric reports to management, and charged back customers based on their usage.
  • Performance tuning, client/server connectivity, and database consistency checks using various utilities.
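
A minimal sketch of the new-user onboarding flow mentioned above, assuming MIT Kerberos; the user, group, and EXAMPLE.COM realm are placeholders, and the exact steps would follow site policy:

```bash
#!/bin/bash
# Onboard a new Hadoop user: Linux account, Kerberos principal, HDFS home dir.
# User, group, and realm below are placeholders.
USER=jdoe
GROUP=analysts

groupadd -f "$GROUP"                       # 1. Linux group/user on the gateway node
useradd -m -g "$GROUP" "$USER"

kadmin -p admin/admin -q "addprinc ${USER}@EXAMPLE.COM"      # 2. Kerberos principal

sudo -u hdfs hdfs dfs -mkdir -p /user/"$USER"                # 3. HDFS home directory
sudo -u hdfs hdfs dfs -chown "$USER":"$GROUP" /user/"$USER"

# 4. Smoke-test HDFS access as the new user
su - "$USER" -c "kinit ${USER}@EXAMPLE.COM && hdfs dfs -ls /user/${USER}"
```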
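
And a sketch of the HDFS quota and Sqoop import/export commands referred to above; the directory names, limits, and JDBC connection details are illustrative:

```bash
# Name quota (max files/dirs) and space quota on a project directory
sudo -u hdfs hdfs dfsadmin -setQuota 1000000 /user/projectA
sudo -u hdfs hdfs dfsadmin -setSpaceQuota 10t /user/projectA
hdfs dfs -count -q /user/projectA                  # verify the quotas

# Import an RDBMS table into HDFS, then export aggregated results back
sqoop import \
  --connect jdbc:mysql://dbhost/sales --username etl \
  --password-file /user/etl/.dbpass \
  --table orders --target-dir /data/raw/orders -m 4

sqoop export \
  --connect jdbc:mysql://dbhost/sales --username etl \
  --password-file /user/etl/.dbpass \
  --table orders_agg --export-dir /data/out/orders_agg -m 4
```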

Environment: Hadoop, MapReduce, Hive, HDFS, Pig, Sqoop, Oozie, Cloudera, Flume, HBase, ZooKeeper, LDAP, NoSQL, DB2, and Unix/Linux.

Senior Hadoop Admin

Confidential, Richardson, TX

Responsibilities:

  • Performed various configurations, including networking and iptables, hostname resolution, user accounts and file permissions, HTTP, FTP, and passwordless SSH login.
  • Implemented authentication service using Kerberos authentication protocol.
  • Created volume groups, logical volumes, and partitions on the Linux servers and mounted file systems on the created partitions (see the LVM sketch after this list).
  • Regular disk management, such as adding/replacing hot-swappable drives on existing servers/workstations, partitioning according to requirements, creating new file systems or growing existing ones, and managing file systems.
  • Configured master node disks with RAID 1+0.
  • Provisioned an NFS filer volume mountable on both master nodes, as required for the high-availability setup.
  • Performed benchmarking on the Hadoop cluster using different benchmarking mechanisms.
  • Tuned the cluster by commissioning and decommissioning DataNodes.
  • Upgraded the Hadoop cluster from CDH3 to CDH4.
  • Deployed a CDH3 Hadoop cluster integrated with Nagios and Ganglia.
  • Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
  • Deployed NameNode high availability on the Hadoop cluster using quorum journal nodes.
  • Implemented automatic failover using ZooKeeper and the ZooKeeper Failover Controller.
  • Configured Ganglia, including installing the gmond and gmetad daemons, which collect metrics from the distributed cluster and present them in real-time dynamic web pages to aid debugging and maintenance.
  • Implemented Kerberos for authenticating all the services in Hadoop Cluster.
  • Deployed an NFS mount for NameNode metadata backup.
  • Performed cluster backups using DistCp, Cloudera Manager BDR, and parallel ingestion.
  • Designed and allocated HDFS quotas for multiple groups.
  • Configured and deployed the Hive metastore using MySQL.
  • Used the Hive schema to create relations in Pig using HCatalog.
  • Developed automated Unix shell scripts for running the Balancer, file system health checks, schema creation in Hive, and user/group creation on HDFS (see the housekeeping sketch after this list).
  • Worked on NoSQL databases including HBase, MongoDB, and Cassandra.
  • Deployed Sqoop server to perform imports from heterogeneous data sources to HDFS.
  • Deployed and configured flume agents to stream log events into HDFS for analysis.
  • Deployed YARN, which enables multiple applications to run on the cluster.
  • Configured Oozie for workflow automation and coordination.
  • Wrote custom monitoring scripts for Nagios to monitor the daemons and the cluster status.
  • Wrote custom shell scripts to automate repetitive tasks on the cluster.
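
A minimal sketch of the volume group/logical volume work described above; the device, volume names, size, and mount point are illustrative:

```bash
# Carve a new data filesystem out of a fresh disk and mount it persistently
pvcreate /dev/sdb
vgcreate datavg /dev/sdb
lvcreate -n datalv -L 500G datavg
mkfs.ext4 /dev/datavg/datalv
mkdir -p /data01
mount /dev/datavg/datalv /data01
echo '/dev/datavg/datalv /data01 ext4 defaults,noatime 0 0' >> /etc/fstab
```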
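
And a sketch of the kind of automated housekeeping script mentioned above (Balancer run plus file system health check); the paths, 10% threshold, and alert address are illustrative:

```bash
#!/bin/bash
# Nightly HDFS housekeeping: rebalance blocks and record filesystem health.
LOG=/var/log/hadoop/housekeeping-$(date +%F).log

# Move blocks until no DataNode deviates more than 10% from average utilization
sudo -u hdfs hdfs balancer -threshold 10 >> "$LOG" 2>&1

# Filesystem health check; alert the ops list if HDFS is not reported healthy
sudo -u hdfs hdfs fsck / >> "$LOG" 2>&1
grep -q "is HEALTHY" "$LOG" \
  || mail -s "HDFS fsck reported problems" hadoop-ops@example.com < "$LOG"
```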

Environment: Hadoop, HDFS, MapReduce, Hive, Oozie, Cloudera, MySQL, SQL and Ganglia.

Hadoop Admin

Confidential, Dallas, TX

Responsibilities:

  • Monitored workload, job performance and capacity planning using Cloudera Manager
  • Analyzed system failures, identified root causes, and recommended courses of action.
  • Imported logs from web servers with Flume to ingest the data into HDFS (see the agent sketch after this list).
  • Exported data from HDFS into relational databases with Sqoop; parsed, cleansed, and mined meaningful data in HDFS using MapReduce for further analysis.
  • Fine-tuned Hive jobs for optimized performance.
  • Installed, configured, and deployed Hadoop clusters for development, testing, and production.
  • As an admin, followed standard backup policies to ensure high availability of the cluster.
  • Decommissioned and commissioned nodes on a running Hadoop cluster.
  • Experience in benchmarking and in backup and disaster recovery of NameNode metadata and sensitive data residing on the cluster, with failover control using ZooKeeper and quorum journal nodes.
  • Partitioned and queried the data in Hive for further analysis by the BI team.
  • Involved in extracting the data from various sources into Hadoop HDFS for processing.
  • Effectively used Sqoop to transfer data between databases and HDFS.
  • Worked on streaming data into HDFS from web servers using Flume.
  • Implemented custom Flume interceptors to filter data and defined channel selectors to multiplex the data into different sinks.
  • Developed MapReduce programs to cleanse data in HDFS obtained from heterogeneous data sources and make it suitable for ingestion into the Hive schema for analysis.
  • Used the Hive data warehouse tool to analyze the unified historic data in HDFS to identify issues and behavioral patterns.
  • Created Hive tables as per requirements, as internal or external tables defined with appropriate static and dynamic partitions for efficiency (see the table sketch after this list).
  • Wrote shell scripts to automate rolling day-to-day processes.
  • Automated workflows using shell scripts to pull data from various databases into Hadoop.
  • Deployed Hadoop Cluster in Fully Distributed and Pseudo-distributed modes.
  • Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
  • Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs, such as Java MapReduce, Hive, and Sqoop, as well as system-specific jobs.
  • Experience in managing and reviewing Hadoop log files.
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • Provisioned, built, and supported Linux servers, both physical and virtual (VMware), for production, QA, and development environments.
  • Installing Patches and packages on Unix/Linux Servers.
  • Shell scripting for Linux/Unix Systems Administration and related tasks.
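
A minimal sketch of a Flume agent like the ones described above, tailing a web server log into HDFS; the agent name, log path, and HDFS path are illustrative:

```bash
# Flume agent: exec source tailing an access log -> memory channel -> HDFS sink
cat > /etc/flume-ng/conf/weblogs.conf <<'EOF'
a1.sources  = tail1
a1.channels = mem1
a1.sinks    = hdfs1

a1.sources.tail1.type = exec
a1.sources.tail1.command = tail -F /var/log/httpd/access_log
a1.sources.tail1.channels = mem1

a1.channels.mem1.type = memory
a1.channels.mem1.capacity = 10000

a1.sinks.hdfs1.type = hdfs
a1.sinks.hdfs1.hdfs.path = /data/raw/weblogs/%Y-%m-%d
a1.sinks.hdfs1.hdfs.fileType = DataStream
a1.sinks.hdfs1.hdfs.useLocalTimeStamp = true
a1.sinks.hdfs1.channel = mem1
EOF

flume-ng agent --conf /etc/flume-ng/conf \
  --conf-file /etc/flume-ng/conf/weblogs.conf --name a1
```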
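
And a sketch of the partitioned external Hive table pattern referred to above; the table, columns, and paths are illustrative:

```bash
# External, date-partitioned Hive table over cleansed data, plus one partition
hive -e "
CREATE EXTERNAL TABLE IF NOT EXISTS weblogs (
  host STRING, request STRING, status INT, bytes BIGINT)
PARTITIONED BY (dt STRING)
STORED AS TEXTFILE
LOCATION '/data/clean/weblogs';

ALTER TABLE weblogs ADD IF NOT EXISTS PARTITION (dt='2015-06-01')
LOCATION '/data/clean/weblogs/dt=2015-06-01';
"
```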

Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Flume, Oozie, Sqoop, Cloudera Manager, VMware, Shell Scripting

Hadoop Administrator

Confidential, Woodsland, TX

Responsibilities:

  • Experience in setup, configuration and management of security for Cloudera Hadoop clusters.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Responsible for day-to-day activities, including HDFS support and maintenance, cluster maintenance, creation/removal of nodes, and cluster monitoring/troubleshooting.
  • Involved in managing and reviewing Hadoop log files, backup and restore, and capacity planning.
  • Worked with Hadoop developers and operating system admins in designing scalable, supportable infrastructure for Hadoop.
  • Installed RHEL 4.0 using Kickstart and custom-built the servers.
  • Responsible for cluster availability and provided 24x7 on-call support.
  • Managing and reviewing data backups and Hadoop log files.
  • Responsible for deciding the hardware configurations for the cluster along with other teams.
  • Implemented cluster high availability for crashes and planned maintenance.
  • Responsible for scheduling jobs in Hadoop using the Fair Scheduler.
  • Involved in configuring Oozie workflow engine to run multiple Hive jobs.
  • Configured Sqoop and developed scripts to extract data from DB2 into HDFS.
  • Worked extensively with Sqoop for importing metadata from DB2.
  • Involved in administration, configuration management, monitoring, debugging and performance tuning of Hadoop environments.
  • Continuous monitoring and managing the Hadoop cluster through Cloudera Manager.
  • Configured the Alerts using Cloudera Manager for the Hadoop cluster.
  • Installed and configured Flume, Hive, Sqoop, Zookeeper and Oozie on the Hadoop cluster.
  • Monitored all MapReduce read jobs running on the cluster using Cloudera Manager and ensured that they were able to read data from HDFS without issues.
  • Worked on deploying applications on a J2EE application server (WebLogic).
  • Performed tuning of JVM heap size, garbage collection, Java stack, and native threads for production performance.
  • Wrote shell scripts for day-to-day admin tasks; resolved issues related to applications, hardware, and the UNIX environment.
  • Setting up Identity, Authentication, and Authorization.
  • Handle the upgrades and Patch updates.
  • Configuring TLS/SSL for the Hadoop components/BDA cluster to provide data security in transit.
  • Configured and troubleshot SSH, NFS, NIS, AutoNFS, DNS, and Yum repositories.
  • Red Hat Linux startup process: init, /etc/inittab, /etc/init.d service scripts, runlevels, and the chkconfig and ntsysv utilities.
  • Maintained extensive documentation on the Hadoop cluster, policies, and configurations.
  • Implemented NameNode metadata backup using NFS for high availability (see the sketch after this list).
  • Diligently teaming with the infrastructure, network, database, application and business intelligence teams to guarantee high data quality and availability.
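
A minimal sketch of the NFS-backed NameNode metadata backup mentioned above; the filer host, export, mount point, and local directory are illustrative, and the property name assumes an HDFS 2.x (CDH4+) configuration:

```bash
# Keep a second copy of the NameNode fsimage/edits on an NFS mount
mkdir -p /mnt/nn-backup
mount -t nfs -o tcp,hard filer01:/export/namenode /mnt/nn-backup

# hdfs-site.xml then lists both directories so the NameNode writes to each:
#   <property>
#     <name>dfs.namenode.name.dir</name>
#     <value>file:///data/dfs/nn,file:///mnt/nn-backup/nn</value>
#   </property>
# Restart the NameNode (e.g. via Cloudera Manager) after changing this property.
```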

Environment: Cloudera CDH, MapReduce, Cloudera Manager, Hive, Sqoop, ZooKeeper, Oozie, Flume, CentOS, Puppet, DB2, RHEL, JVM, Shell Scripts, NFS

Linux Admin

Confidential

Responsibilities:

  • Designed, built, installed, and configured Red Hat Enterprise Linux servers (RHEL 5, RHEL 6) on HP ProLiant DL380 bare-metal servers.
  • Installed and configured RHEL and CentOS on physical servers and on virtual machines (VMware and Proxmox).
  • Configured FortiGate firewalls and monitored firewall logs (for security, network load balancing, and application and user monitoring).
  • Created and monitored users and groups, and maintained logs of system status/health using Linux commands, Nagios, top, and the GNOME System Monitor.
  • Installed and configured Red Hat Linux Kickstart and booting from SAN/NAS.
  • Performed SAN Migration on RHEL servers.
  • Experience in using protocols like NFS, SSH, SFTP & DNS.
  • Discussing with the vendors for new hardware requirements.
  • Performed capacity assessments for new server requests, i.e., calculating CPU and memory for new servers according to the current and future applications running on the system.
  • System performance monitoring and tuning.
  • Performed package installations, maintenance, periodic updates and patch management.
  • Creating the Linux file system.
  • Installed and administered TCP/IP, NFS, DNS, NTP, automounts, Sendmail, and print servers per client requirements.
  • Experience in troubleshooting Samba-related issues.
  • Performed disk administration using Linux Volume Manager (LVM) and Veritas Volume Manager 4.x/5.x.
  • Performance monitoring on Linux servers using iostat, netstat, vmstat, sar, top & prstat.
  • Installed VMware ESX 4.1 to virtualize RHEL servers.
  • Installed and configured DHCP, DNS, and NFS.
  • Configured iptables on Linux servers (see the firewall sketch after this list).
  • Performed package administration on Linux using rpm, yum, and Satellite Server (a patching sketch also follows this list).
  • Automation of various administrative tasks on multiple servers using Puppet.
  • Deployed Puppet, Puppet Dashboard, and Puppet DB for configuration management to existing infrastructure.
  • Proficient in installation, configuration and maintenance of applications like Apache, LDAP, PHP
  • Resolved configuration issues and problems related to the OS, NFS mounts, LDAP user IDs, and DNS.
  • Worked on VMware, VMware View, vSphere 4.0. Dealt with ESX, ESXi servers.
  • Enhanced and simplified vCenter server 4.0.
  • Installed, configured, and troubleshot web servers such as IBM HTTP Server, Apache Web Server, WebSphere Application Server, and Samba on Linux (Red Hat and CentOS).
  • Managed routine system backups; scheduled jobs such as disabling and enabling cron jobs; and enabled system and network logging on servers for maintenance, performance tuning, and testing.
  • Create and update technical documentation for team members.
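
A minimal sketch of the iptables host firewall policy described above; the allowed port and default-drop policy are illustrative and would be extended per server role:

```bash
# Basic host firewall: allow loopback, established traffic, and SSH; drop the rest
iptables -F
iptables -A INPUT -i lo -j ACCEPT
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -p tcp --dport 22 -j ACCEPT
iptables -P INPUT DROP
service iptables save      # persist the rules on RHEL/CentOS 6
```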
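
And a sketch of the routine yum/rpm patching flow mentioned above; the security-only update assumes the yum security plugin is installed:

```bash
# Routine patching: review available updates, apply security errata, verify
yum check-update
yum -y update --security          # requires the yum security plugin
rpm -qa --last | head             # confirm the most recently installed packages
```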

Environment: Red Hat Enterprise Linux servers (HP ProLiant DL 585, BL ... ML series), SAN (NetApp), VMware Virtual Client 3.5, VMware Infrastructure 3.5, Bash, CentOS, LVM, Windows 2003 Server, NetBackup, Veritas Volume Manager, Samba, NFS.

Linux Admin

Confidential

Responsibilities:

  • Administered RHEL, including installation, testing, tuning, upgrading, and patching, and troubleshot both physical and virtual server issues.
  • Creating and cloning Linux virtual machines.
  • Installed Red Hat Linux using Kickstart and applied security policies to harden servers based on company policy.
  • RPM and YUM package installations, patch and other server management.
  • Managed routine system backups; scheduled jobs such as disabling and enabling cron jobs; and enabled system and network logging on servers for maintenance, performance tuning, and testing.
  • Performed tech and non-tech refreshes of Linux servers, including new hardware, OS upgrades, application installation, and testing.
  • Set up user and group login IDs, printing parameters, network configuration, and passwords; resolved permission issues; and managed user and group quotas (see the sketch after this list).
  • Creating physical volumes, volume groups, logical volumes.
  • Gathering requirements from customers and business partners and design, implement and provide solutions in building the environment.
  • Installing and configuring Apache and supporting them on Linux production servers.
  • Troubleshot Linux network and security-related issues and inspected traffic using tools such as iptables, firewalls, TCP wrappers, and nmap.
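
A minimal sketch of the user, password-policy, and quota setup described above; the names, limits, and filesystem are illustrative and assume disk quotas are enabled on /home:

```bash
# Create a group and user, enforce password aging, and apply a disk quota
groupadd developers
useradd -m -g developers -s /bin/bash asmith
passwd asmith
chage -M 90 asmith                             # require a password change every 90 days
setquota -u asmith 5000000 6000000 0 0 /home   # soft/hard block limits, no inode limits
```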

Environment: Red Hat Enterprise Linux servers (HP ProLiant DL 585, BL ... ML series), SAN (NetApp), Veritas Cluster Server 5.0, Windows 2003 Server, Shell programming, JBoss 4.2, JDK 1.5/1.6, VMware Virtual Client 3.5, VMware Infrastructure 3.5.
