
Hadoop Admin Resume


San Diego, CA

PROFESSIONAL SUMMARY:

  • 7 years of software development experience, including 2.5+ years administering the Hadoop ecosystem and 3+ years of Linux administration.
  • Hands-on experience in installing, configuring, supporting and managing Hadoop clusters using the Apache, Cloudera, Hortonworks and MapR distributions.
  • Strong knowledge of Hadoop HDFS architecture and the MapReduce framework.
  • Involved in capacity planning for the Hadoop cluster in production.
  • Experience in administering Linux systems to deploy Hadoop clusters and monitoring the clusters using Nagios and Ganglia.
  • Experience in performing backup and disaster recovery of NameNode metadata and sensitive data residing on the cluster.
  • Architected and implemented automated server provisioning using Puppet.
  • Experience in performing minor and major upgrades.
  • Experience in performing commissioning and decommissioning of data nodes on Hadoop cluster.
  • Strong knowledge in configuring Name Node High Availability and Name Node Federation.
  • Familiar with writing Oozie workflows and job controllers for job automation - shell, Hive and Sqoop actions.
  • Familiar with importing and exporting data using Sqoop between HDFS and RDBMS sources such as MySQL, Oracle and Teradata, including fast loaders and connectors (a Sqoop sketch follows this list).
  • Experience in understanding Hadoop security requirements and integrating with Kerberos authentication infrastructure: KDC server setup, creating realms/domains, managing principals, generating a keytab file for each service and managing keytabs using keytab tools (a keytab sketch also follows this list).
  • Worked with system engineering team to plan and deploy Hadoop hardware and software environments.
  • Worked on disaster management with Hadoop cluster.
  • Built a data transformation framework using MapReduce and Pig.
  • Experience in monitoring and managing an 80-node Hadoop cluster.
  • Experience in deploying Hadoop clusters on public and private cloud environments such as Amazon AWS and OpenStack.
  • Experience in deploying and managing multi-node development, testing and production Hadoop clusters with different Hadoop components (Hive, Pig, Sqoop, Oozie, Flume, HCatalog, HBase, ZooKeeper) using Cloudera Manager and Hortonworks Ambari.
  • Supported MapReduce programs running on the cluster.
  • Managed and reviewed Hadoop log files.
  • Performed stress and performance testing and benchmarking of the cluster.
  • Built an ingestion framework using Flume for streaming logs and aggregating the data into HDFS.
  • Hands-on experience productionizing Hadoop applications, including administration, configuration management, debugging and performance tuning.
  • Worked with application teams via Scrum to provide operational support and install Hadoop updates, patches and version upgrades as required.
  • Prototyped a proof of concept with Hadoop 2.0 (YARN).
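
A minimal sketch of the kind of Sqoop import/export referred to above; the JDBC URL, credentials, table names and HDFS paths are hypothetical placeholders.

    # Import a MySQL table into HDFS (hypothetical host, table and target directory)
    sqoop import \
      --connect jdbc:mysql://db.example.com/sales \
      --username etl_user -P \
      --table orders \
      --target-dir /data/raw/orders \
      --num-mappers 4

    # Export aggregated results from HDFS back to MySQL
    sqoop export \
      --connect jdbc:mysql://db.example.com/sales \
      --username etl_user -P \
      --table order_summary \
      --export-dir /data/summary/orders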
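
A sketch of the keytab generation described above, using MIT Kerberos kadmin; the realm, hostname and keytab path are hypothetical.

    # Create a service principal with a random key and export it to a keytab
    kadmin.local -q "addprinc -randkey hdfs/namenode01.example.com@EXAMPLE.COM"
    kadmin.local -q "xst -k /etc/security/keytabs/hdfs.keytab hdfs/namenode01.example.com@EXAMPLE.COM"

    # Verify the keytab contents
    klist -kt /etc/security/keytabs/hdfs.keytab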

TECHNICAL SKILLS:

Hadoop Ecosystem Components: HDFS, MapReduce, Pig, Hive, Oozie, Sqoop, Flume & Zookeeper.

Languages and Technologies: Core Java, C, C++, data structures and algorithms.

Operating Systems: Windows, Linux & Unix.

Scripting Languages: Shell scripting, Puppet.

Networking: TCP/IP Protocol, Switches & Routers, OSI Architecture, HTTP, NTP & NFS.

Databases: SQL, NoSQL (Cassandra).

PROFESSIONAL EXPERIENCE:

Hadoop Admin

Environment: MapReduce, HDFS, Hive, Pig, Flume, Sqoop, UNIX Shell Scripting, Nagios, Kerberos

Confidential, San Diego, CA

Responsibilities:

  • Worked with the Linux administration team to prepare and configure the systems to support Hadoop deployment.
  • Performed various configurations including networking and iptables, hostname resolution, user accounts and file permissions, HTTP, FTP and passwordless SSH login.
  • Implemented authentication and authorization service using Kerberos authentication protocol.
  • Performed benchmarking on the Hadoop cluster using different benchmarking mechanisms.
  • Tuned the cluster by commissioning and decommissioning DataNodes.
  • Implemented the Fair Scheduler on the JobTracker to allocate a fair amount of resources to small jobs.
  • Upgraded the Hadoop cluster from CDH3 to CDH4.
  • Deployed NameNode high availability on the Hadoop cluster using quorum journal nodes (an HA command sketch follows this list).
  • Implemented automatic failover with ZooKeeper and the ZooKeeper Failover Controller (ZKFC).
  • Configured Ganglia, including installing the gmond and gmetad daemons, which collect all the metrics from the distributed cluster and present them in real-time dynamic web pages that further help in debugging and maintenance.
  • Implemented Kerberos for authenticating all the services in the Hadoop cluster.
  • Deployed a network file system (NFS) mount for NameNode metadata backup.
  • Performed a POC on cluster backup using DistCp, Cloudera Manager BDR and parallel ingestion.
  • Configured and deployed the Hive metastore using MySQL and the Thrift server.
  • Developed Pig scripts for handling raw data for analysis.
  • Maintained, audited and built new clusters for testing purposes using Cloudera Manager.
  • Deployed and configured Flume agents to stream log events into HDFS for analysis (a Flume agent sketch follows this list).
  • Configured Oozie for workflow automation and coordination.
  • Wrote custom monitoring scripts for Nagios to monitor the daemons and the cluster status.
  • Wrote custom shell scripts for automating repetitive tasks on the cluster.
  • Worked with BI teams on generating reports and designing ETL workflows in Pentaho; involved in loading data from the UNIX file system to HDFS.
  • Defined time-based Oozie workflows to copy data, upon availability, from different sources into Hive.
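
A sketch of the NameNode HA and automatic-failover commands involved in the quorum-journal/ZKFC setup above; the NameNode IDs nn1 and nn2 are hypothetical and assume the HA properties are already defined in hdfs-site.xml.

    # Initialize the ZooKeeper znode used by the failover controllers (run once)
    hdfs zkfc -formatZK

    # Check which NameNode is currently active and which is standby
    hdfs haadmin -getServiceState nn1
    hdfs haadmin -getServiceState nn2

    # Manually fail over from nn1 to nn2 if needed
    hdfs haadmin -failover nn1 nn2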
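
A minimal sketch of a Flume agent of the kind described above, tailing an application log into HDFS; the agent name, config path, log file and HDFS directory are assumptions.

    # /etc/flume-ng/conf/flume.conf (hypothetical agent "a1")
    a1.sources = r1
    a1.channels = c1
    a1.sinks = k1
    a1.sources.r1.type = exec
    a1.sources.r1.command = tail -F /var/log/app/app.log
    a1.sources.r1.channels = c1
    a1.channels.c1.type = memory
    a1.channels.c1.capacity = 10000
    a1.sinks.k1.type = hdfs
    a1.sinks.k1.channel = c1
    a1.sinks.k1.hdfs.path = /data/logs/app/%Y-%m-%d
    a1.sinks.k1.hdfs.fileType = DataStream
    a1.sinks.k1.hdfs.useLocalTimeStamp = true

    # Start the agent
    flume-ng agent --conf /etc/flume-ng/conf --conf-file /etc/flume-ng/conf/flume.conf --name a1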

Hadoop Admin

Environment: Linux, MapReduce, HDFS, Hive, Pig, Sqoop, Flume, Ganglia, Nagios, Kerberos

Confidential, Pleasanton, CA

Responsibilities:

  • Performed both major and minor upgrades to the existing cluster, as well as rollbacks to the previous version.
  • Implemented commissioning and decommissioning of DataNodes, killed unresponsive TaskTrackers and dealt with blacklisted TaskTrackers (a decommissioning sketch follows this list).
  • Moved data from HDFS to a MySQL database and vice versa using Sqoop.
  • Implemented MapReduce jobs in Hive by querying the available data.
  • Used Ganglia and Nagios to monitor the cluster around the clock.
  • Implemented NFS, NAS and HTTP servers on Linux servers.
  • Created a local YUM repository for installing and updating packages.
  • Copied data from one cluster to another using DistCp and automated the copy procedure using shell scripts (a DistCp sketch follows this list).
  • Designed the shell script for backing up of important metadata.
  • Implemented NameNode HA to avoid a single point of failure.
  • Designed the cluster so that only one Secondary NameNode daemon runs at any given time.
  • Implemented NameNode metadata backup over NFS for high availability.
  • Supported data analysts in running MapReduce programs.
  • Worked on analyzing data with Hive and Pig.
  • Ran crontab jobs to back up data.
  • Configured Ganglia monitoring, installing the gmond and gmetad daemons to collect cluster metrics and present them in real-time dynamic web pages used for debugging and maintenance.
  • Implemented Kerberos for authenticating all services in the Hadoop cluster.
  • Deployed the Sqoop server to perform imports from heterogeneous data sources into HDFS.
  • Designed and allocated HDFS quotas for multiple groups (a quota sketch follows this list).
  • Configured iptables rules to allow application servers to connect to the cluster, set up the NFS exports list and blocked unwanted ports.
  • Configured Flume for efficiently collecting, aggregating and moving large amounts of log data from many different sources into HDFS.
  • Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.
  • Responsible for managing data coming from different sources.
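
A sketch of the DataNode decommissioning flow referenced above; the exclude-file path is an assumption and must match the dfs.hosts.exclude property in hdfs-site.xml.

    # Add the host being retired to the exclude file (path is an assumption)
    echo "datanode07.example.com" >> /etc/hadoop/conf/dfs.exclude

    # Tell the NameNode to re-read the include/exclude lists
    hdfs dfsadmin -refreshNodes

    # Watch the node's status until it reports "Decommissioned", then stop its DataNode daemon
    hdfs dfsadmin -report | grep -A 5 datanode07.example.com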
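
A sketch of the automated inter-cluster DistCp copy mentioned above; the NameNode addresses, paths and mapper count are hypothetical.

    #!/bin/bash
    # Nightly copy of a dated directory from the production cluster to the DR cluster
    DAY=$(date +%Y-%m-%d)
    SRC="hdfs://prod-nn01:8020/data/events/${DAY}"
    DST="hdfs://dr-nn01:8020/data/events/${DAY}"

    # -update copies only missing or changed files; -m caps the number of map tasks
    hadoop distcp -update -m 20 "${SRC}" "${DST}" \
      && echo "distcp ${DAY} OK" \
      || echo "distcp ${DAY} FAILED" >&2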
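
A sketch of the HDFS quota allocation mentioned above; the directory and limits are hypothetical.

    # Cap the number of names (files and directories) and the raw space for a group's workspace
    hdfs dfsadmin -setQuota 1000000 /user/marketing
    hdfs dfsadmin -setSpaceQuota 10t /user/marketing

    # Review current usage against the quotas
    hdfs dfs -count -q /user/marketing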

Linux Administrator

Confidential

Responsibilities:

  • Administered RHEL 4.x and 5.x, including installation, testing, tuning, upgrading and patching, and troubleshooting both physical and virtual server issues.
  • Created and cloned Linux virtual machines and templates using VMware Virtual Client 3.5 and migrated servers between ESX hosts and Xen servers.
  • Installed Red Hat Linux using Kickstart and applied security policies for hardening the servers based on company policies.
  • Installed and verified that all AIX/Linux patches and updates were applied to the servers.
  • Installed and administered Red Hat Linux on Xen and KVM based hypervisors.
  • Performed RPM and YUM package installations, patching and other server management.
  • Managed routine system backups; scheduled jobs such as disabling and enabling cron jobs; enabled system and network logging of servers for maintenance, performance tuning and testing.
  • Performed data-center operations including rack mounting and cabling.
  • Installed, configured and maintained WebLogic 10.x and Oracle 10g on Solaris and Red Hat Linux.
  • Set up user and group login IDs, printing parameters, network configuration and passwords; resolved permission issues; managed user and group quotas.
  • Configured multipathing, added SAN storage and created physical volumes, volume groups and logical volumes (an LVM sketch follows this list).
  • Installed, configured and supported Apache on Linux production servers.
  • Troubleshot Linux network and security related issues, capturing packets using tools such as iptables, firewalls, TCP wrappers and Nmap.
  • Resolved production issues, documented root cause analyses and updated tickets using BMC Remedy.
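
A sketch of the LVM steps behind the multipath/SAN work above; the device name, volume names, size, filesystem and mount point are hypothetical.

    # Create a physical volume on the new SAN LUN, build a volume group on it,
    # carve out a logical volume, then format and mount it
    pvcreate /dev/mapper/mpatha
    vgcreate vg_data /dev/mapper/mpatha
    lvcreate -L 200G -n lv_app vg_data
    mkfs.ext3 /dev/vg_data/lv_app
    mkdir -p /app && mount /dev/vg_data/lv_app /app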

Linux Administrator

Confidential

Responsibilities:

  • Handled day-to-day user access and permissions; installed and maintained Linux servers.
  • Installed CentOS on multiple servers using Preboot Execution Environment (PXE) boot and the Kickstart method; performed remote Linux installations using PXE boot.
  • Monitored system activity, performance and resource utilization.
  • Responsible for maintaining RAID groups and LUN assignments as per agreed design documents. Performed all system administration tasks such as cron jobs, package installation and patching.
  • Made extensive use of LVM, creating volume groups and logical volumes.
  • Performed RPM and YUM package installations, patching and other server management.
  • Performed scheduled backup and necessary restoration.
  • Configured Domain Name System (DNS) for hostname to IP resolution.
  • Troubleshot and fixed issues at the user, system and network level using various tools and utilities. Scheduled backup jobs by implementing a cron schedule during non-business hours (a cron sketch follows this list).
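
A sketch of the kind of cron-scheduled backup mentioned above; the script path, log file and 01:30 schedule are hypothetical.

    # Append a nightly 01:30 backup job to the current user's crontab (outside business hours)
    ( crontab -l 2>/dev/null; echo "30 1 * * * /usr/local/bin/nightly_backup.sh >> /var/log/nightly_backup.log 2>&1" ) | crontab -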
