
Hadoop Administrator Resume


San Jose, CA

SUMMARY

  • 7+ years of software development experience, including 3+ years in Big Data technologies across the Hadoop ecosystem (Hive, MapReduce, Pig, Sqoop, Oozie, Flume) and 4 years in Linux administration.
  • In-depth expertise in deploying large-scale, multi-cluster Hadoop implementations.
  • Strong knowledge of Hadoop HDFS architecture and the MapReduce framework.
  • Experience working with Hive data warehouse systems and extending the Hive library to query data in custom formats.
  • Worked on high-volume streaming data processing using Flume.
  • Worked on developing ETL processes to load data from multiple sources into HDFS using Flume and Sqoop, perform structural modifications using MapReduce and Hive, and analyze data using visualization/reporting tools.
  • Installation and configuration of NameNode High Availability (NN HA) using ZooKeeper.
  • Knowledge of deploying scalable Hadoop clusters in cloud environments such as Amazon AWS and Rackspace, with Amazon S3/S3N as the underlying file system for Hadoop.
  • Experience in developing applications using web technologies.
  • Good knowledge of Hadoop 2.0 (YARN).
  • Experienced in securing Hadoop clusters with Kerberos.
  • Hands-on experience in Unix/Linux environments, including software installations/upgrades, shell scripting for job automation, and other maintenance activities.
  • Working experience with large-scale Hadoop environments, including planning, design, installation, configuration, and setup. Provided guidelines for best practices, benchmarking, and performance tuning.
  • Worked with application teams to install OS level updates, patches and version upgrades required for Hadoop cluster environments.
  • Experience with 24/7 on-call rotations supporting large-scale big data implementations, covering troubleshooting and issue resolution.
  • Excellent analytical, problem-solving, organizational, and interpersonal skills. Self-motivated, with the ability to move cleanly from theory to implementation.
  • Solid knowledge of hardware, software, networking, applications, and change management processes.
  • Self-starter and team player, capable of working independently and motivating a team of professionals.

TECHNICAL SKILLS

Hadoop Ecosystem: Hive, Pig, ZooKeeper, MapReduce, Sqoop, Ganglia, Nagios, Cloudera Manager, Flume, Oozie

Operating Systems: Linux (CentOS & Red Hat), MS Windows NT/Server 2003, MS-DOS, UNIX

Languages: C, Java, SQL, UNIX Shell Scripting

Databases: Oracle, DB2, MS SQL Server, MySQL

PROFESSIONAL EXPERIENCE

Confidential - San Jose, CA

Hadoop Administrator

Responsibilities:

  • Used CFEngine to configure the Linux boxes and developed Python scripts to manage important files and automate routine tasks.
  • Part of a team responsible for upgrading one of the clusters from CDH3 to CDH4 with High Availability.
  • Responsible for performance tuning and benchmarking Hadoop cluster.
  • Managing and monitoring the Hadoop cluster and platform infrastructure, recovering from node failures and troubleshooting common Hadoop cluster issues.
  • Worked closely with Big Data Architect and Development team for implementing and administering new components like Hue and Oozie.
  • Documenting environments, processes, and procedures for reference and/or training purposes.
  • Helping the development team in solving their Hadoop issues and tweaking configuration for better performance and reliability.
  • Improved Hadoop monitoring by replacing Ganglia with Zabbix, bringing it in line with the organization's current monitoring standards.
  • Created run books for alerts with a severity level of average or higher.
  • Created iptables rules to filter traffic coming into the Linux systems (see the iptables sketch after this list).
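
A minimal sketch of the kind of host-level iptables filtering described above; the allowed ports, ordering, and default-deny policy are illustrative assumptions rather than the production rule set:

    # Allow established sessions and loopback first so remote shells survive
    iptables -F INPUT
    iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
    iptables -A INPUT -i lo -j ACCEPT
    # Permit SSH and the NameNode web UI (ports are assumptions)
    iptables -A INPUT -p tcp --dport 22 -j ACCEPT
    iptables -A INPUT -p tcp --dport 50070 -j ACCEPT
    # Default-deny everything else and persist across reboots (RHEL/CentOS)
    iptables -P INPUT DROP
    service iptables save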

Environment: Python, CFEngine, Zabbix API, Flume, Zookeeper, Pig, Hive, Oozie, Hue, GitHub

Confidential - San Diego, CA

Hadoop Administrator

Responsibilities:

  • Proactively monitored systems and services; worked on architecture design and implementation of Hadoop deployments, configuration management, and backup and disaster recovery systems.
  • Involved in analyzing system failures, identifying root causes, and recommending remediation actions. Documented an issue log with solutions for future reference.
  • Worked with systems engineering team for planning new Hadoop environment deployments, expansion of existing Hadoop clusters.
  • Monitored multiple Hadoop cluster environments using Ganglia and Nagios. Monitored workload, job performance, and capacity planning using Cloudera Manager.
  • Worked with application teams to install OS level updates, patches and version upgrades required for Hadoop cluster environments.
  • Installed and configured Hive, Pig, Sqoop and Oozie on the Hadoop cluster.
  • Installation and configuration of NameNode High Availability (NN HA) using ZooKeeper.
  • Created a local YUM repository for installing and updating packages (see the YUM repository sketch after this list).
  • Analyzed web log data using HiveQL to extract unique visitors per day, page views, visit duration, and the most-purchased products on the website (see the HiveQL sketch after this list).
  • Exported the analyzed data to relational databases using Sqoop for visualization and report generation by the BI team (see the Sqoop export sketch after this list).
  • Experienced in Linux administration tasks such as IP management (IP addressing, Ethernet bonding, static IPs, and subnetting).
  • Responsible for daily system administration of Linux and Windows servers; also implemented HTTPD, NFS, SAN, and NAS on Linux servers.
  • Created UNIX shell scripts to watch for 'null' files and trigger jobs accordingly; also have good knowledge of Python scripting.
  • Integrated Oozie with the rest of the Hadoop stack, supporting several types of Hadoop jobs out of the box (Java MapReduce, streaming MapReduce, Pig, Hive, Sqoop, and DistCp) as well as system-specific jobs (Java programs and shell scripts).
  • Worked on disaster recovery for the Hadoop cluster.
  • Involved in installing and configuring Kerberos for authentication of users and Hadoop daemons.
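
A minimal sketch of standing up the local YUM repository; the paths, hostname, and the choice of httpd to publish it are assumptions:

    # Install repo tooling and a web server to publish the repository
    yum install -y createrepo httpd
    # Stage RPMs and generate repository metadata (paths are assumptions)
    mkdir -p /var/www/html/localrepo
    cp /tmp/rpms/*.rpm /var/www/html/localrepo/
    createrepo /var/www/html/localrepo
    service httpd start
    # Point clients at the repo (hostname is an assumption)
    cat > /etc/yum.repos.d/local.repo <<'EOF'
    [localrepo]
    name=Local Repository
    baseurl=http://repohost.example.com/localrepo
    enabled=1
    gpgcheck=0
    EOF
    yum clean all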
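
A sketch of the kind of HiveQL behind the web log metrics; the table and column names (web_logs, visitor_id, visit_duration_sec) are hypothetical:

    # Daily unique visitors and average visit duration from a web log table
    hive -e "
      SELECT log_date, COUNT(DISTINCT visitor_id) AS unique_visitors
      FROM web_logs
      GROUP BY log_date;
      SELECT log_date, AVG(visit_duration_sec) AS avg_visit_duration
      FROM web_logs
      GROUP BY log_date;"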
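
A sketch of a Sqoop export pushing aggregated Hive output to a relational database for the BI team; the JDBC URL, credentials, and table names are assumptions:

    # Export aggregated results from HDFS to MySQL for reporting;
    # '\001' is Hive's default field delimiter
    sqoop export \
      --connect jdbc:mysql://dbhost.example.com/reports \
      --username bi_user -P \
      --table daily_visitors \
      --export-dir /user/hive/warehouse/daily_visitors \
      --input-fields-terminated-by '\001'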

Environment: Hadoop CDH4, Hive, Sqoop, Pig, Oozie, ZooKeeper, Ganglia, Cloudera Manager, Java, Linux (CentOS/Red Hat).

Confidential - San Bruno, CA

Hadoop Engineer

Responsibilities:

  • This use case involved handling large amounts of user behavioral data: historical data from the past alongside current data streaming in from social platforms and web servers.
  • The datasets were partitioned by date/timestamp and loaded into Hive tables to improve query performance, and compression codecs were used to increase storage efficiency (see the Hive partitioning sketch after this list).
  • Worked with Hive to analyze the streamed log data, which includes user activity on social platforms and websites, to improve relevance.
  • Developed custom MapReduce programs to perform parallel pattern searches on the datasets, extracting unique insights and delivering the most targeted audiences for advertisers.
  • Used Sqoop to integrate databases with Hadoop to import/export data.
  • Worked with DevOps team in Hadoop cluster planning and installation.
  • Worked on Hadoop cluster maintenance including data and metadata backups, file system checks, commissioning and decommissioning nodes and upgrades.
  • Experience working with HDFS, a file system designed for storing very large files with streaming data access patterns, running on clusters of commodity hardware.
  • Used Flume to collect, aggregate, and store web log data from different sources such as web servers and mobile and network devices, and pushed it to HDFS (see the Flume agent sketch after this list).
  • Developed Java MapReduce programs on log data to transform it into a structured form and identify user location, age group, and spending patterns.
  • Implemented a proof of concept (POC) on Cassandra and MySQL databases.
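
A minimal sketch of the date-partitioned, compressed Hive layout described above; the table names, columns, and choice of Snappy-compressed SequenceFiles are assumptions:

    # Create a date-partitioned table and write one day's data compressed
    hive -e "
      SET hive.exec.compress.output=true;
      SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
      CREATE TABLE IF NOT EXISTS user_activity (user_id STRING, action STRING, source STRING)
      PARTITIONED BY (event_date STRING)
      STORED AS SEQUENCEFILE;
      INSERT OVERWRITE TABLE user_activity PARTITION (event_date='2014-01-15')
      SELECT user_id, action, source FROM raw_activity WHERE dt='2014-01-15';"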
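
A sketch of a Flume NG agent tailing a web server log into date-partitioned HDFS directories; the agent name, paths, and channel sizing are assumptions:

    # Agent definition (names and paths are assumptions)
    cat > /etc/flume-ng/conf/weblog-agent.conf <<'EOF'
    agent.sources = weblogs
    agent.channels = mem
    agent.sinks = hdfs-sink

    # Tail the web server access log
    agent.sources.weblogs.type = exec
    agent.sources.weblogs.command = tail -F /var/log/httpd/access_log
    agent.sources.weblogs.channels = mem

    agent.channels.mem.type = memory
    agent.channels.mem.capacity = 10000

    # Write events into date-partitioned HDFS directories
    agent.sinks.hdfs-sink.type = hdfs
    agent.sinks.hdfs-sink.hdfs.path = hdfs:///data/weblogs/%Y-%m-%d
    agent.sinks.hdfs-sink.hdfs.fileType = DataStream
    agent.sinks.hdfs-sink.hdfs.useLocalTimeStamp = true
    agent.sinks.hdfs-sink.channel = mem
    EOF
    # Start the agent named in the config
    flume-ng agent -n agent -c /etc/flume-ng/conf -f /etc/flume-ng/conf/weblog-agent.conf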

Environment: Hadoop CDH4, Hive, MapReduce, Flume, Sqoop.

Confidential - San Jose, CA

Linux Administrator

Responsibilities:

  • Supported the core Linux environment, including the administration, design, documentation, and troubleshooting of the core Linux server infrastructure, communications network, and software environment.
  • Responsible for monitoring critical resource usage (CPU, RAM, and hard disks) and for monitoring security logs.
  • Developed and maintained automation systems to improve Linux administration efficiency.
  • Participated in network management and maintenance. Configured local and network file sharing.
  • Within the Unix/Linux environment, responsible for provisioning new servers, monitoring, automation, imaging, disaster recovery (planning and testing), and scripting backup/recovery of bare-metal hardware and virtual machines.
  • Managed firewalls, load balancers, and other networking equipment in a production environment.
  • Maintained data files and directory structure, monitored systems configuration, and ensured data integrity.
  • Integrated monitoring, auditing, and alert systems for databases with existing monitoring infrastructure.
  • Responsible for setting up FTP, DHCP, and DNS servers and Logical Volume Management (see the LVM sketch after this list).
  • Managed network fine-tuning, upgrades, and enhancements to optimize network performance, availability, stability, and security.
  • Provided authentication to users for Oracle databases.

Environment: Linux Systems (CentOS & Red Hat), Oracle, DHCP, DNS, Logical Volume Manager, User Management.

Confidential

Sys Admin/DBA

Responsibilities:

  • Installed and maintained Red Hat and CentOS Linux servers.
  • Installed CentOS on multiple servers using PXE (Preboot Execution Environment) boot and the kickstart method.
  • Responsible for performance tuning and troubleshooting Linux servers.
  • Ran cron jobs to back up data (see the cron backup sketch after this list).
  • Added, removed, and updated user account information, reset passwords, etc.
  • Maintained the SQL server and granted database access to required users.
  • Applied Operating System updates, patches and configuration changes.
  • Used different methodologies to increase the performance and reliability of the IT infrastructure.
  • Responsible for system performance tuning; successfully engineered a virtual private network (VPN).
  • Installed, configured, and maintained SAN and NAS storage.
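
A minimal sketch of the cron-driven backup described above; the schedule, script path, and retention window are assumptions:

    # crontab entry (crontab -e): nightly at 2 AM, logging output
    #   0 2 * * * /usr/local/bin/nightly_backup.sh >> /var/log/backup.log 2>&1
    cat > /usr/local/bin/nightly_backup.sh <<'EOF'
    #!/bin/bash
    # Archive /etc and /home with a date stamp, then prune copies older than 14 days
    DEST=/backup
    tar -czf "$DEST/system-$(date +%F).tar.gz" /etc /home
    find "$DEST" -name 'system-*.tar.gz' -mtime +14 -delete
    EOF
    chmod +x /usr/local/bin/nightly_backup.sh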

Confidential

Linux System/Oracle Application DBA

Responsibilities:

  • Worked on Installation, configuration and upgrading of Oracle server software and related products.
  • Responsible for installation, administration and maintenance of Linux servers.
  • Established and maintained sound backup and recovery policies and procedures.
  • Took care of database design and implementation.
  • Implemented and maintained database security (created and maintained users and roles, assigned privileges); see the sketch after this list.
  • Performed database tuning and performance monitoring.
  • Planned growth and changes (capacity planning).
  • Worked as part of a team and provided 24x7 support when required.
  • Performed general technical troubleshooting on trouble tickets to bring them to resolution.
  • Interfaced with Confidential for technical support.
  • Performed patch management and version control.
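
A minimal sketch of the user/role security work described above, assuming Oracle's SQL*Plus; the role, user, and object names are hypothetical:

    # Create a least-privilege role and a user that inherits it
    sqlplus / as sysdba <<'EOF'
    CREATE ROLE app_read;
    GRANT CREATE SESSION TO app_read;
    GRANT SELECT ON hr.employees TO app_read;
    CREATE USER report_user IDENTIFIED BY "ChangeMe#1";
    GRANT app_read TO report_user;
    EOF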
