Hadoop Administrator Resume
San Jose, CA
SUMMARY
- 7+ years of software development experience, including 3+ years with Big Data technologies in the Hadoop ecosystem (Hive, MapReduce, Pig, Sqoop, Oozie, Flume) and 4 years in Linux administration.
- In-depth expertise in deploying large-scale, multi-cluster Hadoop implementations.
- Strong knowledge of Hadoop HDFS architecture and the MapReduce framework.
- Experience working with Hive data warehouse systems, extending the Hive library to query data in custom formats.
- Worked on high-volume streaming data processing using Flume.
- Worked on developing ETL processes to load data from multiple sources into HDFS using Flume and Sqoop, perform structural transformations using MapReduce and Hive, and analyze the data with visualization/reporting tools.
- Installation and configuration of NameNode High Availability (NN HA) using ZooKeeper (see the configuration sketch after this summary).
- Knowledge of deploying scalable Hadoop clusters in cloud environments such as Amazon AWS and Rackspace, and of using Amazon S3/S3N as the underlying file system for Hadoop.
- Experience in developing applications using web technologies.
- Good knowledge of Hadoop 2.0 (YARN).
- Experienced in providing security to Hadoop cluster with Kerberos.
- Hands-on experience in Unix/Linux environments, including software installations/upgrades, shell scripting for job automation, and other maintenance activities.
- Working experience with large-scale Hadoop environments, including planning, design, installation, configuration, and setup. Provided guidelines for best practices, benchmarking, and performance tuning.
- Worked with application teams to install OS level updates, patches and version upgrades required for Hadoop cluster environments.
- Experience working 24/7 on-call schedules to support large-scale big data implementations, handling troubleshooting and issue resolution.
- Possess excellent analytical, problem-solving, organizational, and interpersonal skills. Self-motivated, with the ability to move cleanly from theoretical to implementation thinking.
- Sound knowledge of hardware, software, networking, applications, and change-management processes.
- Self-starter and team player, capable of working independently and motivating a team of professionals.
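The following is a minimal, illustrative sketch of the kind of NameNode HA configuration referenced above; the nameservice name, host names, and the CDH-style service command are assumptions, and the shared-edits and ZooKeeper quorum settings are omitted for brevity.

```bash
# Hypothetical hdfs-site.xml fragment for automatic NameNode failover
# (nameservice "mycluster" and host names are placeholders, not a real cluster).
cat > /tmp/hdfs-ha-fragment.xml <<'EOF'
<property><name>dfs.nameservices</name><value>mycluster</value></property>
<property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
<property><name>dfs.namenode.rpc-address.mycluster.nn1</name><value>nn1.example.com:8020</value></property>
<property><name>dfs.namenode.rpc-address.mycluster.nn2</name><value>nn2.example.com:8020</value></property>
<property><name>dfs.ha.automatic-failover.enabled</name><value>true</value></property>
EOF

# Initialize the failover state znode in ZooKeeper, then start the
# ZKFailoverController on each NameNode host (CDH-style init script assumed).
hdfs zkfc -formatZK
service hadoop-hdfs-zkfc start
```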
TECHNICAL SKILLS
Hadoop Ecosystem: Hive, Pig, ZooKeeper, MapReduce, Sqoop, Ganglia, Nagios, Cloudera Manager, Flume, Oozie
Operating Systems: Linux (CentOS & Red Hat), MS Windows NT/Server 2003, MS-DOS, UNIX
Languages: C, Java, SQL, UNIX Shell Scripting
Databases: Oracle, DB2, MS SQL Server, MySQL
PROFESSIONAL EXPERIENCE
Confidential - San Jose, CA
Hadoop Administrator
Responsibilities:
- Used CFEngine to configure the Linux servers and developed Python scripts for managing critical files and automating routine tasks.
- Part of a team responsible for upgrading one of the clusters from CDH3 to CDH4 with High Availability.
- Responsible for performance tuning and benchmarking Hadoop cluster.
- Managing and monitoring the Hadoop cluster and platform infrastructure, recovering from node failures and troubleshooting common Hadoop cluster issues.
- Worked closely with Big Data Architect and Development team for implementing and administering new components like Hue and Oozie.
- Documenting environments, processes, and procedures for reference and/or training purposes.
- Helping the development team in solving their Hadoop issues and tweaking configuration for better performance and reliability.
- Improved Hadoop monitoring by replacing Ganglia with Zabbix to bring it in line with the organization's current monitoring standards.
- Created run books for alerts with a severity level of average or higher.
- Created iptables rules to filter inbound traffic to the Linux systems, as shown in the sketch below.
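A minimal sketch of the kind of inbound filtering described in the last bullet; the SSH port and admin subnet are illustrative assumptions, not the actual production rules.

```bash
# Keep established connections, permit loopback traffic, allow SSH only from
# an assumed admin subnet, and drop everything else (values are placeholders).
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -i lo -j ACCEPT
iptables -A INPUT -p tcp --dport 22 -s 10.0.0.0/24 -j ACCEPT
iptables -A INPUT -j DROP
service iptables save   # persist rules on CentOS/RHEL-style systems
```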
Environment: Python, CFEngine, Zabbix API, Flume, Zookeeper, Pig, Hive, Oozie, Hue, GitHub
Confidential - San Diego, CA
Hadoop Administrator
Responsibilities:
- Proactively monitored systems and services; handled architecture design and implementation of Hadoop deployments, configuration management, and backup & disaster recovery (DR) systems.
- Analyzed system failures, identified root causes, and recommended remediation actions. Documented issues and solutions in a log for future reference.
- Worked with systems engineering team for planning new Hadoop environment deployments, expansion of existing Hadoop clusters.
- Monitored multiple Hadoop cluster environments using Ganglia and Nagios. Monitored workload and job performance, and performed capacity planning, using Cloudera Manager.
- Worked with application teams to install OS level updates, patches and version upgrades required for Hadoop cluster environments.
- Installed and configured Hive, Pig, Sqoop and Oozie on the Hadoop cluster.
- Installed and configured NameNode High Availability (NN HA) using ZooKeeper.
- Created local YUM repository for installing and updating packages.
- Analyzed web log data using HiveQL to extract the number of unique visitors per day, page views, visit duration, and the most-purchased products on the website.
- Exported the analyzed data to relational databases using Sqoop so the BI team could visualize it and generate reports (see the export sketch after this list).
- Experienced in Linux Administration tasks like IP Management (IP Addressing, Ethernet Bonding, Static IP and Subnetting).
- Responsible for daily system administration of Linux and Windows servers. Also implemented HTTPD, NFS, SAN and NAS on Linux Servers.
- Created UNIX shell scripts to watch for 'null' (empty) files and trigger jobs accordingly (see the sketch below); also had good working knowledge of the Python scripting language.
- Integrated Oozie with the rest of the Hadoop stack, supporting several types of Hadoop jobs out of the box (such as Java MapReduce, Streaming MapReduce, Pig, Hive, Sqoop, and DistCp) as well as system-specific jobs (such as Java programs and shell scripts).
- Worked on disaster recovery for the Hadoop cluster.
- Involved in installing and configuring Kerberos for the authentication of users and Hadoop daemons.
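A hedged example of the Sqoop export step mentioned above; the JDBC URL, table, and HDFS directory are illustrative placeholders rather than the actual values.

```bash
# Export aggregated web-log results from HDFS to a relational table for BI
# reporting (connection string, table, and export directory are made up).
sqoop export \
  --connect jdbc:mysql://reportdb.example.com/analytics \
  --username bi_user -P \
  --table daily_page_views \
  --export-dir /user/hive/warehouse/weblogs_daily \
  --input-fields-terminated-by '\001'
```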
Environment: Hadoop CDH4, Hive, Sqoop, Pig, Oozie, ZooKeeper, Ganglia, Cloudera Manager, Java, Linux (CentOS/Red Hat).
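A minimal sketch of the 'null' file watcher mentioned above, assuming zero-byte files in a landing directory should trigger a downstream job; the paths and the handler script are hypothetical.

```bash
#!/bin/bash
# Scan an assumed landing directory for zero-byte ("null") files and trigger
# a downstream job for each one (directory, log, and handler are placeholders).
LANDING_DIR=/data/landing
for f in "$LANDING_DIR"/*; do
    [ -e "$f" ] || continue
    if [ ! -s "$f" ]; then
        echo "$(date): empty file detected: $f" >> /var/log/null_file_watch.log
        /opt/jobs/handle_null_file.sh "$f"   # hypothetical trigger script
    fi
done
```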
Confidential - San Bruno, CA
Hadoop Engineer
Responsibilities:
- This use case involved handling large volumes of user behavioral data, both historical and current, streamed from social platforms and web servers.
- The datasets are partitioned by date/timestamp and loaded into Hive tables to improve query performance, and compression codecs are used to increase storage efficiency (see the sketch after this list).
- Worked with Hive to perform analysis on the streamed log data, which includes user activities over social platforms and web sites to improve relevance.
- Developed custom MapReduce programs to perform parallel pattern searches on the datasets to extract unique insights and deliver the most targeted audiences for advertisers.
- Used Sqoop to integrate databases with Hadoop to import/export data.
- Worked with DevOps team in Hadoop cluster planning and installation.
- Worked on Hadoop cluster maintenance including data and metadata backups, file system checks, commissioning and decommissioning nodes and upgrades.
- Experience working with HDFS, file system designed for storing very large files with streaming data access patterns, running on clusters of commodity hardware.
- Used Flume to collect, aggregate, and store web log data from sources such as web servers, mobile devices, and network devices, and pushed it to HDFS (see the agent sketch below).
- Wrote Java MapReduce programs to transform log data into a structured form and identify user location, age group, and spending patterns.
- Implemented a proof of concept (POC) with Cassandra and MySQL databases.
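A minimal sketch of a date-partitioned Hive table with compressed output, along the lines described in this role; the table, columns, source table, and codec choice are assumptions for illustration.

```bash
# Create a date-partitioned table and load one day's data with compression
# enabled (table, columns, and source are illustrative placeholders).
hive -e "
CREATE TABLE IF NOT EXISTS user_events (user_id STRING, action STRING, url STRING)
PARTITIONED BY (event_date STRING)
STORED AS SEQUENCEFILE;

SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;

INSERT OVERWRITE TABLE user_events PARTITION (event_date='2013-06-01')
SELECT user_id, action, url FROM raw_user_events WHERE dt='2013-06-01';
"
```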
Environment: Hadoop CDH4, Hive, MapReduce, Flume, Sqoop.
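A hedged sketch of a Flume agent tailing a web server log into HDFS, similar to the pipeline described above; the agent name, log path, HDFS directory, and channel sizing are assumptions.

```bash
# Minimal Flume agent: an exec source tails an access log, a memory channel
# buffers events, and an HDFS sink writes date-bucketed files (names are placeholders).
cat > /etc/flume-ng/conf/weblog-agent.conf <<'EOF'
agent1.sources  = weblog
agent1.channels = mem
agent1.sinks    = hdfs-out

agent1.sources.weblog.type = exec
agent1.sources.weblog.command = tail -F /var/log/httpd/access_log
agent1.sources.weblog.channels = mem

agent1.channels.mem.type = memory
agent1.channels.mem.capacity = 10000

agent1.sinks.hdfs-out.type = hdfs
agent1.sinks.hdfs-out.hdfs.path = /flume/weblogs/%Y-%m-%d
agent1.sinks.hdfs-out.hdfs.useLocalTimeStamp = true
agent1.sinks.hdfs-out.channel = mem
EOF

flume-ng agent -n agent1 -c /etc/flume-ng/conf -f /etc/flume-ng/conf/weblog-agent.conf
```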
Confidential - San Jose, CA
Linux Administrator
Responsibilities:
- Supporting the core Linux environment. This includes the administration, design, documentation and troubleshooting of the core Linux server infrastructure, communications network, and software environment.
- Responsible for monitoring critical resource usage (CPU, RAM, and hard disks) and reviewing security logs.
- Developing and maintaining automation systems to improve Linux administration efficiency.
- Participating in network management and maintenance. Configured local and network file sharing.
- Within the Unix/Linux environment, responsible for provisioning new servers, monitoring, automation, imaging, disaster recovery (planning and testing), and scripting backup/recovery of bare-metal hardware and virtual machines.
- Managed firewalls, load balancers, and other networking equipment in a production environment.
- Maintained data files, directory structure, monitor systems configuration and ensure data integrity.
- Integrated monitoring, auditing, and alert systems for databases with existing monitoring infrastructure.
- Responsible for setting up FTP, DHCP, and DNS servers and for Logical Volume Management (see the LVM sketch after this list).
- Managed network fine-tuning, upgrades, and enhancements to optimize network performance, availability, stability, and security.
- Provided authentication to users for Oracle databases.
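A minimal example of the Logical Volume Management setup referenced above; the device, volume group, size, and mount point are placeholders.

```bash
# Carve a logical volume out of a new disk and mount it persistently
# (device, names, size, and mount point are illustrative).
pvcreate /dev/sdb1
vgcreate vg_data /dev/sdb1
lvcreate -L 50G -n lv_app vg_data
mkfs.ext4 /dev/vg_data/lv_app
mkdir -p /app
mount /dev/vg_data/lv_app /app
echo "/dev/vg_data/lv_app /app ext4 defaults 0 0" >> /etc/fstab
```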
Environment: Linux Systems (CentOS & Red Hat), Oracle, DHCP, DNS, Logical Volume Manager, User Management.
Confidential
Sys Admin/DBA
Responsibilities:
- Installing and maintaining Red Hat and CentOS Linux servers.
- Installed CentOS on multiple servers using PXE (Preboot Execution Environment) boot and the Kickstart method.
- Responsible for performance tuning and troubleshooting Linux servers.
- Running cron jobs to back up data (see the crontab sketch after this list).
- Adding, removing, updating user account information, resetting passwords, etc.
- Maintaining the SQL Server databases and granting authentication to the required users.
- Applied Operating System updates, patches and configuration changes.
- Used different methodologies to increase the performance and reliability of the IT infrastructure.
- Responsible for System performance tuning and successfully engineered a virtual private network (VPN).
- Installing, configuring and maintaining SAN and NAS storage.
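A hedged example of the cron-driven backup mentioned above; the schedule, source directory, backup destination, and retention window are assumptions.

```bash
# crontab entry (crontab -e): nightly 01:30 archive of an assumed data
# directory, pruning copies older than 14 days (paths and retention are made up).
30 1 * * * tar czf /backup/data_$(date +\%F).tar.gz /srv/data && find /backup -name 'data_*.tar.gz' -mtime +14 -delete
```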
Confidential
Linux System/Oracle Application DBA
Responsibilities:
- Worked on Installation, configuration and upgrading of Oracle server software and related products.
- Responsible for installation, administration and maintenance of Linux servers.
- Established and maintained sound backup and recovery policies and procedures.
- Handled database design and implementation.
- Implemented and maintained database security (creating and maintaining users and roles, assigning privileges); see the sketch after this list.
- Performed database tuning and performance monitoring.
- Planned for growth and changes (capacity planning).
- Worked as part of a team and provided 24x7 support when required.
- Performed general technical troubleshooting on trouble tickets to bring them to resolution.
- Interfaced with Confidential for technical support.
- Patch Management and Version Control.
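A minimal sketch of the user/role privilege work described above, run through sqlplus as a DBA; the role, user, password, and tablespace quota are placeholders.

```bash
# Create a role, grant it basic privileges, and assign it to a new account
# (names, password, and quota are illustrative; assumes an OS-authenticated DBA).
sqlplus -S / as sysdba <<'EOF'
CREATE ROLE app_read_role;
GRANT CREATE SESSION TO app_read_role;
GRANT SELECT ANY TABLE TO app_read_role;

CREATE USER report_user IDENTIFIED BY "ChangeMe#1"
  DEFAULT TABLESPACE users QUOTA 100M ON users;
GRANT app_read_role TO report_user;
EOF
```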