Hadoop Admin Resume
El Segundo, CA
SUMMARY
- Over 8 years of experience in all aspects of System Administration, with special focus on applications, security, backups, and troubleshooting.
- 4 years of experience as a Linux administrator and more than 3 years of experience administering Hadoop and Big Data technologies (HDFS, MapReduce, Pig, Hive, Oozie, Flume, Sqoop and ZooKeeper).
- Worked extensively on Hadoop HDFS architecture and the MapReduce framework.
- Experienced in installing, configuring, deploying and managing Hadoop clusters using the Apache, Cloudera, MapR and Hortonworks distributions.
- Solid experience in Pig and Hive administration and development.
- Supported capacity planning for the production Hadoop cluster.
- Experience in administering the Linux systems to deploy Hadoop cluster and monitoring the cluster using Nagios and Ganglia.
- Experience in benchmarking and in performing backup and disaster recovery of NameNode metadata and sensitive data residing on the cluster.
- Familiar with importing and exporting data between HDFS and RDBMSs such as MySQL, Oracle and Teradata using Sqoop, including fast loaders and connectors (a Sqoop sketch follows this summary).
- Developed ETL processes to load data from multiple sources into HDFS using Flume and Sqoop, performed structural modifications using MapReduce and Hive, and analyzed data using visualization/reporting tools.
- Strong knowledge of work automation using Chef and Puppet.
- Experience in performance tuning for MapReduce, Hive and Sqoop.
- Expertise in configuring NameNode High Availability and NameNode Federation.
- Strong knowledge of writing Oozie workflows and job controllers for automating shell, Hive and Sqoop jobs.
- Worked with system engineering team to plan and deploy Hadoop hardware and software environments.
- Worked on disaster recovery with the Hadoop cluster.
- Extensively used MapReduce and Pig to build a data transformation framework.
- Hands-on experience deploying Hadoop clusters in public and private cloud environments such as Amazon AWS and OpenStack.
- Good experience in deploying and managing multi-node development, testing and production Hadoop clusters with different Hadoop components (Hive, Pig, Sqoop, Oozie, Flume, HCatalog, HBase, ZooKeeper).
- Strong knowledge of newer technologies such as Spark, Kafka and Storm to keep up with industry developments.
- Experience in managing and reviewing Hadoop log files.
- Excellent working knowledge of HBase and of data pre-processing using Flume.
- Hands-on experience productionizing Hadoop applications: administration, configuration management, debugging and performance tuning.
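For illustration, a minimal sketch of the kind of Sqoop import referenced above; the host, database, table, credentials and paths are placeholders, not details from any actual engagement:

    # Import an RDBMS table into HDFS with Sqoop (all values hypothetical).
    sqoop import \
      --connect jdbc:mysql://db.example.com:3306/sales \
      --username etl_user -P \
      --table orders \
      --target-dir /user/etl/orders \
      --split-by order_id \
      --num-mappers 4

The -P flag prompts for the password instead of exposing it on the command line, and --split-by names a column Sqoop can use to partition the work across parallel mappers.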
PROFESSIONAL EXPERIENCE
Confidential, El Segundo, CA
Hadoop Admin
Responsibilities:
- Worked with the Linux administration team to prepare and configure the systems to support Hadoop deployment.
- Worked on configuration tasks including networking and iptables, hostname resolution, user accounts and file permissions, HTTP, FTP and passwordless SSH login.
- Performed authentication and authorization using the Kerberos protocol.
- Benchmarked the Hadoop cluster using different benchmarking mechanisms.
- Optimized Hadoop cluster performance by tuning hdfs-site.xml, core-site.xml and mapred-site.xml.
- Deployed a network file system for NameNode metadata backup.
- Performed a POC on cluster backup using DistCp, Cloudera Manager BDR and parallel ingestion.
- Configured and deployed the Hive metastore using MySQL and the Thrift server.
- Monitored clusters using Nagios and Ganglia.
- Installed and configured Spark ecosystem components (Spark SQL, Spark Streaming) and built BI reports in Tableau with Spark using Spark SQL.
- Implemented Spark jobs using Scala and Spark SQL for faster testing and processing of data.
- Performed commissioning and decommissioning of DataNodes (a decommissioning sketch follows this section).
- Used the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
- Upgraded the Hadoop cluster from CDH3 to CDH4 and from CDH4 to CDH5.1.
- Implemented automatic NameNode failover using ZooKeeper and the ZooKeeper Failover Controller.
- Developed Pig scripts for handling the raw data for analysis.
- Audited and maintained clusters and built new clusters for testing purposes using Cloudera Manager.
- Deployed and configured Flume agents to stream log events into HDFS for analysis.
- Configured Oozie for workflow automation and coordination.
- Wrote custom monitoring scripts for Nagios to monitor the daemons and the cluster status (a sample check follows this section).
- Wrote custom shell scripts to automate redundant tasks on the cluster.
- Installed and monitored a MongoDB cluster.
- Upgraded MongoDB from 2.4 to 2.6 and implemented its new security features.
- Worked with BI teams in generating the reports and designing ETL workflows on Pentaho.
- Involved in loading data from UNIX file system to HDFS.
- Defined time-based Oozie workflows to copy data from different sources to Hive as it becomes available.
- Configured Ganglia, including installing the gmond and gmetad daemons, which collect the metrics running across the distributed cluster and present them in real-time dynamic web pages, further helping with debugging and maintenance.
Environment: MAPREDUCE, HDFS, HIVE, PIG, FLUME, SQOOP, UNIX SHELL SCRIPTING, NAGIOS, KERBEROS.
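A sketch of the decommissioning flow mentioned above, assuming dfs.hosts.exclude points at /etc/hadoop/conf/dfs.exclude; the path and hostname are placeholders:

    # 1. Add the host to the HDFS exclude file.
    echo "datanode07.example.com" >> /etc/hadoop/conf/dfs.exclude
    # 2. Ask the NameNode to re-read its include/exclude lists.
    hdfs dfsadmin -refreshNodes
    # 3. Watch the node drain until it reports "Decommissioned".
    hdfs dfsadmin -report | grep -A 3 datanode07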
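A minimal example of the kind of custom Nagios check referred to above; Nagios reads the plugin's exit code (0 = OK, 1 = WARNING, 2 = CRITICAL), so a check can be a few lines of shell:

    #!/bin/bash
    # Hypothetical Nagios plugin: is the DataNode daemon running on this host?
    if pgrep -f org.apache.hadoop.hdfs.server.datanode.DataNode > /dev/null; then
        echo "OK: DataNode process is running"
        exit 0
    else
        echo "CRITICAL: DataNode process not found"
        exit 2
    fi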
Confidential, Detroit, MI
Hadoop Engineer
Responsibilities:
- Installed NameNode, Secondary NameNode, YARN (ResourceManager, NodeManager, ApplicationMaster) and DataNode services using Cloudera.
- Installed and configured Hortonworks Ambari for easy management of an existing Hadoop cluster; installed and configured HDP.
- Installed and configured a multi-node, fully distributed Hadoop cluster with a large number of nodes.
- Provided Hadoop, OS, Hardware optimizations.
- Set up the machines with network control, static IPs, disabled firewalls and swap memory.
- Identified performance bottlenecks by analyzing the existing Hadoop cluster and provided performance tuning accordingly.
- Regular Commissioning and Decommissioning of nodes depending upon the amount of data.
- Installed and configured the Hadoop components HDFS, Hive and HBase.
- Communicating with the development teams and attending daily meetings.
- Addressing and Troubleshooting issues on a daily basis.
- Working with data delivery teams to set up new Hadoop users; this includes setting up Linux users, setting up Kerberos principals and testing HDFS and Hive access (an onboarding sketch follows this section).
- Cluster maintenance as well as creation and removal of nodes.
- Monitor Hadoop cluster connectivity and security.
- Manage and review Hadoop log files.
- Configured the cluster to achieve the optimal results by fine tuning the cluster.
- Dumped data from one cluster to another using DistCp and automated the dumping procedure with shell scripts.
- Designed shell scripts for backing up important metadata and rotating the logs on a monthly basis (a backup sketch follows this section).
- Implemented the open-source monitoring tool Ganglia for monitoring the various services across the cluster.
- Testing, evaluation and troubleshooting of different NoSQL database systems and cluster configurations to ensure high-availability in various crash scenarios.
- Performance tuning and stress-testing of NoSQL database environments in order to ensure acceptable database performance in production mode.
- Designed the cluster so that only one Secondary NameNode daemon could run at any given time.
- Implemented commissioning and decommissioning of DataNodes, killing unresponsive TaskTrackers and dealing with blacklisted TaskTrackers.
- Dumped data from HDFS to a MySQL database and vice versa using Sqoop.
- Provided the necessary support to the ETL team when required.
- Integrated Nagios in the Hadoop cluster for alerts.
- Performed both major and minor upgrades to the existing cluster, as well as rollbacks to the previous version.
Environment: LINUX, HDFS, MAPREDUCE, KDC, NAGIOS, GANGLIA, OOZIE, SQOOP, CLOUDERA MANAGER.
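A sketch of the user-onboarding steps described above; the username, realm and paths are placeholders:

    # Local Linux account for the new user.
    useradd -m analyst1
    # Kerberos principal (prompts for the new principal's password).
    kadmin.local -q "addprinc analyst1@EXAMPLE.COM"
    # HDFS home directory, owned by the new user.
    # (On a Kerberized cluster, kinit with an appropriate principal first.)
    sudo -u hdfs hdfs dfs -mkdir /user/analyst1
    sudo -u hdfs hdfs dfs -chown analyst1:analyst1 /user/analyst1
    # Smoke-test HDFS access as the new user.
    sudo -u analyst1 hdfs dfs -ls /user/analyst1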
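A condensed sketch of the backup automation mentioned above, combining a DistCp copy with NameNode fsimage snapshots; the cluster hostnames and backup paths are assumptions:

    #!/bin/bash
    # Hypothetical backup job: mirror a dataset to a second cluster and
    # keep a rolling set of NameNode fsimage snapshots.
    set -e
    DATE=$(date +%Y%m%d)
    # -update copies only files that changed since the last run.
    hadoop distcp -update hdfs://prod-nn:8020/data hdfs://dr-nn:8020/data
    # Download the latest fsimage from the active NameNode into a dated dir.
    mkdir -p /backup/namenode/$DATE
    hdfs dfsadmin -fetchImage /backup/namenode/$DATE
    # Retain 30 days of snapshots.
    find /backup/namenode -mindepth 1 -maxdepth 1 -type d -mtime +30 -exec rm -rf {} +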
Confidential, Wallingford, CT
Linux/Database System Engineer
Responsibilities:
- Installed and configured RHEL 2.1/3/4, Solaris 10 and Red Hat on Intel and AMD hardware.
- Configured hands-free installation using Kickstart and PXE.
- Participated in upgrading and migrating RHEL 2.x and 3.x systems to 4.x and 5.0.
- Monitored Linux and Solaris servers using tools such as vmstat, sar, top and free.
- Performed virtualization on different types of servers.
- Experienced in administration of MVS.
- Installed, updated and erased packages to provide services, using rpm and up2date.
- Partitioned disks using Disk Druid and fdisk with RAID options, and set up multipathing with PowerPath on SAN devices.
- Configured the scheduling of tasks using cron (a sample crontab follows this section).
- Suggested system upgrades and handled their planning and implementation.
- Enhanced performance using ITIL practices.
- Designed and coordinated the system architecture.
- Provided administration and troubleshot user-related problems.
- Worked in user management, process management, software management and daemon management.
- Worked with NIST in providing security.
- Constant Monitoring of System Performance and managed file systems.
- Coordinating with vendors to solve the hardware and software related issues.
- Implemented security by disabling unused services and using iptables and TCP wrappers (a hardening sketch follows this section).
- Created new file systems, mounted file systems, monitored free space and disk usage, located files and checked and cleared log files.
- Working with NOC (Network Operations Center).
- Performed administrative tasks such as system startup/shutdown, backup strategy, printing, documentation, user management, security and network management.
- Provided responsive off-hours support in a 24/7 environment, ensuring maximum availability of all servers and applications.
Environment: LINUX, NETWORKING, SECURITY, USER MANAGEMENT.
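Illustrative crontab entries for the kind of scheduled tasks described above; the script paths and schedules are placeholders:

    # Fields: minute hour day-of-month month day-of-week command
    0 2 * * * /usr/local/bin/nightly_backup.sh          # 02:00 backup every night
    */5 * * * * /usr/local/bin/check_disk_space.sh      # disk check every 5 minutes
    0 0 1 * * /usr/sbin/logrotate /etc/logrotate.conf   # monthly log rotation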
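A minimal hardening sketch in the spirit of the security bullet above, for a RHEL-era system; the allowed ports and the service name are examples only:

    # Default-deny inbound; allow loopback and established sessions.
    iptables -P INPUT DROP
    iptables -A INPUT -i lo -j ACCEPT
    iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
    # Allow only the services this host actually provides.
    iptables -A INPUT -p tcp --dport 22 -j ACCEPT   # SSH
    iptables -A INPUT -p tcp --dport 80 -j ACCEPT   # HTTP
    service iptables save   # persist rules across reboots (RHEL)
    # Disable an unused service so it no longer starts at boot.
    chkconfig cups off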
Confidential
Linux Administrator
Responsibilities:
- Administration of RHEL 4.x and 5.x, which includes installation, testing, tuning, upgrading and loading patches, and troubleshooting both physical and virtual server issues.
- Created and cloned Linux virtual machines and templates using VMware Virtual Client 3.5 and migrated servers between ESX hosts and Xen servers.
- Installed RedHat Linux using Kickstart and applied security policies for hardening the servers based on company policies.
- Managed routine system backups and scheduled jobs, including disabling and enabling cron jobs and enabling system and network logging on servers, for maintenance, performance tuning and testing.
- Worked and performed data-center operations including rack mounting, cabling.
- Installed and verified that all AIX/Linux patches or updates are applied to the servers.
- Installed and administered RedHat using Xen and KVM-based hypervisors.
- Performed RPM and YUM package installations, patching and other server management.
- Configured multipathing, added SAN storage and created physical volumes, volume groups and logical volumes (an LVM sketch follows this section).
- Installed and configured Apache and supported it on Linux production servers.
- Troubleshot Linux network and security issues and captured packets using tools such as iptables, firewalls, TCP wrappers and Nmap.
- Involved in testing of products and documentation of necessary changes required in this environment.
Environment: LINUX, CITRIX XEN SERVER 5.0
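A sketch of the SAN-to-LVM flow mentioned above; the multipath device name, volume names, size and mount point are placeholders:

    pvcreate /dev/mapper/mpath0           # physical volume on the SAN LUN
    vgcreate vg_data /dev/mapper/mpath0   # volume group on top of it
    lvcreate -L 100G -n lv_app vg_data    # carve out a 100 GB logical volume
    mkfs.ext3 /dev/vg_data/lv_app         # create the file system
    mkdir -p /opt/app
    mount /dev/vg_data/lv_app /opt/app    # mount it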