Sr. Hadoop Administrator Resume
Austin, TX
SUMMARY
- Overall 8+ years of professional Information Technology experience in Hadoop and Linux administration activities such as installation, configuration and maintenance of systems/clusters.
- Extensive experience in Linux administration and big data technologies as a Hadoop Administrator.
- Hands-on experience with Hadoop clusters using Hortonworks (HDP), Cloudera (CDH3, CDH4), Oracle Big Data and YARN distribution platforms.
- Skilled in Apache Hadoop, MapReduce, Pig, Impala, Hive, HBase, ZooKeeper, Sqoop, Flume, Oozie, Kafka, Storm, Spark, JavaScript, and J2EE.
- Experience in deploying and managing multi-node development and production Hadoop clusters with different Hadoop components (Hive, Pig, Sqoop, Oozie, Flume, HCatalog, HBase, ZooKeeper) using Hortonworks Ambari.
- Good experience in creating various database objects like tables, stored procedures, functions, and triggers using SQL, PL/SQL and DB2.
- Used Apache Falcon to support Data Retention policies for HIVE/HDFS.
- Experience in configuring NameNode High Availability and NameNode Federation, with in-depth knowledge of ZooKeeper for cluster coordination services.
- Experience in designing, configuring and managing backup and disaster recovery for Hadoop data.
- Experience in administering Tableau and Greenplum database instances in various environments.
- Experience in administration of Kafka and Flume streaming using the Cloudera distribution.
- Hands-on experience in analyzing log files for Hadoop and ecosystem services and finding root causes.
- Extensive knowledge of Tableau in enterprise environments and Tableau administration experience including technical support, troubleshooting, reporting and monitoring of system usage.
- Experience in commissioning, decommissioning, balancing and managing nodes and tuning servers for optimal cluster performance (a decommissioning sketch follows this list).
- Experience in importing and exporting data using Sqoop between HDFS and relational database systems/mainframes.
- Worked on NoSQL databases including HBase, Cassandra and MongoDB.
- Designing and implementing security for Hadoop cluster with Kerberos secure authentication.
- Hands-on experience with Nagios and Ganglia for cluster monitoring.
- Experience in scheduling all Hadoop/Hive/Sqoop/HBase jobs using Oozie.
- Knowledge of data warehousing concepts, the Cognos 8 BI Suite and Business Objects.
- Experience in HDFS data storage and support for running MapReduce jobs.
- Experience in installing firmware upgrades and kernel patches, system configuration and performance tuning on Unix/Linux systems.
- Expert in Linux Performance monitoring, kernel tuning, Load balancing, health checks and maintaining compliance with specifications.
- Hands-on experience with ZooKeeper and ZKFC for managing and configuring NameNode failover scenarios.
- Team player with good communication and interpersonal skills and a goal-oriented approach to problem solving.
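For illustration, a minimal sketch of decommissioning and rebalancing a DataNode in shell; the hostname, exclude-file path and balancer threshold are placeholders rather than details from any specific cluster.

```bash
# Minimal sketch: decommissioning a DataNode (hostname and paths are illustrative).
# 1. Add the node to the exclude file referenced by dfs.hosts.exclude.
echo "datanode05.example.com" >> /etc/hadoop/conf/dfs.exclude

# 2. Tell the NameNode to re-read its include/exclude lists.
hdfs dfsadmin -refreshNodes

# 3. Watch the report until the node shows "Decommissioned", then stop its DataNode process.
hdfs dfsadmin -report | grep -A 2 "datanode05"

# 4. Rebalance the remaining DataNodes (threshold = allowed disk-usage spread in percent).
hdfs balancer -threshold 10
```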
TECHNICAL SKILLS
Big Data Technologies: Hadoop, HDFS, MapReduce, YARN, Hive, Pig, Sqoop, HBase, Flume, Oozie, Spark, ZooKeeper.
Hadoop Platforms: Hortonworks, Cloudera, Apache Hadoop.
Networking Concepts: OSI Model, TCP/IP, UDP, IPV4, DHCP & DNS.
Programming Languages: Java, UNIX shell scripting and Bash.
Operating Systems: Linux (CentOS, Ubuntu, Red Hat), Windows, UNIX and Mac OS-X.
Database/ETL: Oracle, Cassandra, DB2, MS-SQL Server, MySQL, MS-Access, HBase, MongoDB, Teradata.
XML Languages: XML, DTD, XML Schema, XPath.
Monitoring and Alerting: Nagios, Ganglia, Cloudera Manager, Ambari.
PROFESSIONAL EXPERIENCE
Sr. Hadoop Administrator
Confidential, Austin, TX
Responsibilities:
- Involved in the end-to-end process of Hadoop cluster setup with Cloudera Manager, covering installation, configuration and monitoring of the cluster.
- Responsible for cluster maintenance, commissioning and decommissioning DataNodes, cluster monitoring, troubleshooting, and managing and reviewing data backups and Hadoop log files.
- Monitoring systems and services, architecture design and implementation of Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures.
- Configured various property files like core-site.xml, hdfs-site.xml, mapred-site.xml based upon the job requirement.
- Built data platforms, pipelines and storage systems using Apache Kafka, Apache Storm and search technologies such as Elasticsearch.
- Involved in implementing high availability and automatic failover infrastructure, using ZooKeeper, to overcome the single point of failure at the NameNode (a configuration sketch follows this list).
- Deployed multi-node Hadoop clusters on the Amazon Web Services (AWS) EC2 public cloud and on private cloud infrastructure.
- Involved in cluster capacity planning, Hardware planning, Installation, Performance tuning of the Hadoop cluster.
- Used Apache Falcon for Data Retention policies for Hive/HDFS.
- Loading log data directly into HDFS using Flume.
- Exported analyzed data from HDFS using Sqoop for generating reports.
- Worked on Oozie workflow engine to run multiple Map Reduce jobs.
- Involved in analyzing system failures, identifying root causes and recommending courses of action; documented system processes and procedures for future reference.
- Monitored multiple Hadoop clusters environments using Ganglia and Nagios. Monitored workload, job performance and capacity planning using Cloudera Manager.
- Installed and configured Flume, Hive, Pig, Sqoop and Oozie on the Hadoop cluster.
- Involved in Installing and configuring Kerberos for the authentication of users and Hadoop daemons.
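For illustration, a minimal sketch of the NameNode HA and automatic-failover configuration referenced above, written as a shell snippet. The nameservice ID, hostnames, ports and file paths are placeholders, and a full setup would also need shared edits storage (JournalNodes) and a failover proxy provider.

```bash
# Minimal sketch: NameNode HA with automatic failover via ZooKeeper (ZKFC).
# All names, hosts and ports below are illustrative placeholders.
# These properties belong inside <configuration> in hdfs-site.xml
# (ha.zookeeper.quorum is usually kept in core-site.xml instead).
cat > /tmp/hdfs-ha-fragment.xml <<'EOF'
<property><name>dfs.nameservices</name><value>mycluster</value></property>
<property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
<property><name>dfs.namenode.rpc-address.mycluster.nn1</name><value>nn1.example.com:8020</value></property>
<property><name>dfs.namenode.rpc-address.mycluster.nn2</name><value>nn2.example.com:8020</value></property>
<property><name>dfs.ha.automatic-failover.enabled</name><value>true</value></property>
<property><name>ha.zookeeper.quorum</name><value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value></property>
EOF

# Initialize the failover state znode in ZooKeeper once, then check which NameNode is active.
hdfs zkfc -formatZK
hdfs haadmin -getServiceState nn1
```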
Environment: Hadoop, HDFS, Hive, Sqoop, Flume, Kafka, Zookeeper and HBase, Oracle 9i/10g/11g RAC with Solaris/Redhat, Exadata Machines X2/X3, Big Data Cloudera CDH Apache Hadoop, Toad, SQL plus, Oracle Enterprise Manager (OEM), RMAN, Shell Scripting, Golden Gate, Red Hat/Suse Linux, EM Cloud Control.
Hadoop Administrator
Confidential, Los Angeles, CA
Responsibilities:
- Working on the Hortonworks Hadoop (HDP 2.6.0.2.2) distribution, which manages services such as HDFS, MapReduce2, Hive, Pig, HBase, Sqoop, Flume, Spark, Ambari Metrics, ZooKeeper, Falcon and Oozie, across 4 clusters ranging from LAB and DEV to QA and PROD.
- Monitored Hadoop cluster connectivity and security through the Ambari monitoring system.
- Led the installation, configuration and deployment of product software on new edge nodes that connect to the Hadoop cluster for data acquisition.
- Responsible for cluster maintenance, monitoring, commissioning and decommissioning DataNodes, troubleshooting, and managing and reviewing data backups and log files.
- Day-to-day responsibilities included resolving developer issues, deploying and promoting code between environments, provisioning access for new users, and providing quick solutions to reduce impact while documenting them to prevent recurring issues.
- Interacted with HDP support, logged issues in the support portal and fixed them per the recommendations.
- Loaded data from the local file system into HDFS using Flume with a spooling-directory source (an agent configuration sketch follows this list).
- Retrieved data from HDFS into relational databases with Sqoop.
- Extended the functionality of Hive and Pig with custom UDFs and UDAFs.
- Worked on analyzing the Hadoop cluster and different big data analytics tools including Pig, the HBase database and Sqoop.
- Commissioned and decommissioned nodes on the Hadoop cluster on Red Hat Linux.
- Involved in loading data from LINUX file system to HDFS.
- Experience in configuring Storm to load data from MySQL to HBase using JMS.
- Integrated Kerberos into Hadoop to make the cluster more secure against unauthorized users.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
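For illustration, a minimal sketch of a Flume agent with a spooling-directory source landing files in HDFS; the agent name, directories and HDFS path are placeholders.

```bash
# Minimal sketch: Flume spooling-directory source -> memory channel -> HDFS sink.
# Paths and names are illustrative placeholders.
cat > /etc/flume/conf/spool-to-hdfs.properties <<'EOF'
agent1.sources  = spool-src
agent1.channels = mem-ch
agent1.sinks    = hdfs-sink

agent1.sources.spool-src.type     = spooldir
agent1.sources.spool-src.spoolDir = /data/incoming
agent1.sources.spool-src.channels = mem-ch

agent1.channels.mem-ch.type     = memory
agent1.channels.mem-ch.capacity = 10000

agent1.sinks.hdfs-sink.type                   = hdfs
agent1.sinks.hdfs-sink.channel                = mem-ch
agent1.sinks.hdfs-sink.hdfs.path              = hdfs:///landing/%Y-%m-%d
agent1.sinks.hdfs-sink.hdfs.fileType          = DataStream
agent1.sinks.hdfs-sink.hdfs.useLocalTimeStamp = true
EOF

# Run the agent in the foreground for testing.
flume-ng agent --name agent1 --conf /etc/flume/conf \
  --conf-file /etc/flume/conf/spool-to-hdfs.properties
```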
Environment: Hadoop, HDFS, MapReduce, HIVE, PIG, FLUME, OOZIE, Sqoop, Eclipse, Hortonworks, Ambari, RedHat, MYSQL.
Hadoop Administrator
Confidential - Houston, TX
Responsibilities:
- Installed/configured/maintained Apache Hadoop and Cloudera Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, ZooKeeper and Sqoop.
- Provided user and application support for the Hadoop infrastructure through the Remedy ticket management system.
- Installed and configured Hadoop cluster in Development, Testing and Production environments.
- Performed both major and minor upgrades to the existing CDH cluster.
- Installed various Hadoop ecosystem components and Hadoop daemons.
- Installed and configured Flume agents with well-defined sources, channels and sinks.
- Configured safety valves to create Active Directory filters to sync the LDAP directory for Hue.
- Analyzed the existing enterprise data warehouse setup and provided design and architecture suggestions for converting it to Hadoop using MapReduce, Hive, Sqoop and Pig Latin.
- Implemented NameNode metadata backup using NFS for high availability.
- Worked on importing and exporting data from Oracle and DB2 into HDFS and Hive using Sqoop (an import sketch follows this list).
- Wrote shell scripts for recurring day-to-day processes and automated them using crontab.
- Collected log data from web servers and ingested it into HDFS using Flume.
- Implemented FIFO scheduling on the JobTracker to share cluster resources among users' MapReduce jobs.
- Involved in data modeling sessions to develop models for Hive tables.
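For illustration, a minimal sketch of a Sqoop import from Oracle into a Hive table; the JDBC URL, credentials, table names, split column and mapper count are placeholders.

```bash
# Minimal sketch: Sqoop import from Oracle into Hive (all identifiers are illustrative).
sqoop import \
  --connect jdbc:oracle:thin:@//oradb.example.com:1521/ORCL \
  --username etl_user -P \
  --table SALES.ORDERS \
  --hive-import --hive-table staging.orders \
  --split-by ORDER_ID \
  --num-mappers 4
```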
Environment: Apache Hadoop, CDH4, Hive, Hue, Pig, Hbase, MapReduce, Sqoop, RedHat, CentOS and Flume.
Jr. Hadoop Administrator
Confidential
Responsibilities:
- Installed, configured and maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, ZooKeeper and Sqoop.
- Managed and scheduled jobs on Hadoop clusters using the Apache and Cloudera (CDH3, CDH4) distributions.
- Worked on importing and exporting data from Oracle and DB2 into HDFS using Sqoop.
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Implemented NameNode metadata backup using NFS for high availability.
- Implemented the Fair Scheduler on the JobTracker to share cluster resources among users' MapReduce jobs (a configuration sketch follows this list).
- Configured custom interceptors in Flume agents for replicating and multiplexing data into multiple sinks.
- Worked on NoSQL databases including HBase and MongoDB.
- Set up automated 24x7 monitoring and escalation infrastructure for the Hadoop cluster using Nagios and Ganglia.
- Extensive experience in data analysis using tools like SyncSort and HZ along with shell scripting and UNIX.
- Handled data exchange between HDFS, web applications and databases using Flume and Sqoop.
- Experienced in developing MapReduce programs using Apache Hadoop for working with big data.
- Expertise in working with different databases like Oracle, MS-SQL Server, PostgreSQL and MS Access 2012 along with exposure to Hibernate for mapping an object-oriented domain model to a traditional relational database.
- Good understanding of Scrum methodologies, Test Driven Development and continuous integration.
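For illustration, a minimal sketch of enabling the Fair Scheduler on a Hadoop 1.x JobTracker; pool names, limits and file paths are placeholders.

```bash
# Minimal sketch: Fair Scheduler on a Hadoop 1.x JobTracker (values are illustrative).
# Properties to add inside <configuration> in mapred-site.xml:
#   mapred.jobtracker.taskScheduler      = org.apache.hadoop.mapred.FairScheduler
#   mapred.fairscheduler.allocation.file = /etc/hadoop/conf/fair-scheduler.xml
cat > /etc/hadoop/conf/fair-scheduler.xml <<'EOF'
<allocations>
  <pool name="etl">
    <minMaps>10</minMaps>
    <minReduces>5</minReduces>
    <weight>2.0</weight>
  </pool>
  <poolMaxJobsDefault>20</poolMaxJobsDefault>
</allocations>
EOF
# Restart the JobTracker afterwards so the scheduler change takes effect.
```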
Environment: Cloudera, CDH 4.4, and CDH 3, Cloudera manager, Sqoop, Flume, Hive, HQL, Pig, RHEL, Cent OS, Oracle, MS-SQL, Zookeeper, Oozie, MapReduce, Apache Hadoop 1.x, PostgreSQL, Ganglia and Nagios.
Linux Administrator
Confidential
Responsibilities:
- Installing / configuring RHEL (Red Hat Enterprise Linux)
- Supporting WebLogic servers and deployments
- Supporting VMware installations
- Preparing Operational Run books
- Supporting Open Text CMS architecture
- Red Hat package management
- Tomcat / JBOSS installation and support
- Basic knowledge of WebSEAL and DB2
- Hands on experience on HP Data protector backup / restore utility
- Search engine support for Verity and Google Analytics
- Installing / supporting Jira / Confluence
- Supporting Linux environment for disaster recovery environment
- Installing and maintaining the Linux servers.
- Installed CentOS on multiple servers using PXE (Preboot Execution Environment) boot and the Kickstart method.
- Monitoring System Metrics and logs for any problems.
- Running crontab jobs to back up data.
- Maintaining the MySQL server and granting database access to required users.
- Creating and managing logical volumes with LVM (a sketch follows this list).
- Installing and updating packages using YUM.
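For illustration, a minimal sketch of creating and mounting an LVM logical volume; device, volume group, logical volume and mount-point names are placeholders.

```bash
# Minimal sketch: create and mount an LVM logical volume (names are illustrative).
pvcreate /dev/sdb                      # initialize the physical volume
vgcreate vg_data /dev/sdb              # create a volume group on it
lvcreate -L 50G -n lv_mysql vg_data    # carve out a 50 GB logical volume
mkfs.ext4 /dev/vg_data/lv_mysql        # create a filesystem
mkdir -p /var/lib/mysql
mount /dev/vg_data/lv_mysql /var/lib/mysql

# Grow it later without unmounting (online resize with ext4).
lvextend -L +20G /dev/vg_data/lv_mysql && resize2fs /dev/vg_data/lv_mysql
```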
Environment: MYSQL, PHP, DB2, APACHE, MYSQL WORKBENCH, TOAD, LINUX.
Linux/Systems Administrator
Confidential
Responsibilities:
- Experience with Linux internals, virtual machines, and open source tools/platforms.
- Improve system performance by working with the development team to analyze, identify and resolve issues quickly.
- Ensured data recoverability by implementing system and application level backups.
- Performed various configurations including networking and iptables, resolving hostnames and SSH keyless login.
- Managed CRONTAB jobs, batch processing and job scheduling.
- Networking service, performance, and resource monitoring.
- Managed disk file systems, server performance, user creation, file access permissions and RAID configurations.
- Supported pre-production and production support teams in the analysis of critical services and assisted with maintenance operations.
- Automate administration tasks through use of scripting and Job Scheduling using CRON.
- Performance tuning for high-transaction and high-volume data in a mission-critical environment.
- Set up alerting and thresholds for MySQL (uptime, users, replication information, and alerts based on different queries); a health-check sketch follows this list.
- Estimate MySQL database capacities; develop methods for monitoring database capacity and usage.
- Develop and optimize physical design of MySQL database systems.
- Supported development and testing environments to measure performance before deploying to production.
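For illustration, a minimal sketch of a cron-driven MySQL health check covering uptime and replication lag; the hostname, monitoring credentials (read from an environment variable) and alert address are placeholders.

```bash
#!/bin/bash
# Minimal sketch: MySQL health check for cron-based alerting (all identifiers are illustrative).
HOST="db1.example.com"; ALERT="oncall@example.com"

# Uptime in seconds; an empty result means the server is unreachable.
UPTIME=$(mysql -h "$HOST" -u monitor -p"$MONITOR_PW" -N -e "SHOW GLOBAL STATUS LIKE 'Uptime'" | awk '{print $2}')
if [ -z "$UPTIME" ]; then
  echo "MySQL on $HOST is not responding" | mail -s "MySQL DOWN: $HOST" "$ALERT"
  exit 1
fi

# Replication lag; alert if the slave is more than 300 seconds behind the master.
LAG=$(mysql -h "$HOST" -u monitor -p"$MONITOR_PW" -e "SHOW SLAVE STATUS\G" | awk '/Seconds_Behind_Master/ {print $2}')
if [ -n "$LAG" ] && [ "$LAG" != "NULL" ] && [ "$LAG" -gt 300 ]; then
  echo "Replication lag on $HOST is ${LAG}s" | mail -s "MySQL replication lag: $HOST" "$ALERT"
fi
```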
Environment: MYSQL 5.1.4, PHP, SHELL SCRIPT, APACHE, MYSQL WORKBENCH, TOAD, LINUX 5.0, 5.1.