Sr. Hadoop Administrator Resume
Charlotte, NC
SUMMARY
- 7+ years of professional IT experience, including over 5 years of proven experience in Hadoop administration: deploying, maintaining, monitoring, and upgrading Hadoop clusters using the Hortonworks (HDP) and Cloudera (CDH) distributions.
- Experience using Ambari and Cloudera Manager for installation and management of Hadoop clusters.
- Hands-on experience using Hadoop ecosystem components such as Hadoop MapReduce, HDFS, Spark, HBase, HDFS encryption, ZooKeeper, Oozie, Hive, Tez, and Sqoop.
- Experience importing and exporting data between databases such as SQL Server and Oracle and HDFS/Hive using Sqoop and Kafka.
- Good experience in multi-clustered environments and in setting up Hadoop distributions using underlying technologies such as the MapReduce framework, Pig, Oozie, HBase, Hive, Sqoop, Spark, Kafka, and the related Java APIs for capturing, storing, managing, integrating, and analyzing data.
- Experience in understanding the security requirements for Hadoop and integrating it with the Kerberos authentication infrastructure: KDC server setup, creating a realm/domain, and ongoing management (see the sketch after this summary).
- Strong knowledge in configuring High Availability for Hadoop Ecosystem.
- Knowledge of NoSQL databases such as HBase.
- Maintaining the availability of the cluster by troubleshooting the cluster issues and monitoring the cluster based on the alerts.
- Defining job flows in Hadoop environment using tools like Oozie for data scrubbing and processing.
- Good knowledge of deploying Hadoop clusters in public and private cloud environments such as Amazon AWS using EC2 and VPC.
- Experienced in integrating BI and analytical tools such as Tableau and Business Objects with Hadoop clusters.
- Experience in upgrading existing Hadoop clusters to the latest release.
- Secured Solr collections through Kerberos and Ranger implementation.
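The Kerberos integration summarized above typically starts with standing up a KDC and realm. A minimal sketch under assumed names; the realm EXAMPLE.COM and the admin principal are hypothetical placeholders:

```bash
# Install MIT Kerberos server packages on the KDC host (RHEL/CentOS)
yum install -y krb5-server krb5-libs krb5-workstation

# Create the KDC database for a hypothetical realm
kdb5_util create -s -r EXAMPLE.COM

# Add an administrative principal, then start and enable the KDC services
kadmin.local -q "addprinc admin/admin@EXAMPLE.COM"
systemctl start krb5kdc kadmin
systemctl enable krb5kdc kadmin
```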
TECHNICAL SKILLS
- Operating Systems: Red Hat Linux, CentOS, Windows
- Big Data: Hive, Pig, HBase, Sqoop, Spark, Flume, YARN, Ranger, Solr, ZooKeeper
- Hadoop Distributions: HDP (2.2, 2.3, 2.5), Cloudera (CDH)
- Databases: MySQL, SQL Server, Oracle, HBase (NoSQL)
- Scripting Languages: Unix shell scripting, Python
- Monitoring Tools: Ambari, Cloudera Manager, Nagios, Ganglia, Active Directory
- BI Tools: Cognos 8.4, Tableau 7.x/8.x Suite, ReportNet
- Methodologies: Agile, Waterfall Model
- Cloud Technologies: AWS, EC2, VPC, IAM, S3
PROFESSIONAL EXPERIENCE
Confidential, Charlotte, NC
Sr. Hadoop Administrator
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Experienced in Installing, Configuring, Monitoring, Maintaining and Troubleshooting Hadoop clusters.
- Experience in setting up Hortonworks cluster and installing all the ecosystem components through Ambari.
- Extensively involved in Cluster Capacity planning, Hardware planning, and Performance Tuning of the Hadoop Cluster.
- Extensively worked with the Hortonworks Distribution of Hadoop (HDP 2.3, 2.5, 2.6, and 3.0).
- Performed minor HDP upgrades (2.5 to 2.6) and major upgrades (2.6 to 3.0).
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Hands-on experience working with HDFS, MapReduce, Hive, Pig, Sqoop, Impala, Hadoop HA, YARN, and Hue.
- Monitored systems and services; handled architecture design and implementation of the Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures.
- Deployed high availability on the Hadoop cluster using Quorum Journal Nodes.
- Configured NameNode high availability and NameNode federation (see the hdfs-site.xml sketch below).
- Involved in collecting and aggregating large amounts of streaming data and per-node logs into HDFS using Flume.
- Monitored and controlled local file system disk space usage and log files, cleaning log files with automated scripts. Integrated tools such as SAS and Tableau with Hadoop so that users can pull data from HDFS and Hive.
- Loaded the dataset into Hive for ETL Operation
- Implemented the Capacity Scheduler to share cluster resources among users' MapReduce jobs (see the capacity-scheduler.xml fragment below).
- Enabled Kerberos authentication for the cluster and was responsible for generating keytab files for both user accounts and service accounts (see the kadmin sketch below).
- Worked on integration of HiveServer2 with BI tools.
- Managing and scheduling Jobs on a Hadoop cluster using Oozie.
- Used Sqoop to import and export data between RDBMS and HDFS (a representative import/export pair is sketched below).
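A representative Sqoop import/export pair of the kind referenced in the bullets above; the JDBC URL, credentials, table, and directory names are hypothetical placeholders:

```bash
# Import an RDBMS table into HDFS and register it as a Hive table
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost.example.com:1521/ORCL \
  --username etl_user -P \
  --table CUSTOMER_TXN \
  --target-dir /data/raw/customer_txn \
  --hive-import --hive-table analytics.customer_txn \
  --num-mappers 4

# Export analyzed data back to the RDBMS for BI reporting
sqoop export \
  --connect jdbc:oracle:thin:@//dbhost.example.com:1521/ORCL \
  --username etl_user -P \
  --table CUSTOMER_SUMMARY \
  --export-dir /data/curated/customer_summary \
  --num-mappers 4
```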
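A trimmed hdfs-site.xml sketch of NameNode HA with Quorum Journal Nodes; the nameservice name `mycluster` and host names are hypothetical:

```xml
<property><name>dfs.nameservices</name><value>mycluster</value></property>
<property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
<property><name>dfs.namenode.rpc-address.mycluster.nn1</name><value>nn1.example.com:8020</value></property>
<property><name>dfs.namenode.rpc-address.mycluster.nn2</name><value>nn2.example.com:8020</value></property>
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://jn1.example.com:8485;jn2.example.com:8485;jn3.example.com:8485/mycluster</value>
</property>
<property><name>dfs.ha.automatic-failover.enabled</name><value>true</value></property>
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```

Automatic failover additionally requires running ZKFC on the NameNodes and setting ha.zookeeper.quorum in core-site.xml.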
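A capacity-scheduler.xml fragment illustrating the kind of queue split described above; the queue names, percentages, and user lists are hypothetical:

```xml
<property><name>yarn.scheduler.capacity.root.queues</name><value>default,analytics</value></property>
<property><name>yarn.scheduler.capacity.root.default.capacity</name><value>60</value></property>
<property><name>yarn.scheduler.capacity.root.analytics.capacity</name><value>40</value></property>
<property><name>yarn.scheduler.capacity.root.analytics.maximum-capacity</name><value>60</value></property>
<property><name>yarn.scheduler.capacity.root.analytics.acl_submit_applications</name><value>etl_user,bi_user</value></property>
```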
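A minimal kadmin sketch of keytab generation for service and headless user accounts; the principal names, realm, and paths are hypothetical:

```bash
# Create a service principal and a headless user principal with random keys
kadmin.local -q "addprinc -randkey hive/node1.example.com@EXAMPLE.COM"
kadmin.local -q "addprinc -randkey svc_etl@EXAMPLE.COM"

# Export keytabs and lock down ownership and permissions
kadmin.local -q "xst -k /etc/security/keytabs/hive.service.keytab hive/node1.example.com@EXAMPLE.COM"
kadmin.local -q "xst -k /etc/security/keytabs/svc_etl.headless.keytab svc_etl@EXAMPLE.COM"
chown hive:hadoop /etc/security/keytabs/hive.service.keytab
chmod 400 /etc/security/keytabs/*.keytab

# Verify the keytab works
kinit -kt /etc/security/keytabs/svc_etl.headless.keytab svc_etl@EXAMPLE.COM
```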
Environment: Cloudera, Flume, Kafka, Pig, Oozie, Hive, Sqoop, Impala, Kerberos, UNIX Shell Scripts, Python, ZooKeeper, SQL, MapReduce.
Confidential, Charlotte, NC
Sr. Hadoop Consultant
Responsibilities:
- Hands-on experience installing, upgrading, and maintaining Hadoop clusters with Apache and Hortonworks Hadoop ecosystem components such as Sqoop, HBase, and MapReduce.
- Involved in managing Hadoop infrastructure, including adding capacity and load balancing.
- Involved in building out a Hadoop cluster of 70 nodes.
- Experience managing cluster resources by implementing the Fair and Capacity Schedulers with ACLs enabled.
- Developed data pipelines using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Involved in developing new workflow MapReduce jobs using the Oozie framework.
- Collected log data from web servers and integrated it into HDFS using Flume.
- Involved in Monitoring and support through Nagios and Ganglia.
- Enabled NameNode HA with automatic failover.
- Managed the configuration of the clusters to meet the needs of analysis, whether I/O-bound or CPU-bound.
- Wrote shell scripts to automate rolling day-to-day processes.
- Automated workflows using shell scripts to pull data from various databases into Hadoop (see the sketch after this list).
- Supported in setting up QA environment and updating configurations for implementing scripts with Pig, Hive and Sqoop.
- Configured Flume to transfer data from the web servers to HDFS (a minimal agent configuration is sketched below).
- Performed benchmark tests on Hadoop clusters and tuned the solution based on the test results (see the benchmark commands below).
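A simplified shell sketch of the automated database-to-Hadoop pull referenced above; the database, table list, credentials file, and paths are hypothetical placeholders:

```bash
#!/bin/bash
# Nightly Sqoop import of a fixed list of source tables into dated HDFS landing directories.
set -euo pipefail

RUN_DATE=$(date +%Y-%m-%d)
LOG="/var/log/hadoop-ingest/pull_${RUN_DATE}.log"
TABLES="orders customers payments"

for tbl in ${TABLES}; do
  echo "$(date) importing ${tbl}" >> "${LOG}"
  sqoop import \
    --connect jdbc:mysql://dbhost.example.com:3306/sales \
    --username ingest_user --password-file /user/ingest/.dbpass \
    --table "${tbl}" \
    --target-dir "/data/landing/${tbl}/dt=${RUN_DATE}" \
    --num-mappers 2 >> "${LOG}" 2>&1
done
```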
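A minimal Flume agent configuration of the kind referenced above for shipping web-server logs to HDFS; the agent name, log path, and HDFS path are hypothetical:

```properties
# Agent "web": tail the access log and write it to HDFS
web.sources  = r1
web.channels = c1
web.sinks    = k1

web.sources.r1.type = exec
web.sources.r1.command = tail -F /var/log/httpd/access_log
web.sources.r1.channels = c1

web.channels.c1.type = memory
web.channels.c1.capacity = 10000

web.sinks.k1.type = hdfs
web.sinks.k1.channel = c1
web.sinks.k1.hdfs.path = /data/weblogs/%Y/%m/%d
web.sinks.k1.hdfs.fileType = DataStream
web.sinks.k1.hdfs.rollInterval = 300
web.sinks.k1.hdfs.useLocalTimeStamp = true
```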
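Typical benchmark commands for the testing described above; the exact jar names and paths vary by Hadoop distribution and version, so treat these as illustrative:

```bash
# HDFS throughput with TestDFSIO (write, then read), 10 files of ~1000 MB each
hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient-tests.jar \
  TestDFSIO -write -nrFiles 10 -fileSize 1000
hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient-tests.jar \
  TestDFSIO -read -nrFiles 10 -fileSize 1000

# End-to-end sort benchmark: TeraGen (100M rows x 100 bytes = ~10 GB) then TeraSort
hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar \
  teragen 100000000 /benchmarks/terainput
hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar \
  terasort /benchmarks/terainput /benchmarks/teraoutput
```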
Environment: RHEL, CentOS, Ubuntu, Apache Hadoop, Hortonworks, HDFS, MapReduce, HBase, Shell Scripts, Nagios, Ganglia, AWS.
Confidential, Saline MI
Hadoop Administrator
Responsibilities:
- Involved in setting up Cloudera clusters for development and Production Environments.
- Installed and configured Hive, Pig, Sqoop, Flume, Cloudera Manager, and Oozie on the Hadoop cluster.
- Involved in the major upgrade of the development environment from CDH 4 to CDH 5.
- Worked with big data developers, designers, and scientists to troubleshoot MapReduce and Hive jobs and tune them for high performance.
- Proactively involved in ongoing maintenance, support, and improvements in Hadoop cluster.
- Designed and developed ETL workflows using Oozie, including automating the extraction of data from different databases into HDFS using Sqoop scripts (a trimmed workflow is sketched below).
- Worked on upgrading the Hadoop cluster through both minor and major version upgrades.
- Installing, monitoring, managing, troubleshooting, applying patches in different environments such as Development, Sandbox and Production environments.
- Worked with different file formats and compression techniques in Hadoop.
- Experience working with various Hadoop components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce (see the Oozie workflow sketch below).
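A trimmed Oozie workflow sketch for the Sqoop-based extraction step described above; the connection string, table, paths, and parameter names are hypothetical:

```xml
<workflow-app name="daily-db-extract" xmlns="uri:oozie:workflow:0.5">
  <start to="extract"/>
  <action name="extract">
    <sqoop xmlns="uri:oozie:sqoop-action:0.4">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <command>import --connect jdbc:oracle:thin:@//dbhost.example.com:1521/ORCL --username etl_user --password-file /user/etl/.dbpass --table ORDERS --target-dir /data/landing/orders/${wf:id()}</command>
    </sqoop>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Sqoop extract failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
  </kill>
  <end name="end"/>
</workflow-app>
```

An Oozie coordinator can then schedule this workflow on a daily frequency.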
Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Oozie, Sqoop, Ambari, Nagios, Ganglia, Red Hat Linux.