Sr. Cloudera/ Hadoop Administrator Resume
St Petersburg, FL
SUMMARY:
- Over 6 years of professional IT experience, including 5+ years of hands-on Hadoop administration using Cloudera and Hortonworks; working environment includes MapReduce, HDFS, HBase, ZooKeeper, Oozie, Hive, Sqoop, Pig, Spark, and Flume.
- Experience with Hadoop distributions including Cloudera (CDH) and Hortonworks (HDP).
- Experience implementing High Availability for HDFS, YARN, Hive, and HBase.
- Knowledge of job workflow scheduling and monitoring tools such as Oozie and ZooKeeper.
- Experience in configuring AWS EC2, S3, VPC, RDS, CloudFormation, CloudTrail, IAM, and SNS, as well as Azure.
- Worked on Hadoop security and access controls (Kerberos, Active directory, LDAP).
- Experience in performance tuning of Map Reduce, Pig jobs and Hive queries.
- Experience in deploying Hadoop cluster on Public and Private Cloud Environment like Amazon AWS.
- Worked on NoSQL databases including HBase and MongoDB.
- Experience in migrating on-premises workloads to Windows Azure using Azure Site Recovery and Azure backups.
- Strong knowledge of configuring High Availability for NameNode, DataNode, HBase, Hive, and ResourceManager.
- Experienced in Talend for big data integration.
- Maintained user accounts (IAM) and the RDS, Route 53, VPC, RDB, DynamoDB, SES, SQS, and SNS services in the AWS cloud.
- Good understanding of deploying Hadoop clusters using automated Puppet scripts.
- Experience in designing and implementation of secure Hadoop cluster using MIT and AD Kerberos, Apache Sentry, Knox and Ranger.
- Monitored Hadoop clusters using tools such as Nagios, Ganglia, Ambari, and Cloudera Manager.
- Experienced in loading data from different data sources (Teradata and DB2) into HDFS using Sqoop and loading it into partitioned Hive tables; a representative command is sketched after this summary.
- Experience in administration of Kafka and Flume streaming using the Cloudera distribution.
- Hands on experience on Unix/Linux environments, which included software installations/upgrades, shell scripting for job automation and other maintenance activities.
- Troubleshooting, Security, Backup, Disaster Recovery, Performance Monitoring on Linux systems.
- Worked with the Linux administration team to prepare and configure the systems to support Hadoop deployment.
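A minimal sketch of the kind of Sqoop load into a partitioned Hive table referenced above, assuming the Teradata JDBC connector is installed; the host, database, table, and partition values are hypothetical placeholders.

    # Hypothetical example: pull one day of Teradata data into a partitioned Hive table
    sqoop import \
      --connect jdbc:teradata://teradata-host/DATABASE=sales \
      --username etl_user -P \
      --table DAILY_TXN \
      --hive-import \
      --hive-table analytics.daily_txn \
      --hive-partition-key load_date \
      --hive-partition-value 2017-06-01 \
      --num-mappers 8

In practice the password would typically come from --password-file rather than an interactive -P prompt.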
PROFESSIONAL EXPERIENCE:
Confidential, St. Petersburg, FL
Sr. Cloudera/ Hadoop Administrator
Responsibilities:
- Responsible for installing, configuring, supporting, and managing Cloudera Hadoop clusters.
- Analyzed development activities done by the Big Data team and provided support.
- Managed and reviewed Hadoop Log files as a part of administration for troubleshooting purposes.
- Troubleshot and resolved Spark issues.
- Responsible for troubleshooting issues in the execution of MapReduce jobs by inspecting and reviewing log files.
- Created MapR DB tables and involved in loading data into those tables.
- Worked on placing an analytics sandbox on Azure.
- Maintained the operations, installation, and configuration of 100+ node clusters with the MapR distribution.
- Installed and configured Cloudera CDH 5.7.0 on RHEL 5.7/6.2 64-bit operating systems and was responsible for maintaining the cluster.
- Used Sqoop to pull data from the Netezza database and push it into Hive.
- Helped resolve storage volume failures on Hadoop clusters.
- Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
- Experience with Cloudera Navigator and Unravel Data for auditing Hadoop access.
- Installed and configured Cloudera Navigator using Cloudera Manager.
- Configured rack awareness and performed JDK upgrades using Cloudera Manager.
- Installed and configured Sentry for Hive authorization using Cloudera Manager.
- Handled importing of data from various data sources, performed transformations using Hive, MapReduce, loaded data into HDFS and Extracted the data from MySQL into HDFS using Sqoop.
- Experience in setup, configuration and management of security for Hadoop clusters using Kerberos and integration with LDAP/AD at an Enterprise level.
- Used Hive: created Hive tables and loaded data from the local file system into HDFS.
- Set up a test cluster with new services such as Grafana, integrating it with Kafka and HBase for intensive monitoring.
- Worked with Spark for improving performance and optimization of the existing algorithms in Hadoop using Spark Context, Spark-SQL, Data Frames, and Pair RDD's.
- Responsible for copying 400 TB of HDFS snapshot data from the Production cluster to the DR cluster.
- Responsible for copying a 210 TB HBase table from Production to the DR cluster; the snapshot/DistCp approach is sketched after this section.
- Created SOLR collection and replicas for data indexing.
- Worked on data ingestion through Kafka.
- Worked with Netezza integration with AZURE data lake.
- Administered 150+ Hadoop servers, handling Java version updates, the latest security patches, OS-related upgrades, and hardware-related outages.
- Upgraded Ambari from 2.2.0 to 2.4.2.0 and SOLR from 4.10.3 to Ambari Infra (SOLR 5.5.2).
- Implemented Cluster Security using Kerberos and HDFS ACLs.
- Involved in cluster-level security: perimeter security (authentication via Cloudera Manager, Active Directory, and Kerberos), access (authorization and permissions via Sentry), visibility (audit and lineage via Navigator), and data (encryption at rest).
- Experience in setting up Test, QA, and Prod environments. Wrote Pig Latin scripts to analyze and process data.
- Involved in loading data from the UNIX file system to HDFS. Led root cause analysis (RCA) efforts for high-severity incidents.
- Investigated the root cause of Critical and L2/L2 tickets.
Environment: Cloudera, Apache Hadoop, HDFS, YARN, Cloudera Manager, Sqoop, Flume, Oozie, Zookeeper, Kerberos, Sentry, AWS, Pig, Spark, Hive, Docker, Hbase, Python, LDAP/AD, NOSQL, Golden Gate, EM Cloud Control, Exadata Machines X2/X3, Toad, MySQL, PostgreSQL, Teradata.
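A sketch of the snapshot-based Production-to-DR copy described above, shown for both HDFS data and an HBase table; the cluster addresses, paths, table, and snapshot names are hypothetical, and the source path is assumed to be snapshottable.

    # Take an immutable HDFS snapshot on the production path and copy it with DistCp
    hdfs dfsadmin -allowSnapshot /data/warehouse
    hdfs dfs -createSnapshot /data/warehouse dr-sync-20170601
    hadoop distcp -update \
      hdfs://prod-nn:8020/data/warehouse/.snapshot/dr-sync-20170601 \
      hdfs://dr-nn:8020/data/warehouse

    # Snapshot an HBase table and export it to the DR cluster
    echo "snapshot 'txn_table', 'txn_snap_20170601'" | hbase shell -n
    hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
      -snapshot txn_snap_20170601 \
      -copy-to hdfs://dr-nn:8020/hbase -mappers 16

Copying from the read-only .snapshot directory keeps the source consistent while the long-running DistCp job executes.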
Confidential, San Jose, CA
Sr. Hadoop Administrator
Responsibilities:
- Responsible for Cluster maintenance, Adding and removing cluster nodes, Cluster Monitoring and Troubleshooting, Manage and review data backups, Manage and review Hadoop log files.
- Installed single-node machines for stakeholders with the Hortonworks HDP distribution.
- Worked on a live 110-node Hadoop cluster running Hortonworks Data Platform (HDP 2.2).
- Played a key role, along with other teams in the company, in deciding the hardware configuration for the cluster.
- Performed regular maintenance, commissioning and decommissioning nodes as disk failures occurred, using the MapR File System.
- Implemented MapR token based security.
- Resolved submitted tickets and P1 issues, troubleshot errors, and documented resolutions.
- Added new DataNodes when needed and ran the balancer.
- Introduced SmartSense to obtain optimization recommendations from the vendor and to help troubleshoot issues.
- Configured Kerberos and installed the MIT ticketing system; a keytab provisioning sketch follows this section.
- Secured the Hadoop cluster from unauthorized access by Kerberos, LDAP integration and TLS for data transfer among the cluster nodes.
- Installing and configuring CDAP, an ETL tool in the development and Production clusters.
- Integrated CDAP with Ambari for easy operations monitoring and management.
- Used CDAP to monitor the datasets and workflows to ensure smooth data flow.
- Connected to HDFS using third-party tools such as Teradata SQL Assistant via the ODBC driver.
- Responsible for building scalable distributed data solutions using Hadoop.
- Migrated HiveQL queries on structured data to Spark SQL to improve performance.
Environment: Over 110 nodes, approximately 5 PB of data, Hortonworks, HA NameNode, MapReduce, YARN, Hive, Impala, Pig, Sqoop, Flume, Oozie, Hue, White Elephant, Ganglia, Nagios, HBase, Cassandra, Storm, Cobbler.
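A sketch of how service principals and keytabs might be provisioned on the MIT KDC mentioned above; the realm, hostname, and keytab path are hypothetical.

    # On the MIT KDC: create a service principal for a worker node and export its keytab
    kadmin.local -q "addprinc -randkey hdfs/worker01.example.com@EXAMPLE.COM"
    kadmin.local -q "xst -k /etc/security/keytabs/hdfs.service.keytab hdfs/worker01.example.com@EXAMPLE.COM"

    # On the worker node: lock the keytab down and verify a ticket can be obtained
    chown hdfs:hadoop /etc/security/keytabs/hdfs.service.keytab
    chmod 400 /etc/security/keytabs/hdfs.service.keytab
    kinit -kt /etc/security/keytabs/hdfs.service.keytab hdfs/worker01.example.com@EXAMPLE.COM
    klist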
Confidential, Hillsboro, OR
Hadoop Administrator
Responsibilities:
- Used Cloudera distribution for Hadoop ecosystem. Converted MapReduce jobs into Spark transformations and actions using Spark.
- Configured Spark Streaming to receive real-time data from Kafka, store the streamed data in HDFS, and persist it to databases such as HBase.
- Load data from various data sources into HDFS using Flume.
- Worked on Cloudera to analyze data present on top of HDFS.
- Restored and migrated Cloudera clusters using Cloudera Manager tools.
- Worked on large sets of structured, semi-structured and unstructured data.
- Experience installing NameNode High Availability when deploying Hadoop and understanding how queries run in Hadoop.
- Used Sqoop to import and export data between HDFS and RDBMS.
- Involved in creating Hive tables, loading them with data, and writing Hive queries, which run internally as MapReduce jobs; a beeline sketch follows this section.
- Participated in design and development of scalable and custom Hadoop solutions as per dynamic data needs.
- Experience in setup, configuration and management of security for Hadoop clusters using Kerberos.
- Handled the imports and exports of data onto HDFS using Flume and Sqoop.
- Migrated the NameNode from one server to another.
- Hive backup and Disaster recovery using Cloudera backup tools.
- HDFS data backup and Disaster recovery using Cloudera BDR.
- Supported technical team members in management and review of Hadoop log files and data backups.
- Formulated procedures for installation of Hadoop patches, updates and version upgrades.
Environment: HDFS, Cloudera, MapReduce, JSP, JavaBeans, Pig, Hive, Sqoop, Flume, Oozie, HBase, Kafka, Impala, Spark Streaming, Storm, YARN, Eclipse, Unix Shell Scripting.
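A sketch of the Hive table creation and load mentioned above, run through beeline against HiveServer2; the JDBC URL, user, table, and HDFS path are hypothetical.

    # Create a Hive table and load a file already staged in HDFS
    beeline -u "jdbc:hive2://hiveserver2-host:10000/default" -n etl_user \
      -e "CREATE TABLE IF NOT EXISTS web_logs (ip STRING, ts STRING, url STRING, status INT)
          ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE;" \
      -e "LOAD DATA INPATH '/landing/web_logs/2016-05-01.tsv' INTO TABLE web_logs;"

Note that LOAD DATA INPATH moves the file into the Hive warehouse directory rather than copying it.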
Confidential - Philadelphia, PA
Hadoop Admin
Responsibilities:
- The project plan was to build and set up a Big Data environment, support operations, and effectively manage and monitor the Hadoop cluster through Cloudera Manager.
- Installed, Configured and Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
- Involved in the end-to-end process of Hadoop cluster setup: installation, configuration, and monitoring of the Hadoop cluster in Cloudera.
- Installed and configured a CDH 5.3 cluster using Cloudera Manager.
- Built applications using the Maven and Jenkins integration tools.
- Involved in data modeling of the Cassandra schema.
- Successfully upgraded Hortonworks Hadoop distribution stack from 2.3.4 to 2.5.
- Implemented Commissioning and Decommissioning of data nodes, killing the unresponsive task tracker and dealing with blacklisted task trackers.
- Managed and reviewed Hadoop Log files.
- Prepared documentation about the Support and Maintenance work to be followed in Talend.
- Worked on Installing and configuring the HDP Hortonworks 2.x Clusters in Dev and Production Environments.
- Experience with Cloudera Navigator and Unravel data for Auditing Hadoop access.
- Involved in creating a Spark cluster in HDInsight by creating Azure compute resources with Spark installed and configured.
- Worked with ETL tools such as Talend to simplify MapReduce jobs from the front end.
- Installing, configuring and administering Jenkins Continuous Integration (CI) tool on Linux machines along with adding/updating plugins such as SVN, GIT, Maven, ANT, Chef, Ansible etc.
- Used Kafka for building real-time data pipelines between clusters.
- Installed and configured Hive with remote Metastore using MySQL.
- Optimized the Cassandra cluster by making changes in Cassandra properties and Linux (Red Hat) OS configurations.
- Developed shell scripts and set up cron jobs for monitoring and automated data backup on the Cassandra cluster; a backup script sketch follows this section.
- Proactively monitored systems and services and implemented Hadoop deployment, configuration management, performance tuning, and backup procedures.
- Designed messaging flow by using Apache Kafka.
- Implemented Kerberos based security for clusters.
- Monitored the health check of Hadoop daemon services and respond accordingly to any warning or failure conditions.
- Configuring, Maintaining, and Monitoring Hadoop Cluster using Apache Ambari, Hortonworks distribution of Hadoop.
- Worked on Recovery of Node failure.
- Added users to the Git repository when requested by the owner.
- Managed and scheduled jobs on the Hadoop cluster.
- Monitored local file system disk space and CPU usage using Ambari.
- Experience in developing programs in Spark using Python to compare the performance of Spark with Hive and SQL/Oracle.
- Worked with Puppet, Kibana, Elasticsearch, Talend, and Red Hat infrastructure for data ingestion, processing, and storage.
- Worked on importing and exporting data from Oracle and DB2 into HDFS and HIVE using Sqoop.
- Worked on installing cluster, commissioning & decommissioning of Data Nodes, Name Node recovery, capacity planning, Cassandra and slots configuration.
- Responsible for troubleshooting issues in the execution of MapReduce jobs by inspecting and reviewing log files.
- Secured the Hadoop cluster from unauthorized access by Kerberos, LDAP integration and TLS for data transfer among the cluster nodes.
- Involved in implementing security on the Hortonworks Hadoop cluster using Kerberos, working with the operations team to move the non-secured cluster to a secured cluster.
- Handled casting issues from BigQuery itself by selecting from the table just written and handling any casting manually.
- Responsible for upgrading Hortonworks Hadoop HDP 2.4.2 and MapReduce 2.0 with YARN in Multi Clustered Node environment.
- Used Oozie scripts for deployment of the application and Perforce as the secure versioning software.
- Extensively worked on configuring NIS, NIS+, NFS, DNS, DHCP, Auto mount, FTP, Mail servers.
- Installed and configured Kerberos for the authentication of users and Hadoop daemons.
- Worked with systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
- Addressed Data Quality Using Informatica Data Quality (IDQ) tool.
- Experience in designing data models for databases and Data Warehouse/Data Mart/ODS for OLAP and OLTP environments
- Worked with support teams to resolve performance issues.
- Worked on testing, implementation and documentation.
Environment: HDFS, MapReduce, Big Query, Apache Hadoop, Cloudera Distributed Hadoop, Hbase, Hive, Flume, Sqoop, RHEL, Python, MySQL.
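A sketch of the cron-driven Cassandra backup mentioned above; the keyspace, data directory, and backup destination are hypothetical placeholders.

    #!/bin/bash
    # cassandra_backup.sh - nightly keyspace snapshot and archive
    STAMP=$(date +%Y%m%d)
    KEYSPACE=metrics

    # Take a named snapshot of the keyspace on this node
    nodetool snapshot -t "nightly_${STAMP}" "${KEYSPACE}"

    # Archive the snapshot directories for off-node retention
    find /var/lib/cassandra/data/"${KEYSPACE}" -type d -name "nightly_${STAMP}" \
      | tar -czf /backups/cassandra/"${KEYSPACE}_${STAMP}".tar.gz -T -

    # Drop the local snapshot once it has been archived
    nodetool clearsnapshot -t "nightly_${STAMP}" -- "${KEYSPACE}"

    # crontab entry (01:30, outside business hours):
    # 30 1 * * * /opt/scripts/cassandra_backup.sh >> /var/log/cassandra_backup.log 2>&1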
Confidential
Linux/ System Admin
Responsibilities:
- Worked on administration of RHEL 4.x and 5.x, which included installation, testing, tuning, upgrading, loading patches, and troubleshooting both physical and virtual server issues.
- Installed, upgraded, and applied patches for UNIX, Red Hat Linux, and Windows servers in clustered and non-clustered environments.
- Troubleshoot NIS, NFS, DNS and other network issues, Create dump files, backups.
- Created and cloned Linux Virtual Machines, templates using VMware Virtual Client 3.5 and migrated servers between ESX hosts and Xen servers.
- Installed Red Hat Linux using Kickstart and applied security policies for hardening the servers based on company policies.
- Performed RPM and YUM package installations, patching, and other server management tasks.
- Managed systems routine backup, scheduling jobs like disabling and enabling cron jobs, enabling system logging, network logging of servers for maintenance, performance tuning and testing.
- Worked and performed data-center operations including rack mounting and cabling.
- Set up user and group login ID, network configuration, password, resolving permissions issues, user and group quota.
- Setup and configured network TCP/IP on AIX including RPC connectivity for NFS.
- Installation and configuration of httpd, ftp servers, TCP/IP, DHCP, DNS, NFS and NIS.
- Configured multipath, added SAN storage, and created physical volumes, volume groups, and logical volumes; an LVM sketch follows this section.
- Worked with Samba, NFS, NIS, LVM, and shell programming on Linux.
- Worked on daily basis on user access and permissions, Installations and Maintenance of Linux Servers.
- Installed CentOS on multiple servers using Preboot Execution Environment (PXE) boot and the Kickstart method, including remote installation of Linux via PXE boot.
- Monitored System activity, Performance and Resource utilization.
- Performed all System administration tasks like cron jobs, installing packages and patches.
- Used LVM extensively and created Volume Groups and Logical volumes.
- Built, implemented and maintained system-level software packages such as OS, Clustering, disk, file management, backup, web applications, DNS, LDAP.
- Performed scheduled backup and necessary restoration.
- Was a part of the monthly server maintenance team and worked with ticketing tools like BMC remedy on active tickets.
- Configured Domain Name System (DNS) for hostname to IP resolution.
- Troubleshot and fixed the issues at User level, System level and Network level by using various tools and utilities.
- Scheduled backup jobs by implementing cron schedules during non-business hours.
Environment: RHEL, CentOS, VMware, Apache, JBoss, WebLogic, WebSphere, System Authentication, NFS, DNS, Samba, Red Hat Linux servers, Oracle RAC, DHCP.
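A sketch of the LVM provisioning described above (physical volume, volume group, logical volume); the device, volume group name, size, and mount point are hypothetical.

    # Carve a new logical volume out of a freshly added SAN LUN and mount it
    pvcreate /dev/sdd                         # initialize the disk as a physical volume
    vgcreate vg_data /dev/sdd                 # create a volume group (or vgextend an existing one)
    lvcreate -n lv_app -L 200G vg_data        # create a 200 GB logical volume
    mkfs.ext4 /dev/vg_data/lv_app             # build a filesystem on it
    mkdir -p /app
    mount /dev/vg_data/lv_app /app
    echo "/dev/vg_data/lv_app /app ext4 defaults 0 0" >> /etc/fstab   # persist across reboots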
TECHNICAL SKILLS:
Big Data Ecosystem: HDFS, MapReduce, Spark, Pig, Hive, HBase, Sqoop, ZooKeeper, Sentry, Ranger, Storm, Kafka, Oozie, Flume, Docker, Hue, Knox, NiFi, Solr
Big Data Security: Kerberos, AD, LDAP, KTS, KMS, Redaction, Sentry, Ranger, Navencrypt, SSL/TLS, Cloudera Manager, Hortonworks
NoSQL Databases: HBase, Cassandra, MongoDB
Programming Languages: Java, Scala, Python, SQL, PL/SQL, Hive-QL, Pig Latin
Frameworks: MVC, Struts, Spring, Hibernate
Web Technologies: HTML, DHTML, XML, AJAX, WSDL, SOAP
Web/Application servers: Apache Tomcat, WebLogic, JBoss
Version control: SVN, CVS, GIT
Network Protocols: TCP/IP, UDP, HTTP, DNS, DHCP
Business Intelligence Tools: Talend, Informatica, Tableau
Databases: Oracle … DB2, SQL Server, MySQL, Teradata
Tools and IDE: Eclipse, IntelliJ, NetBeans, Maven, Jenkins, ANT, SBT
Cloud Technologies: Amazon Web Services (Amazon Redshift, S3), Microsoft Azure HDInsight
Operating Systems: RedHat Linux, Ubuntu Linux and Windows XP/Vista/7/8/10
Configuration Management Tools: ClearCase, Remedy ITSM, PuTTY, Toad, SQL Developer, Rapid SQL, ServiceNow.
Other Tools: GitHub, Informatica 8.6, Data stage, Maven, JIRA, Quality Center, Rational Suite of Products, MS Test Manager, TFS, Jenkins, Confluence, Splunk, NewRelic.