Hadoop Administrator Resume
Houston, TX
SUMMARY
- Over 7 years of professional Information Technology experience in Hadoop and Linux Administration activities such as installation, configuration and maintenance of systems/clusters.
- Experience in all the phases of Data warehouse life cycle involving Requirement Analysis, Design, Coding, Testing, and Deployment.
- Experience in working with business analysts to identify, study, and understand requirements and translate them into ETL code during the Requirement Analysis phase.
- Experience in architecting, designing, installing, configuring, and managing Apache Hadoop clusters on the MapR, Hortonworks, and Cloudera distributions.
- Experience in managing the Hadoop infrastructure with Cloudera Manager.
- Good understanding of Kerberos and how it interacts with Hadoop and LDAP.
- Practical knowledge of the functionality of each Hadoop daemon, the interactions between them, resource utilization, and dynamic tuning to keep the cluster available and efficient.
- Experience in understanding and managing Hadoop Log Files.
- Experience with Hadoop's multiple data processing engines, such as interactive SQL, real-time streaming, data science, and batch processing, handling data stored on a single platform under YARN.
- Experience in adding and removing nodes in a Hadoop cluster.
- Worked extensively with Amazon Web Services and created Amazon Elastic MapReduce (EMR) clusters on both Hadoop 1.0.3 and 2.2.
- Experience in Change Data Capture (CDC) data modeling approaches.
- Experience in managing Hadoop clusters with the Hortonworks Data Platform.
- Experience in extracting data from RDBMS into HDFS using Sqoop.
- Experience with bulk load tools such as DW Loader and moving data from PDW to a Hadoop archive.
- Experience in collecting logs from log collectors into HDFS using Flume.
- Experience in setting up and managing the batch scheduler Oozie.
- Experience in commissioning, decommissioning, balancing, and managing nodes and tuning servers for optimal cluster performance.
- Experience in importing and exporting data using Sqoop between HDFS and relational database systems/mainframes and vice versa (see the Sqoop sketch after this list).
- Experience in installing firmware upgrades, kernel patches, system configuration, and performance tuning on Unix/Linux systems.
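A minimal Sqoop sketch of the kind of RDBMS-to-HDFS transfer described above; the connection string, credentials, table names, and directories are placeholders rather than details from any actual engagement.

```bash
# Sketch only: connection string, credentials, table, and directories are placeholders.
sqoop import \
  --connect jdbc:mysql://dbhost:3306/sales \
  --username etl_user -P \
  --table orders \
  --target-dir /data/raw/orders \
  --num-mappers 4

# Exporting back from HDFS to the RDBMS follows the same pattern:
sqoop export \
  --connect jdbc:mysql://dbhost:3306/sales \
  --username etl_user -P \
  --table orders_summary \
  --export-dir /data/out/orders_summary
```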
TECHNICAL SKILLS
Technologies: HDFS, SQL, YARN, Pig Latin, MapReduce, Hive, Sqoop, Spark, Spark SQL, ZooKeeper, HBase, Oozie, Ab Initio, Informatica, AWS.
Big Data Platforms: MapR, Hortonworks, Cloudera.
Operating Systems: Linux, Windows, UNIX.
Databases: Oracle, MySQL, MSSQL, HBase, Cassandra.
Development Methods: Agile/Scrum, Waterfall.
Programming Languages: JavaScript, Python, R, Shell Scripting.
PROFESSIONAL EXPERIENCE
Hadoop Administrator
Confidential, Houston, TX
Responsibilities:
- Capacity planning, architecting, and designing a Hadoop cluster from scratch.
- Designing the service layout with HA enabled.
- Performed pre-installation and post-installation benchmarking and performance testing.
- Designed and implemented the disaster recovery mechanism for data, ecosystem tools, and applications.
- Orchestrated data and service high availability within and across clusters.
- Performed multiple rounds of rigorous DR testing.
- Training, mentoring, and supporting team members.
- Developing a reusable configuration management platform with Ansible and GitHub.
- Moving (re-distributing) services from one host to another within the cluster to help secure the cluster and ensure high availability of services.
- Working to implement MapR Streams to facilitate real-time data ingestion to meet business needs.
- Implementing security on the MapR cluster using BoKS and by encrypting data on the fly.
- Identifying the best solutions/proofs of concept leveraging Big Data and Advanced Analytics that meet and exceed the customer's business, functional, and technical requirements.
- Created and published various production metrics including system performance and reliability information to systems owners and management.
- Performed ongoing capacity management forecasts including timing and budget considerations.
- Coordinated root cause analysis (RCA) efforts to minimize future system issues.
- Mentored, developed, and trained junior staff members as needed.
- Provided off hours support on a rotational basis.
- Stored unstructured data in semi-structured format on HDFS using HBase.
- Followed Change Management and Incident Management processes per organization guidelines.
- Responded to and resolved database access and performance issues.
- Planned and coordinated data migrations between systems.
- Performed database transaction and security audits.
- Established appropriate end-user database access control levels.
- On-call availability for rotation on nights and weekends.
- Upgraded MapR from version 4.1.0 to 5.2.0.
- Experience in HBase replication and MapR-DB replication setup between two clusters (see the sketch after this list).
- Good knowledge of Hadoop cluster connectivity and security.
- Experience in MapR-DB, Spark, Elasticsearch, and Zeppelin.
- Involved in POCs such as the application monitoring tool Unravel.
- Experience with the configuration management tool Ansible.
- Responded to database-related alerts and escalations and worked with database engineering to develop strategic solutions to recurring problems.
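A brief sketch of the cross-cluster HBase replication setup referenced above; the peer ZooKeeper quorum, table, and column family names are hypothetical.

```bash
# Sketch only: peer ZooKeeper quorum, table, and column family are hypothetical.
hbase shell <<'EOF'
add_peer '1', CLUSTER_KEY => "dr-zk1,dr-zk2,dr-zk3:2181:/hbase"
alter 'events', {NAME => 'cf', REPLICATION_SCOPE => 1}
list_peers
EOF
```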
Environment: Hadoop, MapReduce, Hive, HDFS, PIG, Sqoop, Solr, Oozie, Cloudera, Flume, HBase, ZooKeeper, CDH5, MongoDB, Cassandra, Oracle, NoSQL and Unix/Linux
Hadoop Admin
Confidential
Responsibilities:
- Handle the installation and configuration of a Hadoop cluster.
- Build and maintain scalable data pipelines using the Hadoop ecosystem and other open source components like Hive and HBase.
- Handle the data exchange between HDFS and different Web Applications and databases using Flume and Sqoop.
- Good understanding of the Microsoft Analytics Platform System (APS) and HDInsight.
- Monitor the data streaming between web sources and HDFS.
- Worked with Kerberos and how it interacts with Hadoop and LDAP.
- Worked on Kafka, a distributed, partitioned, replicated commit log service that provides the functionality of a messaging system.
- Experience working in the AWS cloud environment with services such as EC2 and EBS.
- Close monitoring and analysis of MapReduce job executions on the cluster at the task level.
- Provided input to development regarding efficient utilization of resources such as memory and CPU based on the running statistics of Map and Reduce tasks.
- Experience with APIs, software intermediaries that make it possible for application programs to interact with each other and share data.
- Worked extensively with Amazon Web Services and created Amazon Elastic MapReduce (EMR) clusters on both Hadoop 1.0.3 and 2.2.
- Worked with Kerberos, Active Directory/LDAP, and Unix-based file systems.
- Managed data in Amazon S3 and implemented s3cmd to move data from clusters to S3 (see the sketch after this list).
- Presented demos to customers on how to use AWS and how it differs from traditional systems.
- Worked with REST APIs that expose specific software functionality while protecting the rest of the application.
- Experience in Continuous Integration and expertise in Jenkins and Hudson tools.
- Experience with Nagios and writing Nagios plugins to perform multiple server checks (see the plugin sketch after this list).
- Tuned cluster configuration properties based on the volume of data being processed and the performance of the cluster.
- Setting up Identity, Authentication, and Authorization.
- Maintained the cluster to keep it healthy and in optimal working condition.
- Handle the upgrades and Patch updates.
- Set up automated processes to analyze the System and Hadoop log files for predefined errors and send alerts to appropriate groups.
- Experience in architecting, designing, installing, configuring, and managing Apache Hadoop on the Hortonworks distribution.
- Worked with Avro, JSON, and other serialization and compression formats.
- Worked with Unix commands and shell scripting.
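A minimal sketch of the cluster-to-S3 data movement described above; the bucket name and paths are placeholders.

```bash
# Sketch only: bucket name and paths are placeholders.
# Pull the export out of HDFS, then sync it to S3 with s3cmd.
hadoop fs -copyToLocal /data/exports/daily /tmp/exports/daily
s3cmd sync /tmp/exports/daily/ s3://example-archive-bucket/exports/daily/
```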
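And a skeleton of the kind of custom Nagios check mentioned above, here reporting HDFS capacity usage; the thresholds and parsing are illustrative only.

```bash
#!/bin/bash
# Hypothetical Nagios plugin sketch: alert on HDFS capacity usage.
# Warning/critical thresholds are illustrative only.
WARN=75
CRIT=90
USED=$(hdfs dfsadmin -report | awk '/DFS Used%/ {gsub(/%/,"",$3); print int($3); exit}')
if [ -z "$USED" ]; then
  echo "UNKNOWN - could not read HDFS usage"; exit 3
elif [ "$USED" -ge "$CRIT" ]; then
  echo "CRITICAL - HDFS used ${USED}%"; exit 2
elif [ "$USED" -ge "$WARN" ]; then
  echo "WARNING - HDFS used ${USED}%"; exit 1
else
  echo "OK - HDFS used ${USED}%"; exit 0
fi
```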
Environment: Hadoop, MapReduce, Hive, HDFS, PIG, Sqoop, Oozie, Cloudera, Flume, HBase, ZooKeeper, CDH3, MongoDB, Cassandra, Oracle, NoSQL and Unix/Linux.
Hadoop Admin
Confidential, Palo Alto, CA
Responsibilities:
- Deployed a Hadoop cluster using Hortonworks distribution HDP integrated with Nagios and Ganglia.
- Monitored workload, job performance and capacity planning using Ambari.
- Imported logs from web servers with Flume to ingest the data into HDFS.
- Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
- Performed operating system installation, Hadoop version updates using automation tools.
- Deployed high availability on the Hadoop cluster using quorum journal nodes.
- Implemented automatic NameNode failover using ZooKeeper and the ZooKeeper Failover Controller.
- Installed, configured, and maintained HBase.
- Designed user access authorization using SSSD integrated with Active Directory.
- Integrated Kerberos on all clusters with the company's Active Directory and created user groups and permissions for authorized access into the cluster.
- Configured Ganglia, including installing the gmond and gmetad daemons, which collect all metrics from the distributed cluster and present them in real-time dynamic web pages that help with debugging and maintenance.
- Configured Oozie for workflow automation and coordination.
- Implemented rack aware topology on the Hadoop cluster.
- Implemented Kerberos security in all environments.
- Implemented the Kerberos authentication infrastructure: KDC server setup, creating the realm/domain, managing principals, generating a keytab file for each service, and managing keytabs using keytab tools (see the kadmin sketch after this list).
- Defined file system layout and data set permissions.
- Good experience troubleshooting production-level issues in the cluster and its functionality.
- Backed up data on a regular basis to a remote cluster using DistCp (see the sketch after this list).
- Regular ad-hoc execution of Hive and Pig queries depending on the use cases.
- Commissioning and decommissioning of nodes depending on the amount of data.
- Monitored and configured a test cluster on Amazon Web Services for further testing and gradual migration.
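A sketch of the principal and keytab management described above; the realm, host, and keytab path are placeholders.

```bash
# Sketch only: realm, host, and keytab path are placeholders.
kadmin.local -q "addprinc -randkey hdfs/node01.example.com@EXAMPLE.COM"
kadmin.local -q "xst -k /etc/security/keytabs/hdfs.service.keytab hdfs/node01.example.com@EXAMPLE.COM"
# Verify the keytab contents and test authentication:
klist -kt /etc/security/keytabs/hdfs.service.keytab
kinit -kt /etc/security/keytabs/hdfs.service.keytab hdfs/node01.example.com@EXAMPLE.COM
```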
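And a minimal DistCp form of the cross-cluster backup described above; the NameNode addresses and paths are placeholders.

```bash
# Sketch only: NameNode addresses and paths are placeholders.
# -update copies only changed files; -p preserves file attributes.
hadoop distcp -update -p hdfs://prod-nn:8020/data/warehouse hdfs://dr-nn:8020/backup/warehouse
```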
Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Flume, Oozie, Sqoop, Eclipse, Hortonworks Ambari, WinSCP, PuTTY.
Linux Admin
Confidential
Responsibilities:
- Performed installation, configuration, upgrades, package administration, and support for Linux systems on the client side using a Red Hat Satellite server.
- Worked on Red Hat Linux installation, configuration, and maintenance of applications in this environment.
- Built servers using Kickstart and vSphere Client.
- Worked exclusively in a VMware virtual environment.
- Accomplished the Installation, Configuration and Administration of Web & Application Servers.
- Performed automated operating system installations using Kickstart for Linux.
- Package management using RPM, YUM and UP2DATE in Red Hat Linux.
- Experience in using various network protocols like HTTP, UDP, FTP, and TCP/IP.
- Network installation via a centralized YUM server for client package updates.
- Network configuration on Linux.
- Configuration and administration of NFS, NIS, and DNS in a Linux environment.
- Implemented file sharing on the network by configuring NFS to share essential resources (see the sketch after this list).
- Troubleshooting and resolving network issues.
- Documented activities performed and created standard operating procedures (SOPs).
- Configuration tasks such as assigning IP addresses, configuring network interfaces, setting static routes, and assigning hostnames.
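A minimal sketch of the NFS file-sharing setup described above; the exported path, client subnet, and mount point are placeholders.

```bash
# Sketch only: exported path, client subnet, and mount point are placeholders.
echo "/srv/share 192.168.10.0/24(rw,sync,no_root_squash)" >> /etc/exports
exportfs -ra                      # re-export everything in /etc/exports
systemctl restart nfs-server      # or 'service nfs restart' on older releases
# On a client:
mount -t nfs nfs-server.example.com:/srv/share /mnt/share
```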
Environment: Red Hat Enterprise Linux, VMWare, Shell-Scripting, LVM, Windows, RPM, YUM, NFS, HTTP, FTP.
Linux System Administrator
Confidential
Responsibilities:
- Designed integrated screens with Java Swing for displaying transactions.
- Involved in the development of code for connecting to the database using JDBC with the help of Oracle Developer 9i.
- Involved in the development of database coding including Procedures, Triggers in Oracle.
- Worked as Research Assistant and a Development Team Member.
- Coordinated with Business Analysts to gather the requirement and prepare data flow diagrams and technical documents.
- Identified Use Cases and generated Class, Sequence and State diagrams using UML.
- Used JMS for the asynchronous exchange of critical business data and events among J2EE components and the legacy system.
- Worked in Designing, coding, and maintaining of Entity Beans and Session Beans using EJB 2.1 Specification.
- Worked in the development of Web Interface using MVC Struts Framework.
- The user interface was developed using JSP and tag libraries, CSS, HTML, and JavaScript.
- Database connection was made using properties files.
- Used a session filter to implement timeouts for idle users.
- Used Stored Procedure to interact with the database.
- Development of Persistence was done using DAO and Hibernate Framework.
- Used Log4j for logging.
Environment: Red Hat Linux/CentOS 4/5, Logical Volume Manager, Hadoop, VMware ESX 3.0, kernel and resource tuning, Apache and Tomcat Web Server, Oracle 9i, Oracle RAC, HPSM, HPSA