
Hadoop Administrator Resume


Houston, TX

SUMMARY

  • Over 7 years of professional Information Technology experience in Hadoop and Linux Administration activities such as installation, configuration and maintenance of systems/clusters.
  • Experience in all the phases of Data warehouse life cycle involving Requirement Analysis, Design, Coding, Testing, and Deployment.
  • Experience in working with business analysts to identify, study, and understand requirements and translate them into ETL code during the Requirement Analysis phase.
  • Experience in architecting, designing, installing, configuring, and managing Apache Hadoop clusters on the MapR, Hortonworks, and Cloudera distributions.
  • Experience in managing the Hadoop infrastructure with Cloudera Manager.
  • Good understanding of Kerberos and how it interacts with Hadoop and LDAP.
  • Practical knowledge of the functionality of each Hadoop daemon, the interactions between them, resource utilization, and dynamic tuning to keep the cluster available and efficient.
  • Experience in understanding and managing Hadoop Log Files.
  • Experience with the multiple data processing engines Hadoop runs under YARN, such as interactive SQL, real-time streaming, data science, and batch processing, against data stored on a single platform.
  • Experience in adding and removing nodes in a Hadoop cluster.
  • Worked extensively with Amazon Web Services and created Amazon Elastic MapReduce clusters on both Hadoop 1.0.3 and 2.2.
  • Experience in Change Data Capture (CDC) data modeling approaches.
  • Experience in managing Hadoop clusters with the Hortonworks Data Platform (HDP).
  • Experience in extracting data from RDBMS into HDFS using Sqoop.
  • Experience with bulk load tools such as DW Loader and with moving data from PDW to a Hadoop archive.
  • Experience in collecting logs from log collectors into HDFS using Flume.
  • Experience in setting up and managing the batch scheduler Oozie.
  • Experience in commissioning, decommissioning, balancing, and managing nodes, and tuning servers for optimal cluster performance (see the decommissioning sketch after this list).
  • Experience in importing and exporting data with Sqoop between HDFS and relational database systems/mainframes in both directions (see the Sqoop sketch after this list).
  • Experience in installing firmware upgrades and kernel patches, system configuration, and performance tuning on Unix/Linux systems.
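
A minimal sketch of the decommissioning workflow referenced above, assuming a vanilla HDFS cluster; the exclude-file path and hostname are hypothetical and vary by distribution.

    # Add the host to the exclude file referenced by dfs.hosts.exclude (path is illustrative)
    echo "datanode07.example.com" >> /etc/hadoop/conf/dfs.exclude

    # Tell the NameNode to re-read the include/exclude lists and begin decommissioning
    hdfs dfsadmin -refreshNodes

    # Watch decommissioning progress, then rebalance the remaining nodes if needed
    hdfs dfsadmin -report | grep -A3 "datanode07"
    hdfs balancer -threshold 10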
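
The Sqoop imports and exports above follow roughly the pattern below; the JDBC connection string, credentials, tables, and HDFS paths are hypothetical placeholders.

    # Import a table from a relational database into HDFS (connection details are illustrative)
    sqoop import \
      --connect jdbc:mysql://dbhost:3306/sales \
      --username etl_user -P \
      --table orders \
      --target-dir /data/raw/orders \
      --num-mappers 4

    # Export processed results from HDFS back to the database
    sqoop export \
      --connect jdbc:mysql://dbhost:3306/sales \
      --username etl_user -P \
      --table orders_summary \
      --export-dir /data/curated/orders_summary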

TECHNICAL SKILLS

Technologies: HDFS, SQL, YARN, Pig Latin, MapReduce, Hive, Sqoop, Spark, Spark SQL, ZooKeeper, HBase, Oozie, Ab Initio, Informatica, AWS.

Big Data Platforms: MapR, Hortonworks, Cloudera.

Operating Systems: Linux, Windows, UNIX.

Databases: Oracle, MySQL, MSSQL, HBase, Cassandra.

Development Methods: Agile/Scrum, Waterfall.

Programming Languages: JavaScript, Python, R, Shell Scripting.

PROFESSIONAL EXPERIENCE

Hadoop Administrator

Confidential, Houston, TX

Responsibilities:

  • Capacity planning, architecting, and designing a Hadoop cluster from scratch.
  • Designing the service layout with HA enabled.
  • Performed pre-installation and post-installation benchmarking and performance testing.
  • Designed and implemented the disaster recovery mechanism for data, ecosystem tools, and applications.
  • Orchestrated data and service high availability within and across clusters.
  • Performed multiple rounds of rigorous DR testing.
  • Training, mentoring, and supporting team members.
  • Developing a reusable configuration management platform with Ansible and GitHub (see the Ansible sketch after this list).
  • Moving (redistributing) services from one host to another within the cluster to help secure the cluster and ensure high availability of the services.
  • Working to implement MapR Streams to facilitate real-time data ingestion and meet business needs.
  • Implementing security on the MapR cluster using BoKS and by encrypting data on the fly.
  • Identifying the best solutions/proofs of concept leveraging big data and advanced analytics that meet and exceed the customer's business, functional, and technical requirements.
  • Created and published various production metrics including system performance and reliability information to systems owners and management.
  • Performed ongoing capacity management forecasts including timing and budget considerations.
  • Coordinated root cause analysis (RCA) efforts to minimize future system issues.
  • Mentored, developed, and trained junior staff members as needed.
  • Provided off-hours support on a rotational basis.
  • Stored unstructured data in semi-structured format on HDFS using HBase.
  • Followed organizational change management and incident management processes.
  • Responded to and resolved database access and performance issues.
  • Planned and coordinated data migrations between systems.
  • Performed database transaction and security audits.
  • Established appropriate end-user database access control levels.
  • On-call availability for rotation on nights and weekends.
  • Upgraded MapR from version 4.1.0 to 5.2.0.
  • Set up HBase replication and MapR-DB replication between two clusters.
  • Good knowledge of Hadoop cluster connectivity and security.
  • Experience with MapR-DB, Spark, Elasticsearch, and Zeppelin.
  • Involved in POCs such as the application monitoring tool Unravel.
  • Experience with the configuration management tool Ansible.
  • Responding to database-related alerts and escalations and working with database engineering to develop strategic solutions to recurring problems.
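
A minimal sketch of how the reusable Ansible configuration work above is typically driven from the command line; the inventory path, host group, and playbook name are hypothetical.

    # Ad-hoc check that all cluster nodes in the inventory are reachable (paths are illustrative)
    ansible datanodes -i inventories/prod/hosts -m ping

    # Dry-run a versioned configuration change from the GitHub-managed playbook, then apply it
    ansible-playbook -i inventories/prod/hosts playbooks/hadoop-config.yml --limit datanodes --check
    ansible-playbook -i inventories/prod/hosts playbooks/hadoop-config.yml --limit datanodes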

Environment: Hadoop, MapReduce, Hive, HDFS, PIG, Sqoop, Solr, Oozie, Cloudera, Flume, HBase, ZooKeeper, CDH5, MongoDB, Cassandra, Oracle, NoSQL and Unix/Linux

Hadoop Admin

Confidential

Responsibilities:

  • Handle the installation and configuration of a Hadoop cluster.
  • Build and maintain scalable data pipelines using the Hadoop ecosystem and other open source components like Hive and HBase.
  • Handle the data exchange between HDFS and different Web Applications and databases using Flume and Sqoop.
  • Good understanding of Microsoft Analytics Platform System (APS) and HDInsight.
  • Monitor the data streaming between web sources and HDFS.
  • Worked with Kerberos and its interaction with Hadoop and LDAP.
  • Worked on Kafka, a distributed, partitioned, replicated commit-log service that provides the functionality of a messaging system (see the topic-creation sketch after this list).
  • Experience working in the AWS cloud environment with services such as EC2 and EBS.
  • Closely monitored and analyzed MapReduce job executions on the cluster at the task level.
  • Provided input to development on efficient utilization of resources such as memory and CPU, based on the running statistics of map and reduce tasks.
  • Experience with APIs, the software intermediaries that allow application programs to interact with each other and share data.
  • Worked extensively with Amazon Web Services and created Amazon Elastic MapReduce clusters on both Hadoop 1.0.3 and 2.2.
  • Worked with Kerberos, Active Directory/LDAP, and Unix-based file systems.
  • Managed data in Amazon S3 and used s3cmd to move data from the clusters to S3 (see the S3 sketch after this list).
  • Presented demos to customers on how to use AWS and how it differs from traditional systems.
  • Worked with REST APIs that expose specific software functionality while protecting the rest of the application.
  • Experience in Continuous Integration and expertise in Jenkins and Hudson tools.
  • Experience with Nagios and writing Nagios plugins to perform multiple server checks.
  • Adjusted cluster configuration properties based on the volume of data being processed and the performance of the cluster.
  • Setting up Identity, Authentication, and Authorization.
  • Maintained the cluster to keep it healthy and in optimal working condition.
  • Handled upgrades and patch updates.
  • Set up automated processes to analyze the System and Hadoop log files for predefined errors and send alerts to appropriate groups.
  • Architected, designed, installed, configured, and managed Apache Hadoop on the Hortonworks distribution.
  • Worked with Avro, JSON, and other data formats and compression codecs.
  • Worked with Unix commands and shell scripting.
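
For the Kafka commit-log service mentioned above, topic creation on older ZooKeeper-based Kafka releases looks roughly like this; the topic name, partition/replica counts, and ZooKeeper quorum are placeholders.

    # Create a replicated, partitioned topic (ZooKeeper quorum and sizing are illustrative)
    kafka-topics.sh --create \
      --zookeeper zk1:2181,zk2:2181,zk3:2181 \
      --replication-factor 3 \
      --partitions 6 \
      --topic weblogs

    # Verify the topic layout
    kafka-topics.sh --describe --zookeeper zk1:2181 --topic weblogs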
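
The cluster-to-S3 data movement with s3cmd referenced above can be sketched as follows; the bucket name and paths are hypothetical.

    # Pull a completed data set out of HDFS to local staging, then sync it to S3
    hdfs dfs -get /data/curated/orders_summary /tmp/orders_summary
    s3cmd sync /tmp/orders_summary/ s3://example-archive-bucket/orders_summary/

    # Confirm the upload
    s3cmd ls s3://example-archive-bucket/orders_summary/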

Environment: Hadoop, MapReduce, Hive, HDFS, PIG, Sqoop, Oozie, Cloudera, Flume, HBase, ZooKeeper, CDH3, MongoDB, Cassandra, Oracle, NoSQL and Unix/Linux.

Hadoop Admin

Confidential, Palo Alto, CA

Responsibilities:

  • Deployed a Hadoop cluster using Hortonworks distribution HDP integrated with Nagios and Ganglia.
  • Monitored workload, job performance and capacity planning using Ambari.
  • Imported logs from web servers with Flume to ingest the data into HDFS.
  • Implemented the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
  • Performed operating system installation, Hadoop version updates using automation tools.
  • Deployed NameNode high availability on the Hadoop cluster using quorum journal nodes.
  • Implemented automatic failover with ZooKeeper and the ZooKeeper Failover Controller (ZKFC).
  • Installed, configured, and maintained HBase.
  • Designed the authorization of access for the Users using SSSD and integrating with Active Directory.
  • Integrated Kerberos on all the clusters with the company's Active Directory and created user groups and permissions for authorized access into the cluster.
  • Configured Ganglia, which included installing the gmond and gmetad daemons that collect all the metrics running across the distributed cluster and present them in real-time dynamic web pages, which further helps with debugging and maintenance.
  • Configured Oozie for workflow automation and coordination.
  • Implemented rack aware topology on the Hadoop cluster.
  • Implemented Kerberos security in all environments.
  • Implemented the Kerberos authentication infrastructure: KDC server setup, creating the realm/domain, managing principals, generating a keytab file for each and every service, and managing keytabs using keytab tools (see the keytab sketch after this list).
  • Defined file system layout and data set permissions.
  • Good experience troubleshooting production-level issues in the cluster and its functionality.
  • Backed up data on a regular basis to a remote cluster using DistCp (see the DistCp sketch after this list).
  • Regular Ad-Hoc execution of Hive and Pig queries depending upon the use cases.
  • Commissioning and Decommissioning of nodes depending upon the amount of data.
  • Monitored and configured a test cluster on Amazon Web Services for further testing and gradual migration.
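
The principal and keytab management described above follows roughly this pattern with the MIT Kerberos tools; the realm, hostname, and keytab paths are placeholders.

    # Create a service principal for a DataNode host (realm and host are illustrative)
    kadmin.local -q "addprinc -randkey dn/datanode07.example.com@EXAMPLE.COM"

    # Generate a keytab file for that principal
    kadmin.local -q "ktadd -k /etc/security/keytabs/dn.service.keytab dn/datanode07.example.com@EXAMPLE.COM"

    # Inspect the keytab and verify it can authenticate
    klist -kt /etc/security/keytabs/dn.service.keytab
    kinit -kt /etc/security/keytabs/dn.service.keytab dn/datanode07.example.com@EXAMPLE.COM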
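
The regular backups to a remote cluster with DistCp mentioned above amount to a job like the one below; NameNode addresses and paths are hypothetical.

    # Copy production data to the DR cluster, updating changed files and preserving attributes
    hadoop distcp -update -p \
      hdfs://prod-nn:8020/data/warehouse \
      hdfs://dr-nn:8020/backups/warehouse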

Environment: Hadoop HDFS, MapReduce, Hive, Pig, Flume, Oozie, Sqoop, Eclipse, Hortonworks Ambari, WinSCP, PuTTY.

Linux Admin

Confidential

Responsibilities:

  • Performed installation, configuration, upgrades, package administration, and support for Linux systems on the client side using a Red Hat Network Satellite server.
  • Worked on Red Hat Linux installation, configuring and maintenance of applications on this environment.
  • Built servers using Kickstart and the vSphere Client.
  • Worked exclusively on VMware virtual environment.
  • Accomplished the Installation, Configuration and Administration of Web & Application Servers.
  • Performed automated installations of Operating System using kickstart for Linux.
  • Package management using RPM, YUM and UP2DATE in Red Hat Linux.
  • Experience in using various network protocols like HTTP, UDP, FTP, and TCP/IP.
  • Network installation via a centralized yum server for client package updates.
  • Network Configuration on LINUX.
  • Configuration and Administration of NFS, NIS, and DNS in LINUX environment.
  • Implemented file sharing on the network by configuring NFS on the system to share essential resources (see the NFS sketch after this list).
  • Troubleshooting and resolving network issues.
  • Documenting activities performed and producing standard operating procedures (SOPs).
  • Network configuration tasks such as assigning IP addresses, configuring network interfaces, assigning static routes, and setting hostnames.
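
The NFS file sharing mentioned above was configured along these lines on Red Hat systems; the export path, client subnet, mount options, and mount point are placeholders.

    # On the NFS server: export a shared directory (subnet and options are illustrative)
    echo "/export/shared 192.168.10.0/24(rw,sync,no_root_squash)" >> /etc/exports
    exportfs -ra
    service nfs restart

    # On a client: mount the share
    mount -t nfs nfsserver:/export/shared /mnt/shared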

Environment: Red Hat Enterprise Linux, VMWare, Shell-Scripting, LVM, Windows, RPM, YUM, NFS, HTTP, FTP.

Linux System Administrator

Confidential

Responsibilities:

  • Designed integrated screens with Java Swing for displaying the transactions.
  • Involved in the development of code for connecting to the database using JDBC with the help of Oracle Developer 9i.
  • Involved in the development of database coding including Procedures, Triggers in Oracle.
  • Worked as Research Assistant and a Development Team Member.
  • Coordinated with Business Analysts to gather the requirement and prepare data flow diagrams and technical documents.
  • Identified Use Cases and generated Class, Sequence and State diagrams using UML.
  • Used JMS for the asynchronous exchange of critical business data and events among J2EE components and the legacy system.
  • Designed, coded, and maintained Entity Beans and Session Beans using the EJB 2.1 specification.
  • Worked in the development of Web Interface using MVC Struts Framework.
  • User Interface was developed using JSP and tags, CSS, HTML, and JavaScript.
  • Database connection was made using properties files.
  • Used a session filter to implement timeouts for idle users.
  • Used Stored Procedure to interact with the database.
  • Persistence was developed using the DAO pattern and the Hibernate framework.
  • Used Log4j for logging.

Environment: Red Hat Linux/CentOS 4 and 5, Logical Volume Manager, Hadoop, VMware ESX 3.0, kernel and resource tuning, Apache and Tomcat web servers, Oracle 9i, Oracle RAC, HPSM, HPSA
