Hadoop Admin Resume
Detroit, MI
SUMMARY:
- Big Data professional with 5 years of hands-on experience across all phases of the SDLC, including application design, administration, production support, maintenance projects, and Java development, with 4 years of experience in the Hadoop ecosystem.
- Excellent understanding of Hadoop architecture and its components, including HDFS, YARN, Spark, and the MapReduce programming paradigm.
- Implemented Hadoop-based data warehouses and integrated Hadoop with enterprise data warehouse systems.
- Hands-on experience installing and configuring Cloudera, MapR, and Hortonworks clusters and using Hadoop ecosystem components such as Pig, Hive, HBase, Sqoop, Kafka, Oozie, Flume, and ZooKeeper.
- Balance, commission, and decommission cluster nodes (see the sketch after this list).
- Maintain, support, and upgrade Hadoop clusters.
- Provide guidance on Hadoop cluster setup in the AWS cloud environment.
- Good knowledge of NoSQL databases such as HBase, Cassandra, and MongoDB.
- Experienced with networking and security infrastructure, including VLANs, firewalls, Kerberos, LDAP, Sentry, and Ranger.
- Worked with technologies and platforms including Java, Git, Unix/Linux, VMware, Docker, and AWS across the financial, healthcare, and media sectors.
- Strong knowledge of Hadoop file formats such as Avro and Parquet.
- Monitor jobs, Fair and Capacity scheduler queues, and HDFS capacity.
- Scheduled jobs using Talend Administration Center and cron.
- Administered and tuned search and indexing frameworks such as Solr and Elasticsearch.
- Production experience in large environments using configuration management tools such as Ansible.
- Monitored and managed server backups and restores, server status reporting, user accounts, password policies, and file permissions.
- Experience in managing and reviewing Hadoop log files.
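A minimal sketch of the node decommissioning and rebalancing workflow mentioned above, assuming a vanilla HDFS cluster; the hostname, exclude-file path, and balancer threshold are illustrative placeholders.

    # Add the host to the exclude file referenced by dfs.hosts.exclude in hdfs-site.xml
    echo "worker05.example.com" >> /etc/hadoop/conf/dfs.exclude
    # Tell the NameNode to re-read its include/exclude lists and begin decommissioning
    hdfs dfsadmin -refreshNodes
    # Once the node reports as Decommissioned, rebalance block placement (10% threshold)
    hdfs balancer -threshold 10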
TECHNICAL SKILLS:
Big Data Ecosystem: HDFS, YARN, Apache Spark, Avro, MapReduce, HBase, ZooKeeper, Hive, Pig, Sqoop, Oozie, Flume, Talend.
Hadoop Distributions / Cloud: Cloudera CDH 4, MapR, Hortonworks HDP, Amazon Web Services (EC2, S3)
Languages: Core Java, Python, SQL, PL/SQL.
Databases: Oracle 11g/10g, DB2, MySQL, MongoDB, Cassandra.
Analysis and Reporting Tools: Splunk, Tableau
IDE / Testing Tools: Eclipse
Operating Systems: Windows, CentOS 6.5, Red Hat 7
Scripting Languages: Bash, Perl, Python
Testing API: JUNIT
PROFESSIONAL EXPERIENCE:
Confidential, Detroit, MI
Hadoop Admin
Responsibilities:
- Responsible for cluster maintenance, adding and removing cluster nodes, and cluster monitoring; deployed and maintained a 40-node cluster (20 production, 8 test, and 12 development nodes).
- Performed installation, addition, and replacement of resources such as disks, CPUs, memory, and NIC cards; increased swap space; and maintained Linux/UNIX and Windows servers.
- Configured CLDB node setup, Warden, and storage pool disk setup.
- Implemented centralized user authentication using LDAP.
- Set up, implemented, configured, and documented backup/restore solutions for disaster and business recovery.
- Configured Kafka and monitored operational data flowing through it.
- Implemented High Availability and HDFS federation.
- Created volumes, snapshots, and local and remote mirrors.
- Developed shell scripts and set up cron jobs for monitoring and automated data backup on the cluster (see the sketch after this list).
- Experienced in UNIX shell scripting.
- Troubleshot issues; managed and reviewed data backups and Hadoop log files.
- Deployed a data lake cluster with Hortonworks Ambari on AWS using EC2 and S3.
- Created scripts that integrated with the Amazon API to control instance operations (a sketch also follows this list).
- Involved in loading data to HDFS from various sources.
- Responsible for onboarding new users to the Hadoop cluster (creating user home directories and providing access to datasets).
- Helped in setting up Rack topology in the cluster.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Installed and configured Hadoop ecosystem components such as HBase, Flume, Pig, and Sqoop.
- Implemented Rack Awareness for data locality optimization.
- Managed and reviewed Hadoop Log files.
- Developed Shell/Perl Scripts for automation purpose.
- Implemented Kerberos in the cluster to authenticate users.
- Worked closely with software developers and DevOps to debug software and system problems.
- Used Ansible to automate configuration management.
- Deployed and used the Ansible dashboard to manage configuration of existing infrastructure.
- Migrated 100 TB of data from one datacenter to another datacenter.
- Performed stress and performance testing and benchmarking on the cluster.
- Tuned the cluster based on the benchmarking results to maximize throughput and minimize execution time.
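A minimal sketch of the kind of cron-driven backup script described above; the source and target paths, schedule, and log location are assumptions for illustration.

    #!/bin/bash
    # hdfs_backup.sh - copy a critical HDFS directory to a backup cluster with DistCp
    set -euo pipefail
    SRC=/data/warehouse                              # hypothetical source directory
    DEST=hdfs://backup-nn:8020/backups/warehouse     # hypothetical backup cluster
    hdfs dfs -test -d "$SRC" || { echo "source $SRC missing" >&2; exit 1; }
    hadoop distcp -update "$SRC" "$DEST"

    # Example crontab entry running the backup nightly at 01:00:
    # 0 1 * * * /usr/local/bin/hdfs_backup.sh >> /var/log/hdfs_backup.log 2>&1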
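The EC2 instance-control scripting mentioned above might look roughly like this, assuming the AWS CLI is installed and configured; the instance ID passed in is a placeholder.

    #!/bin/bash
    # ec2_ctl.sh - start or stop a cluster worker instance by ID
    ACTION=${1:?usage: ec2_ctl.sh {start|stop} <instance-id>}
    INSTANCE_ID=${2:?usage: ec2_ctl.sh {start|stop} <instance-id>}
    case "$ACTION" in
      start) aws ec2 start-instances --instance-ids "$INSTANCE_ID" ;;
      stop)  aws ec2 stop-instances  --instance-ids "$INSTANCE_ID" ;;
      *)     echo "unknown action: $ACTION" >&2; exit 1 ;;
    esac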
ENVIRONMENT: MapR 5.2, Hortonworks, AWS, Pig, Hive, HBase, Spark, MapReduce, Sqoop, Splunk, Flume, ZooKeeper, Kafka.
Confidential, Boston, MA
Hadoop Admin / DevOps
Responsibilities:
- Installed, configured, and maintained Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, ZooKeeper, and Sqoop.
- Involved in installation of MapR and upgrade from MapR 5.0 to MapR 5.2.
- Wrote shell scripts to monitor the health of Hadoop daemon services and respond to warning or failure conditions.
- Involved in building out a 70-node Hadoop cluster.
- Experienced in loading data from the UNIX local file system into HDFS.
- Developed data pipelines using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Involved in collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
- Involved in developing MapReduce job workflows using the Oozie framework.
- Collected log data from web servers and integrated it into HDFS using Flume.
- Worked on upgrading cluster, commissioning & decommissioning of DataNodes, NameNode recovery, capacity planning, and slots configuration.
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Used Pig as an ETL tool for transformations, event joins, and pre-aggregations before storing the data in HDFS.
- Installed Oozie workflow engine to run multiple Hive and Pig Jobs.
- Used Sqoop to import and export data between RDBMS and HDFS (see the sketch after this list).
- Created Hive external tables, loaded data into them, and queried the data using HQL (an example also follows this list).
- Wrote shell scripts to automate rolling day-to-day processes.
- Automated workflows using shell scripts to pull data from various databases into Hadoop.
- Experienced in Installing, Configuring, Monitoring, Maintaining and Troubleshooting Hadoop clusters.
- Involved in Cluster Capacity planning, Hardware planning, Installation, Performance Tuning of the Hadoop Cluster.
- Configured property files such as core-site.xml, hdfs-site.xml, yarn-site.xml, mapred-site.xml, and hadoop-env.sh based on job requirements.
- Managed deployment automation using Puppet and Ansible.
- Implemented a continuous integration and continuous deployment (CI/CD) framework using Jenkins, Maven, and Artifactory in a Linux environment.
- Utilized cluster co-ordination services through ZooKeeper.
- Designed, implemented and managed the Backup and Recovery environment.
- Managed and scheduled jobs on the Hadoop cluster using Oozie.
- Performance-tuned the Hadoop cluster to improve efficiency.
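A sketch of the Sqoop import/export usage referenced above; the JDBC URL, credentials, table names, and HDFS paths are illustrative placeholders.

    # Import an RDBMS table into HDFS
    sqoop import \
      --connect jdbc:mysql://dbhost:3306/sales \
      --username etl_user -P \
      --table transactions \
      --target-dir /data/raw/transactions \
      --num-mappers 4

    # Export aggregated results back to the RDBMS
    sqoop export \
      --connect jdbc:mysql://dbhost:3306/sales \
      --username etl_user -P \
      --table transactions_summary \
      --export-dir /data/curated/transactions_summary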
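And the Hive external-table pattern mentioned above, run here through the hive CLI; the database, columns, delimiter, and location are assumptions for illustration.

    hive -e "
    CREATE EXTERNAL TABLE IF NOT EXISTS sales.transactions_ext (
      txn_id     STRING,
      account_id STRING,
      amount     DOUBLE,
      txn_date   STRING
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    STORED AS TEXTFILE
    LOCATION '/data/raw/transactions';

    SELECT txn_date, SUM(amount) AS total FROM sales.transactions_ext GROUP BY txn_date;
    "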
ENVIRONMENT: MapR, Talend, Kafka, Spark 1.3.1, Flume, ZooKeeper, Tableau, Splunk, SQL Server, HBase, Teradata SQL, PL/SQL, Linux, UNIX shell scripting, Git.
Confidential
Linux Systems Engineer
Responsibilities:
- Installed, configured and administered RHEL 6 on VMware server 3.5.
- Managed file space, created logical volumes, and extended file systems using LVM (see the sketch after this list).
- Performed daily server maintenance and tuned systems for optimum performance by turning off unwanted and vulnerable services.
- Managed RPM packages for Linux distributions.
- Monitored system performance using top, free, vmstat, and iostat.
- Set up user and group login IDs, passwords, and ACL file permissions, and assigned user and group quotas.
- Configured networking, including TCP/IP, and performed troubleshooting.
- Coordinated Firewall rules to enable communication between servers.
- Monitored scheduled jobs, workflows, and day-to-day system administration tasks.
- Extensively involved in installation and configuration of Cloudera's Distribution of Hadoop, CDH 3.x and 4.x.
- Installed, configured, and maintained Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, ZooKeeper, and Sqoop.
- Hands-on experience working with HDFS, MapReduce, Hive, Pig, Sqoop, Impala, Hadoop HA, YARN, Cloudera Manager, and Hue.
- Wrote shell scripts to monitor the health of Hadoop daemon services and respond to warning or failure conditions (a sketch follows this list).
- Involved in building out a 12-node Hadoop cluster.
- Implemented NameNode metadata backup using NFS.
- Experienced in loading data from the UNIX local file system into HDFS.
- Involved in collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
- Collected log data from web servers and integrated it into HDFS using Flume.
- Worked on upgrading cluster, commissioning & decommissioning of DataNodes, NameNode recovery, capacity planning, and slots configuration.
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Used Pig as an ETL tool for transformations, event joins, and pre-aggregations before storing the data in HDFS.
- Involved in the installation of CDH3 and the upgrade from CDH3 to CDH4.
- Installed Oozie workflow engine to run multiple Hive and Pig Jobs.
- Used Sqoop to import and export data between RDBMS and HDFS.
- Created Hive external tables, loaded data into them, and queried the data using HQL.
- Wrote shell scripts to automate rolling day-to-day processes.
- Automated workflows using shell scripts to pull data from various databases into Hadoop.
- Responded to tickets through ticketing systems.
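A minimal sketch of the LVM growth workflow mentioned above; the device, volume group, logical volume names, and size are placeholders.

    pvcreate /dev/sdb1                      # initialize the new partition as an LVM physical volume
    vgextend vg_data /dev/sdb1              # add it to the existing volume group
    lvextend -L +50G /dev/vg_data/lv_app    # grow the logical volume by 50 GB
    resize2fs /dev/vg_data/lv_app           # grow the ext4 filesystem to match (use xfs_growfs for XFS)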
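And a sketch of a daemon health-check script along the lines described above; the daemon list assumes an HDFS/YARN-style deployment and the alert address is a placeholder.

    #!/bin/bash
    # check_hadoop_daemons.sh - alert when an expected Hadoop daemon is not running
    for daemon in NameNode DataNode ResourceManager NodeManager; do
      if ! jps | grep -qw "$daemon"; then
        echo "$(date): $daemon is not running on $(hostname)" \
          | mail -s "Hadoop daemon down: $daemon" admin@example.com
      fi
    done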
ENVIRONMENT: Cloudera, Red Hat, Pig, Hive, HBase, Solr, ZooKeeper, Linux.