Lead Hadoop Admin Resume
SUMMARY:
- Overall 6+ years of working experience, including 3+ years as a Hadoop Administrator and around 8 months in Linux administration roles.
- Hadoop Administration responsibilities include software installation, configuration, software upgrades, backup and recovery, commissioning and decommissioning of data nodes, cluster setup, daily cluster performance monitoring, and keeping clusters healthy on different Hadoop distributions (Hortonworks & Cloudera).
- Experience in installation, management, and monitoring of Hadoop clusters using Apache Hadoop and Cloudera Manager.
- Optimized the configurations of MapReduce, Pig, and Hive jobs for better performance.
- Advanced understanding of Hadoop architecture components such as HDFS and YARN.
- Strong experience configuring Hadoop ecosystem tools, including Pig, Hive, HBase, Sqoop, Flume, Kafka, Oozie, ZooKeeper, Spark, and Storm.
- Experience in designing, installing, configuring, supporting, and managing Hadoop clusters using Apache, Hortonworks, and Cloudera distributions.
- Experience setting up a 15-node cluster in an Ubuntu environment.
- Expert-level understanding of the AWS cloud computing platform and related services.
- Experience in managing the Hadoop infrastructure with Cloudera Manager and Ambari.
- Working experience importing and exporting data between MySQL and HDFS/Hive using Sqoop (a minimal example follows this summary).
- Working experience with the ETL data integration tool Talend.
- Strong knowledge of Hadoop cluster capacity planning, performance tuning, cluster monitoring, and troubleshooting.
- Experience in backup configuration and recovery from a NameNode failure.
- Experience in commissioning, decommissioning, balancing, and managing nodes, and tuning servers for optimal cluster performance.
- Involved in cluster maintenance, bug fixing, troubleshooting, and monitoring, following proper backup and recovery strategies.
- In-depth understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce concepts.
- Managed security in Hadoop clusters using Kerberos, Ranger, Knox, and ACLs.
- Ability to work on both Hortonworks and Cloudera distributions.
- Excellent experience in Shell Scripting.
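A minimal sketch of the Sqoop import/export work described above, assuming a MySQL source; the host, database, table names, and credentials are hypothetical placeholders:

```bash
#!/usr/bin/env bash
# Import a MySQL table into HDFS with Sqoop; all connection details are placeholders.
sqoop import \
  --connect jdbc:mysql://db01.example.com:3306/sales \
  --username etl_user -P \
  --table orders \
  --target-dir /user/etl/orders \
  --num-mappers 4

# Export processed results from HDFS back to a MySQL table.
sqoop export \
  --connect jdbc:mysql://db01.example.com:3306/sales \
  --username etl_user -P \
  --table orders_summary \
  --export-dir /user/etl/orders_summary
```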
TECHNICAL SKILLS:
Databases: MS SQL Server 2012/2008/2005, Google Cloud Platform, MS Excel
Operating Systems: Windows Server 2000/2003/2008, Windows 10/8/7/XP/Vista, Mac OS, Linux (Red Hat, Ubuntu), CentOS 6.0
Business Intelligence Tools: MSBI Stack (SSIS, SSRS), Visual Studio 2013/2011/2008/2005
Languages: Linux Commands, SQL Queries, UNIX Shell Scripting, C
Hadoop Frameworks: HDFS, Spark, MapReduce, Hive, Pig, ZooKeeper
Relational Databases: MySQL
NoSQL Databases: HBase, MongoDB, Cassandra
Data Ingestion / ETL tools: Flume, Sqoop, Storm, Kafka
PROFESSIONAL EXPERIENCE:
Confidential
Lead Hadoop Admin
Responsibilities:
- Lead Hadoop Administrator for all Hadoop Non-Prod Integration environments and supporting technologies as part of the Data Lake ecosystem.
- Maintain, support, monitor, and upgrade all Hadoop environments, including configuration, access control, capacity planning, permissions, and security patches, to ensure continuity of all Hadoop environments
- Collaborate closely with the platform engineering team to define Hadoop blueprints and automate all configuration activities required to deploy and maintain all Hadoop environments
- Provide daily and weekly utilization and monitoring reporting to drive transparency and ensure continuity of all Hadoop environments (a reporting sketch follows this list)
- Provide Hadoop expertise and guidance to all engineering resources to enable self-service and troubleshooting of issues related to ingestion and analytics code
- Apply "rolling" cluster node upgrades in a Production-level environment
- Collaborate closely with Governance team to ensure security and auditability of all Hadoop environments
- Apply and maintain Kerberos security integrated with Active Directory, and manage Knox & Ranger configurations
- Work closely with Architecture to ensure implementation of “best practices” and “guidance” as it relates to all Hadoop environments.
- Define support plan for all Hadoop environments and supporting technologies - including resourcing needs, communication plan, onshore/offshore hand-offs and incident management.
- Perform Knowledge Transfer for offshore Hadoop support roles including documentation on environments, monitoring requirements, access & communication process.
- Create automated monitoring and utilization reports in support of bi-weekly support and operations meetings, covering environment stability, issue resolution, and system availability and utilization.
- Define plan and process for blue/green deployments to move data between HDP environments to ensure zero downtime for production users.
- Set up a monthly cadence with Hortonworks to review upcoming releases and technologies and to discuss issues or needs.
- Define scale out plan for expanding nodes/storage based on utilization.
- Define the upgrade and security patch plan.
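A minimal sketch of the daily utilization reporting mentioned above, using standard HDFS and YARN CLI commands; the report directory is a hypothetical placeholder and the script is intended to run from cron:

```bash
#!/usr/bin/env bash
# Daily cluster utilization snapshot; REPORT_DIR is a placeholder path.
REPORT_DIR=/var/reports/hadoop
DATE=$(date +%F)
mkdir -p "$REPORT_DIR"

{
  echo "=== HDFS capacity and DataNode status ($DATE) ==="
  hdfs dfsadmin -report

  echo "=== Space used per top-level HDFS directory ==="
  hdfs dfs -du -h /

  echo "=== YARN node status ==="
  yarn node -list -all
} > "$REPORT_DIR/utilization-$DATE.txt"
```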
Confidential, INC, NY
Hadoop Admin
Responsibilities:
- Involved in installing, configuring and using Hadoop Ecosystems (Hortonworks, Cloudera).
- Involved in Importing and exporting data into HDFS and Hive using Sqoop.
- Experienced in managing and reviewing Hadoop log files.
- Installation, configuration, and administration experience with Big Data platforms (Cloudera CDH, Hortonworks Ambari, Apache Hadoop) on Red Hat and CentOS as data storage, retrieval, and processing systems.
- Involved in the development and implementation of an Ubuntu-based Hadoop environment.
- Responsible for managing data coming from different sources.
- Supported MapReduce programs running on the cluster.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs (a minimal example follows this section).
- Used AWS remote computing services such as S3 and EC2.
- Involved in upgrading Hadoop Cluster from HDP 1.3 to HDP 2.0.
- Involved in loading data from UNIX file system to HDFS.
- Tested raw data and executed performance scripts.
- Shared responsibility for administration of Hadoop, Hive and Pig.
Environment: HDFS, MapReduce, Hive, Pig, HBase, Sqoop, Cloudera; RDBMS/DB used: flat files, MySQL, HBase.
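A minimal sketch of the Hive table creation and loading described above; the table, columns, and input path are hypothetical placeholders:

```bash
#!/usr/bin/env bash
# Create a Hive table, load a delimited file from HDFS, and run a query
# that executes internally as a MapReduce job. All names and paths are placeholders.
hive -e "
CREATE TABLE IF NOT EXISTS orders (
  order_id INT,
  customer_id INT,
  amount DOUBLE
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

LOAD DATA INPATH '/user/etl/orders' INTO TABLE orders;

SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id;
"
```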
Confidential, VA
Hadoop Admin
Responsibilities:
- Installed, configured, and maintained the Hadoop cluster for application development and Hadoop ecosystem components like Hive, Pig, HBase, ZooKeeper, and Sqoop.
- In-depth understanding of Hadoop architecture and various components such as HDFS, NameNode, DataNode, ResourceManager, NodeManager, and the YARN/MapReduce programming paradigm.
- Monitored the Hadoop cluster through Cloudera Manager and implemented alerts based on error messages. Provided reports to management on cluster usage metrics and charged back customers based on their usage.
- Extensively worked on commissioning and decommissioning of cluster nodes, file system integrity checks and maintaining cluster data replication.
- Very good understanding of assigning the number of mappers and reducers for MapReduce jobs on the cluster.
- Set up HDFS quotas to enforce fair sharing of storage resources (see the sketch following this section).
- Strong knowledge of configuring and maintaining YARN schedulers (Fair and Capacity).
- Wrote shell scripts to monitor the health of Hadoop daemon services and respond to warning or failure conditions (also sketched below).
- Configured Kafka to partition messages across brokers and distribute consumption over a cluster of consumer machines while maintaining per-partition ordering semantics.
- Supported parallel data loads into Hadoop.
- Involved in setting up HBase, including master and region server configuration, high-availability configuration, performance tuning, and administration.
- Created user accounts and provided access to the Hadoop cluster.
- Upgraded cluster from CDH 5.3 to CDH 5.7 and Cloudera manager from CM 5.3 to 5.7.
- Involved in loading data from UNIX file system to HDFS.
- Worked on ETL processes, importing data from various data sources and performing transformations.
- Coordinated with the QA team during the testing phase.
Environment: HDFS, MapReduce, Hive, Pig, HBase, Sqoop, Flume, Kafka, ZooKeeper, UNIX/Linux, Java, Shell Scripting.
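A minimal sketch of the HDFS quota setup and daemon health check referenced above; the directory, limits, daemon list, and alert address are hypothetical placeholders:

```bash
#!/usr/bin/env bash
# Limit a user directory to 1M names and 2 TB of raw space, then verify.
hdfs dfsadmin -setQuota 1000000 /user/analyst
hdfs dfsadmin -setSpaceQuota 2t /user/analyst
hdfs dfs -count -q -h /user/analyst

# Health check: alert if a daemon expected on this host is not running.
for daemon in NameNode ResourceManager; do
  if ! jps | grep -q "$daemon"; then
    echo "$(hostname): $daemon is not running" \
      | mail -s "Hadoop daemon down" ops@example.com
  fi
done
```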
Confidential
Linux /Hadoop Admin
Responsibilities:
- Provided technical designs and architecture; supported automation, installation, and configuration tasks; and planned system upgrades of the Hadoop cluster.
- Designed the development and architecture of the Hadoop cluster, MapReduce processes, and the HBase system.
- Designed and developed the process framework and supported data migration in the Hadoop system.
- Involved in upgrading Hadoop Cluster from HDP 1.3 to HDP 2.0.
- Implemented secondary sorting to sort reducer output globally in MapReduce.
- Experience with CDH distribution and Cloudera Manager to manage and monitor Hadoop clusters.
- Commissioned and decommissioned nodes as needed (see the sketch following this section).
- Worked with network and Linux system engineers to define optimum network configurations, server hardware, and operating systems.
- Evaluated and proposed new tools and technologies to meet the needs of the organization.
- Production support responsibilities included cluster maintenance.
Environment: Hadoop, HDFS, HBase, Hive, MapReduce, Linux, Big Data.
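A minimal sketch of the node decommissioning mentioned above; the exclude-file path and hostname are hypothetical placeholders for this cluster's dfs.hosts.exclude configuration:

```bash
#!/usr/bin/env bash
# Decommission a DataNode: add it to the exclude file the NameNode reads,
# refresh, and watch its status. Path and hostname are placeholders.
EXCLUDE_FILE=/etc/hadoop/conf/dfs.exclude
NODE=worker07.example.com

echo "$NODE" >> "$EXCLUDE_FILE"
hdfs dfsadmin -refreshNodes

# Wait for the node to report "Decommissioned" before stopping its services.
hdfs dfsadmin -report | grep -A 2 "$NODE"
```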
Confidential
Linux /Hadoop Admin
Responsibilities:
- Implemented and configured a High Availability Hadoop cluster.
- In-depth understanding of Hadoop architecture and various components such as HDFS, NameNode, DataNode, ResourceManager, NodeManager, and the YARN/MapReduce programming paradigm.
- Involved in managing and reviewing Hadoop log files.
- Used Sqoop to import and export data from HDFS to RDBMS and vice-versa.
- Designed, recorded, and executed macros to automate data entry inputs. Formatted spreadsheets and workbooks for print, document reproduction, and presentations.
- Created HBase tables to store data in various formats coming from different portfolios (a minimal sketch follows this section).
- Used Cloudera manager for installation and management of Hadoop cluster.
- Administered, installed, upgraded, and managed Hadoop distributions (CDH3, CDH4, Cloudera Manager), Hive, and HBase.
- Hands-on experience working on Hadoop ecosystem components like HDFS, MapReduce, YARN, ZooKeeper, Pig, Hive, Sqoop, and Flume.
- Developed a data pipeline using Kafka and Storm to store data into HDFS.
- Collected, organized, and documented infrastructure project attributes, data, and project metrics.
Environment: SQL Server 2005, T-SQL, Hadoop, MapReduce, YARN, Pig, Hive, HBase, Cassandra, MS Office & Windows.
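A minimal sketch of the HBase table creation mentioned above; the table and column-family names are hypothetical placeholders:

```bash
#!/usr/bin/env bash
# Create an HBase table with two column families, write a test cell,
# and scan it back. All names are placeholders.
hbase shell <<'EOF'
create 'portfolio_data', 'raw', 'derived'
describe 'portfolio_data'
put 'portfolio_data', 'row1', 'raw:source', 'feed-a'
scan 'portfolio_data', {LIMIT => 5}
EOF
```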