Sr. Hadoop Administrator Resume
Austin, TX
SUMMARY
- 10+ years of professional IT experience, including around 4 years of proven experience in Big Data technologies/frameworks such as Hadoop, HDFS, YARN, MapReduce, HBase, Hive, Pig, Sqoop, Flume, Oozie, and NoSQL.
- Extensive experience and knowledge of processing Big Data using Hadoop ecosystem components: HDFS, MapReduce, YARN, HBase, Hive, Pig, Sqoop, Flume, and Oozie.
- Well versed in installing, upgrading and maintaining Cloudera (CDH5) distributions for Hadoop.
- Hands-on experience with Hadoop applications (administration, configuration management, monitoring, debugging, and performance tuning).
- Strong understanding of NoSQL databases such as HBase and MongoDB.
- Good understanding of CDH implementation of Hadoop using Cloudera Manager.
- Strong knowledge on Hadoop HDFS architecture and MapReduce framework.
- Strong understanding of workflow automation tools like Oozie.
- Familiar with job scheduling using Capacity Scheduler & Fair Scheduler.
- Good experience with Unix/Linux shell scripting.
- Support development, testing, and operations teams during new system deployments.
- Evaluate and propose new tools and technologies to meet the needs of the organization.
- Worked closely with system users and management personnel, earning a reputation for loyalty.
- Extensive experience in Interwoven Teamsite Application administration.
- GMS (Globalization Management System) Application administration.
- Certified Professional in ITIL V3.
- Experience with Incident management, Release management and Change management.
- Proficient in handling escalations and day-to-day team management.
- Hands-on experience transitioning the GBR service desk from London, UK.
- Good communication skills, interpersonal skills, self-motivated, quick learner, team player.
TECHNICAL SKILLS
Big Data Tools: HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Flume, Oozie
Hadoop Distribution: Cloudera Distribution of Hadoop (CDH)
Operating Systems: Unix, Linux, Windows XP, Windows Vista, Windows 2003 Server
Programming Languages: Java, PL/SQL, Shell Script
Tools: Interwoven Teamsite, GMS, BMC Remedy, Eclipse, Toad, SQL Server Management Studio
Database: MySQL, NoSQL, HBase, MongoDB
Processes: Incident Management, Release Management, Change Management
Office Tools: MS Outlook, MS Word, MS Excel, MS PowerPoint
PROFESSIONAL EXPERIENCE
Confidential, Austin, TX
Sr. Hadoop Administrator
RESPONSIBILITIES:
- Managed 17 production clusters hosting different applications, including 2 dedicated HBase clusters hosting *pay (Apple Pay, Android Pay, Samsung Pay, etc.)
- The clusters ran different CDH versions, from CDH 5.4.3 on the HBase clusters to CDH 5.8.3 on our multi-tenant production clusters, with underlying Apache Hadoop version 2.6.0.
- Provided L1 support for multiple applications hosted on our clusters including HDFS, YARN, Hive, Flume, Solr, Oozie, Spark, Kafka, HBase, Impala, Zookeeper, Kerberos, etc.
- Provided L1/L2 Linux administration support for the servers hosting our clusters, ensuring optimal utilization of CPU vCores, memory, and disk.
- Addressed high disk utilization through both periodic and on-demand cleanup.
- Well versed in advanced Linux tools, including awk, sed, and crontab, and in storage concepts for managing Linux mounts.
- Tuned performance by enabling G1GC garbage collection and sizing application heaps based on heap-utilization history from Cloudera Manager charts.
- Strong grasp of Kerberos concepts and commands; managed tickets for principals and performed other administrative tasks via the kadmin shell.
- Set up load balancing for multiple applications, including Solr and Impala, using HAProxy.
- Used Cloudera Manager APIs for a variety of tasks, such as retrieving the Hive Metastore database password.
- Ran the HDFS inter-node and intra-node balancers, setting balancer bandwidth according to requirements and cluster capacity.
- Performed periodic maintenance and CDH upgrades using the parcel method.
- Performed application validation post every maintenance to ensure that the applications are working as expected.
- Basic understanding of GPFS, which we used (with storage pools) to host application input/output data.
- Managed the Capacity Scheduler and Fair Scheduler, defining dedicated queues per application ID and allocating min/max resources based on the criticality of the jobs run under those IDs.
- Monitored jobs through the ResourceManager and intervened when jobs stopped progressing, either killing hung jobs or raising job priority in real time.
- Migrated the NameNode from its existing server to a new one, as the existing IBM Dataplex machine had to be decommissioned for good.
- Raised cases with Cloudera Support whenever we could not resolve an issue in-house, and joined WebEx calls with them to resolve issues as quickly as possible.
- Worked production issues on priority and resolved them promptly to avoid potential downtime.
- Performed decommissioning of IBM nodes as they reached end-of-life support and recommissioning of nodes due to capacity planning requirements.
- Trained new team members on Hadoop/Linux concepts as well as cluster-specific details, including the applications hosted on them.
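The periodic and on-demand disk cleanup described above can be sketched roughly as follows; the directory, retention window, and file names are illustrative assumptions, not values from an actual cluster.

```shell
# Hypothetical sketch of a periodic log-cleanup job; the directory,
# retention window, and file names are illustrative assumptions.
LOG_DIR=$(mktemp -d)          # stand-in for e.g. a Hadoop log directory

touch -t 202001010000 "$LOG_DIR/old.log"   # simulate a stale log file
touch "$LOG_DIR/fresh.log"                 # recent file that must survive

# Remove anything older than 30 days, printing what gets deleted.
find "$LOG_DIR" -type f -mtime +30 -print -delete

# A crontab entry would schedule a script like this nightly, e.g.:
# 0 2 * * * /usr/local/bin/log_cleanup.sh
```

In practice the retention period would be negotiated per log type, and HDFS-side cleanup would use `hdfs dfs -rm` against temp directories instead of `find`.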
Environment: HDFS, YARN, Hive, Flume, Solr, Oozie, Spark, Kafka, HBase, Impala, Zookeeper, Kerberos.
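The per-application queue setup described in this role might look like the following Fair Scheduler allocation file; the queue names and min/max resource figures are hypothetical, chosen only to show a critical queue guaranteed a larger share than a batch queue.

```shell
# Hypothetical Fair Scheduler allocation file for per-application queues;
# queue names and min/max resources are illustrative, not real values.
cat > fair-scheduler.xml <<'EOF'
<?xml version="1.0"?>
<allocations>
  <!-- critical application: guaranteed larger minimum share -->
  <queue name="app_payments">
    <minResources>100000 mb,50 vcores</minResources>
    <maxResources>400000 mb,200 vcores</maxResources>
    <weight>2.0</weight>
  </queue>
  <!-- lower-priority batch jobs -->
  <queue name="app_batch">
    <minResources>20000 mb,10 vcores</minResources>
    <maxResources>100000 mb,50 vcores</maxResources>
    <weight>1.0</weight>
  </queue>
</allocations>
EOF
```

In a CDH cluster this file is managed through Cloudera Manager rather than edited by hand; the XML above is what CM generates under the hood.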
Confidential, Santa Clara, CA
Hadoop Administrator
RESPONSIBILITIES:
- Managed a mission-critical, production-scale Hadoop cluster on the Cloudera distribution.
- Involved in capacity planning, weighing growing data size against the existing cluster size.
- Worked on analyzing the Hadoop cluster and various big data tools, including Pig, the HBase NoSQL database, Flume, Oozie, and Sqoop.
- Experience in designing, implementing, and maintaining high-performing Hadoop clusters and integrating them with existing infrastructure.
- Monitored and analyzed MapReduce jobs, watching for potential issues and addressing them.
- Collected log data from web servers and ingested it into HDFS using Flume.
- Imported and exported data between HDFS and Hive using Sqoop.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Commissioned and decommissioned Hadoop cluster nodes, including rebalancing HDFS block data.
- Configured property files such as core-site.xml, hdfs-site.xml, mapred-site.xml, and hadoop-env.sh according to job requirements.
- Good knowledge of implementing NameNode Federation and NameNode High Availability using ZooKeeper and the Quorum Journal Manager.
- Good knowledge in adding security to the cluster using Kerberos.
- Hands-on experience setting up ACLs (Access Control Lists) to secure access to the HDFS filesystem.
- Experience in cluster management using Cloudera Manager.
- Installed Oozie workflow engine to run multiple Hive Jobs.
- Good knowledge in working with Hue interface for querying the data.
- Fine-tuned the JobTracker by adjusting properties in mapred-site.xml.
- Fine-tuned the Hadoop cluster by setting the proper number of map and reduce slots for the TaskTrackers.
- Experience tuning heap sizes to avoid disk spills and OOM issues.
- Familiar with job scheduling using Fair Scheduler so that CPU time is well distributed amongst all the jobs.
- Experience managing users and permissions on the cluster, using different authentication methods.
- Involved in regular Hadoop Cluster maintenance such as updating system packages.
- Experience managing and analyzing Hadoop log files to troubleshoot issues.
- Good knowledge in NoSQL databases, like HBase, MongoDB, etc.
- Proficient in providing on-job training for less experienced team members, to bring them up to speed in terms of technical expertise.
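The property-file configuration mentioned above (core-site.xml, hdfs-site.xml, mapred-site.xml) typically looks like the fragment below; the property values shown are illustrative defaults, not settings from the actual cluster.

```shell
# Illustrative hdfs-site.xml fragment; the values are common defaults
# used here as assumptions, not actual cluster settings.
cat > hdfs-site.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>   <!-- default block replication factor -->
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value>   <!-- 128 MB blocks -->
  </property>
  <property>
    <name>dfs.namenode.acls.enabled</name>
    <value>true</value>   <!-- required for the HDFS ACLs noted above -->
  </property>
</configuration>
EOF
```

Changes to these files take effect only after the affected daemons are restarted, which is why such edits were bundled into maintenance windows.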
Environment: Hadoop, YARN, Hive, HBase, Linux, MapReduce, HDFS, MySQL
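A minimal Oozie workflow for running the Hive jobs described in this role might be sketched like this; the application name, node names, and script path are hypothetical.

```shell
# Hypothetical Oozie workflow definition that runs one Hive script;
# the app name, action names, and script name are illustrative.
cat > workflow.xml <<'EOF'
<workflow-app name="daily-hive-load" xmlns="uri:oozie:workflow:0.5">
  <start to="run-hive"/>
  <action name="run-hive">
    <hive xmlns="uri:oozie:hive-action:0.5">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <script>load_events.hql</script>
    </hive>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Hive job failed</message>
  </kill>
  <end name="end"/>
</workflow-app>
EOF
```

Multiple Hive jobs are chained by pointing each action's `<ok to="..."/>` at the next action; the workflow directory is then uploaded to HDFS and submitted with `oozie job -run`.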
Confidential
System Analyst / Hadoop Admin
RESPONSIBILITIES:
- Responsible for doing capacity planning based on the data size requirements provided by end-clients.
- Participated in system configuration design to finalize on the master and slave configurations for the cluster.
- Experience in designing, implementing, and maintaining high-performing Hadoop clusters and integrating them with existing infrastructure.
- Worked on analyzing the Hadoop cluster and various big data tools, including HDFS, Hive, HBase, Flume, Oozie, and Sqoop.
- Configured property files such as core-site.xml, hdfs-site.xml, mapred-site.xml, and hadoop-env.sh according to job requirements.
- Experience managing users and permissions on the cluster, using different authentication methods.
- Involved in regular Hadoop Cluster maintenance such as patching security holes & updating system packages.
- Performed performance tuning based on feedback from currently running jobs.
- Handled issues reported by developers and clients.
- Preparing Standard Operating Procedures (SOPs) for the process related tasks handled by team members.
- Responsible for onsite-offshore coordination for smoother operations.
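The capacity planning described in this role can be illustrated with a back-of-the-envelope calculation; every figure below (raw data size, replication factor, overhead, disk per node) is a hypothetical assumption, not a number from the actual engagement.

```shell
# Back-of-the-envelope Hadoop capacity planning; every figure here is
# a hypothetical assumption for illustration only.
RAW_TB=100          # raw data the clients expect to store
REPLICATION=3       # HDFS default replication factor
OVERHEAD_PCT=25     # headroom for temp/shuffle data and growth
DISK_PER_NODE_TB=48 # usable disk per datanode

# Total storage required, then node count rounded up.
NEEDED_TB=$(( RAW_TB * REPLICATION * (100 + OVERHEAD_PCT) / 100 ))
NODES=$(( (NEEDED_TB + DISK_PER_NODE_TB - 1) / DISK_PER_NODE_TB ))

echo "Need ${NEEDED_TB} TB usable -> ${NODES} datanodes"
# prints: Need 375 TB usable -> 8 datanodes
```

Real planning would also budget CPU and memory per node and reserve disks for the OS and intermediate data, but the storage arithmetic follows this shape.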
Environment: Hadoop, YARN, Hive, HBase, Linux, MapReduce, HDFS, Java, MySQL
Confidential
Technical Lead, ITIS
RESPONSIBILITIES:
- Administration of all the servers (CMS, GMS, DB and Web servers) for the environment supported.
- Administration of Interwoven teamsite on all the CMS (Content Management System) servers across the environment.
- Deployment of files to web servers using OpenDeploy.
- Deployment of content / data to DB servers using DataDeploy.
- Handled GMS (Globalization Management Server) deployments to translate websites into other languages for a global audience.
- Validate and execute CMS & GMS deployments and take care of any issues in case they occur.
- Handle the issues reported directly by the business users.
- Fair knowledge of basic administration of SQL Server Management Studio, including executing SQL scripts in it.
- Preparing Standard Operating Procedures (SOPs) for the process related tasks handled by team members.
- Validating and auditing the deployments handled by the team members.
- Act as the first point of escalation for the issues reported by business users.
- Received knowledge transfer (KT) from the clients and trained team members on any new processes the clients added.
- Day-to-day management of the team, including preparing shift schedules and arranging weekly team meetings.
- Driving and coordinating major production deployments.
- Ensuring that the agreed SLAs are met and investigating the causes if missed.
Environment: Interwoven Teamsite, BMC Remedy, SQL Server Management Studio, GMS
Confidential
Senior IS Analyst, Technical Support
RESPONSIBILITIES:
- Drive & attend the service review meetings with the stakeholders located offshore via video conferencing or conference calls.
- Preparing & Updating the Standard Operating Procedures (SOPs) for the process related tasks handled by the team members.
- Maintained Average Handling Time (AHT) for incidents and requests to meet Service Level Agreements (SLAs).
- Generate weekly KPI reports (reports showing the inflow trend & SLA achievements) & Age analysis (reports showing the age of the existing tickets in Remedy) to be sent to the offshore stakeholders.
- Handling major incidents/outages and escalations.
- Providing ‘On the floor’ supervisor support to all the agents in handling calls & emails, related to process based tasks.
- Monitored Remedy tickets raised by the team to ensure adherence to process and quality, and provided feedback for performance improvement.
- Created, enabled, disabled, and moved users' network accounts using Microsoft Active Directory (MS AD).
- Basic mainframe application troubleshooting, including new account creation, password resets, account unlocks, and granting global access for users' mainframe IDs.
- Created, deleted, and moved users' mailboxes on the Exchange servers using Active Directory.
- Performed IP telephony tasks, including creating new extensions, modifying extension assignees, and creating and modifying call pickup groups, using Cisco Call Manager.
- Documenting all the issues / requests in the ticketing tool named BMC Remedy using the Incident Management Console.
- Troubleshooting Windows related issues remotely using System Management Server (SMS) / Dameware.
Environment: BMC Remedy, Microsoft Active Directory, Mainframe, Microsoft Exchange Server, Cisco Call Manager, Dameware Remote Access