Hadoop Administrator Resume
SUMMARY
- Highly skilled Hadoop Administrator with 12+ years of experience in the IT industry.
- Extensive knowledge of multiple scripting and programming languages, with excellent analytical and problem-solving skills; holds a Master's in Information Systems.
- 2+ years of experience in Hadoop administration and big data technologies, with additional background in Linux administration and deployment engineering across various domains
- Hands-on experience installing, configuring, supporting, and managing Hadoop clusters using Cloudera (CDH 5.x), Hortonworks (HDP 2.x), and Azure HDInsight (HDI 3.5) distributions
- Supported and maintained Hadoop ecosystem components such as HDFS, YARN, MapReduce, HBase, Oozie, Hive, Sqoop, Pig, Flume, KTS, KMS, Sentry, SmartSense, Storm, Kafka, Ranger, Falcon, and Knox
- Exposure to Hadoop clusters in public cloud environments such as AWS and Rackspace
- Experienced in cloud-based deployments on Azure HDInsight with varied workloads and customized clusters
- Experienced in Hadoop cluster capacity planning, performance tuning, monitoring, and troubleshooting
- Experienced in setting up High Availability (HA) for various services in the ecosystem
- Experienced in setting up backup and recovery policies to ensure high availability of clusters
- Strong understanding of Hadoop security concepts and implementations - Authentication (Kerberos/AD/LDAP/KDC), Authorization (Sentry/Ranger) and Encryption (KTS/KMS)
- Monitored workload and job performance and performed capacity planning using Cloudera Manager and Ambari
- Involved in enforcing standards and guidelines for the big data platform during production move
- Responsible for onboarding new tools/technologies onto the platform
- Experienced in analyzing log files for Hadoop ecosystem services and performing root-cause analysis using Splunk
- Experienced with tools such as Clarify, JIRA, ServiceNow, New Relic, Splunk, and Serena TeamTrack, and with test management tools such as the Rational suite of products, MS Test Manager, and HP ALM
- Work closely with application, networking, Linux system administration, and enterprise service monitoring teams
- Excellent Communication, Analytical, Interpersonal, Presentation and Leadership Skills
- Excellent problem-resolution and root-cause-analysis skills
- Good team player with excellent communication and client-interfacing skills
TECHNICAL SKILLS
BIG Data Ecosystem: HDFS, MapReduce, Spark, Pig, Hive, HBase, Sqoop, ZooKeeper, Sentry, Ranger, Storm, Kafka, Oozie, Flume, Hue, Knox
BIG Data Security: Kerberos, AD, LDAP, KTS, KMS, Redaction, Sentry, Ranger, Navencrypt, SSL/TLS, Cloudera Navigator
BIG Data Cloud: Azure HDInsight, AWS EMR
Languages/Scripting: Java, VBScript, Shell, Batch, SQL, .NET, JavaScript
Automation Tools: Selenium 1.x, 2.x (IDE, RC, Grid, Web Driver), Rational Robot, QTP
Other Tools: GitHub, Maven, Clarify, JIRA, Quality Center, Rational Suite of Products, MS Test Manager, TFS, Jenkins, Confluence, Splunk, NewRelic
Operating Systems: Windows, Linux, Unix, Mac
Databases: Oracle (9i, 10g), MS SQL Server, MySQL
Tools: MS Office, Cygwin, Ant, TestNG, Hudson, SVN, Eclipse
Mainframe: DB2, CICS, Expeditor, QMF, Platinum, FileAid
PROFESSIONAL EXPERIENCE
Confidential
Hadoop Administrator
Responsibilities:
- Maintained and supported Cloudera Hadoop clusters in production, development & sandbox environments
- Experienced in installing, upgrading, and managing Cloudera distribution Hadoop clusters
- Managed and reviewed Hadoop and other ecosystem log files using Splunk
- Provisioned, installed, configured, monitored, and maintained HDFS, YARN, HBase, Flume, Sqoop, Sentry, Oozie, Pig, Hive, and Kafka
- Involved in resolving and troubleshooting cluster related issues
- Installed new services and tools using Cloudera packages as well as parcels
- Implemented authentication using Kerberos with AD/LDAP on Cloudera Hadoop Cluster
- Implemented authorization and enforced security policies using Apache Sentry
- Implemented encryption for data at rest using KTS/KMS and Navencrypt
- Implemented log level masking of sensitive data using redaction
- Involved in upgrading Cloudera Manager and CDH
- Resolved node failures and troubleshot common Hadoop cluster issues
- Derived operational insights across the cluster to identify anomalies and perform health checks
- Configured alert mechanism using SNMP traps for Cloudera Hadoop distribution
- Involved in onboarding new users and use cases onto the CDH platform
- Involved in deploying use cases onto production environment
- Worked with development teams to triage issues and implement fixes on Hadoop environment and associated applications
- Continuously monitored and managed the Hadoop cluster through Cloudera Manager
- Governed data on the CDH cluster using Navigator for continuous optimization, audit, metadata management and policy enforcement
- Provisioned various HDInsight clusters for differentiated workloads (Spark, H2O, R, HBase) on the Azure portal
- Provisioned an HDInsight + H2O cluster for data science prediction modeling
- Built customized data science HDInsight clusters with all requisite Python libraries
- Explored template-based HDInsight cluster deployments for commissioning and decommissioning
- Explored runbook-based HDInsight cluster deployments for automated commissioning and decommissioning
- Explored encryption of data at rest on Azure WASB storage
- Explored Azure Data Lake Storage (ADLS)
Environment: CDH 5.x, Azure HDInsight 3.5/3.6, H2O, MapReduce, Hive, Pig, ZooKeeper, HBase, Flume, Sqoop, CentOS
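The troubleshooting and health-check duties above can be sketched with the standard HDFS/YARN CLIs (a minimal sketch: it assumes a cluster gateway node with the Hadoop clients installed and an authenticated user, and falls back gracefully when no client is present):

```shell
# Minimal sketch of routine Hadoop cluster health checks.
# Assumes a gateway node with the HDFS/YARN CLIs on PATH;
# skips gracefully when no Hadoop client is installed.
if command -v hdfs >/dev/null 2>&1; then
  hdfs dfsadmin -report | head -n 20   # DataNode capacity and liveness summary
  hdfs fsck / | tail -n 5              # filesystem health (missing/corrupt blocks)
  yarn node -list -states RUNNING      # NodeManagers currently serving containers
  status="checked"
else
  status="skipped"                     # no Hadoop client on this machine
fi
echo "health check: $status"
```

On a Kerberized cluster such as the one described above, a valid ticket (`kinit`) is needed before any of these commands will succeed.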
Confidential
Hadoop Administrator
Responsibilities:
- Responsible for support, implementation and ongoing administration of Hadoop infrastructure on HDP platform
- Coordinated with use-case teams on new tools and services for deployment on the HDP platform
- Involved in setting up Kerberos security for authentication
- Involved in setting up and configuring Apache Ranger for authorization
- Cluster maintenance as well as creation and removal of nodes using Ambari
- Performance tuning of Hadoop clusters and jobs related to Hive and Spark
- Monitor Hadoop cluster connectivity and security
- Collaborated with application, LSA, and networking teams to install operating system and Hadoop updates, patches, and version upgrades as required
- Involved in upgrading Ambari and HDP
- Involved in onboarding of new users and use cases to the HDP Platform
- Involved in enabling HA on various services on HDP
- Planned user-migration requirements for production in advance to avoid last-minute access issues
- Planning and implementation of data migration from existing development to production cluster
- Installed and configured Hadoop ecosystem components like PySpark, Hive, Sqoop, ZooKeeper, Oozie, Ranger, Falcon
- Prepared multi-cluster test harness to exercise the system for performance, failover and upgrades
- Involved in migrating data from production to development clusters using distcp
- Configured Ganglia, including installing the gmond and gmetad daemons, which collect metrics across the distributed cluster and present them in real time
- Continuously monitored and managed the Hadoop cluster through Ganglia and Nagios.
Environment: HDP 2.5, HDFS, MapReduce, YARN, Hive, Pig, Flume, Oozie, Sqoop, Ambari.
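The prod-to-dev data migration described above can be sketched with DistCp (a hedged sketch: the NameNode endpoints and paths below are hypothetical placeholders, and a Hadoop client on the submitting host is assumed):

```shell
# Sketch of copying a dataset between clusters with DistCp.
# SRC/DST endpoints and paths are hypothetical placeholders.
SRC="hdfs://prod-nn:8020/data/projectX"
DST="hdfs://dev-nn:8020/data/projectX"
if command -v hadoop >/dev/null 2>&1; then
  # -update copies only files that differ; -p preserves file attributes
  hadoop distcp -update -p "$SRC" "$DST"
  result="submitted"
else
  result="skipped"   # no Hadoop client available on this machine
fi
echo "distcp: $result"
```

DistCp runs as a MapReduce job, so the copy parallelizes across the cluster rather than funneling through a single host.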
Confidential
Quality Assurance & Engineering Lead
Responsibilities:
- Effort estimation and preparation of the master test plan
- Review of Test Plan and Test approach documents.
- Understanding the functional requirements
- Preparing the Functional understanding document.
- Review of the functional understanding document and getting sign off from the client.
- Coordinated and communicated with the testing and development teams and the business on all technical and project-related issues.
- Environment and test-lab setup
- Developed test scenarios and test cases
- Review of Test Cases and Other Artifacts.
- Developed automated test cases using Selenium and Rational.
- Executed automated test cases
- Executed functional, regression, and integration test cases
- Defect logging, tracking, and analysis.
- Tested the application against PDMA/FDA business requirements and 21 CFR Part 11
- Ensured adherence to the defined process.
- Collected QA metrics and reported them to key stakeholders
- Managing and submitting the final deliverables such as QA Binder (Test Summary Report, Traceability Matrix, and Metrics).
- Worked closely with the release management team on gate entry (release meetings)
- Managed the offshore team, allocating tasks and monitoring progress