Hadoop Admin Resume
Pittsburgh, PA
SUMMARY
- 10 years of professional IT experience.
- About 4.5 years of experience in deploying, maintaining, monitoring, and upgrading Hadoop clusters (Apache Hadoop, Cloudera, and Hortonworks).
- Two years of professional experience with Linux systems and four years of experience with Data Warehouse (DW) and Business Intelligence (BI) tools.
- Innovative, self-directed individual with strong interpersonal skills; highly adaptable, quick to learn, and adept at analyzing situations and taking the initiative to solve problems.
TECHNICAL SKILLS
Hadoop Ecosystem: HDFS, MapReduce, YARN, Hive, Pig, Flume, ZooKeeper, Sqoop, Oozie, Storm, Spark, Solr, Impala, CDH & HDP distributions.
Security Systems: MIT Kerberos, Apache Ranger, Apache Sentry
Monitoring Tools: Check_MK, Nagios, Ganglia.
Operating Systems: Linux (Red Hat, CentOS, Ubuntu), Windows (7, Vista, XP, 2003).
Languages: Python, Pig Latin, SQL, PL/SQL, T-SQL, C, Core Java, JavaScript, UNIX Shell Scripting, HTML, XML.
ETL Tools: Talend, Ascential DataStage 7.x/8.x, SSIS.
BI Tools/Analytics: Tableau, SSRS, OBIEE.
Databases: Oracle, MySQL, SQL Server, Teradata, HBase, OpenTSDB, KairosDB.
Automation Tools: Puppet, Chef.
PROFESSIONAL EXPERIENCE
Confidential, Bloomfield, CT
Sr. Big Data Admin
Responsibilities:
- Built, installed, configured, and managed Hadoop clusters using the Cloudera distribution.
- Designed, installed, and maintained highly available systems, including monitoring, security, backup, and performance tuning.
- Integrated third-party technologies such as SAP and Tableau and BI tools such as Kyvos and AtScale with the Hadoop environment.
- Analyzed system failures, identified root causes, and recommended courses of action.
- Led Cloudera upgrades.
- Implemented commissioning and decommissioning of DataNodes (see the sketch after this section).
- Performed real-time data ingestion into Hive using Spark.
- Worked with job workflow scheduling and monitoring tools such as Oozie and ZooKeeper.
- Debugged user requests and issues and resolved them in a timely manner.
- Administered HBase, PostgreSQL, and other NoSQL databases.
- Administered user access to applications such as Hue.
- Experienced in shell scripting.
- Managed add-on services such as Custom Service Descriptors (CSDs).
- Identified gaps and opportunities to improve existing client solutions.
- Monitored workload and job performance and performed capacity planning for the clusters.
- Provided production support.
Environment: Cloudera CDH 5.7, Cloudera Manager, Kerberos, Sentry, PostgreSQL, CentOS, MapReduce, HDFS, Spark, Pig, Hive, Python, Sqoop, HBase, ZooKeeper, Oozie, Hue, Kyvos, AtScale.
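A minimal sketch of the DataNode decommissioning flow referenced above, assuming an illustrative hostname and exclude-file path (Cloudera Manager wraps these same steps in its UI):

    # Add the host to the exclude file referenced by dfs.hosts.exclude in hdfs-site.xml
    echo "datanode05.example.com" >> /etc/hadoop/conf/dfs.exclude
    # Tell the NameNode to re-read its include/exclude lists; it then re-replicates the node's blocks
    hdfs dfsadmin -refreshNodes
    # Watch the node's status until it reports Decommissioned
    hdfs dfsadmin -report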
Confidential, Pittsburgh, PA
Hadoop Admin
Responsibilities:
- Provided architectural designs for hardware configuration and deployment diagrams.
- Configured a fully distributed Hadoop cluster on bare metal using Apache software.
- Introduced the Hortonworks Data Platform to the company, implemented it from scratch, and documented the process.
- Integrated tools across the Apache Hadoop stack, including MapReduce, Hive, Pig, Sqoop, HBase, ZooKeeper, and Oozie.
- Implemented security with Kerberos authentication and introduced authorization with Ranger.
- Performed manual benchmarking tests to measure cluster performance (see the benchmark sketch after this section).
- Implemented the Ganglia/Nagios monitoring setup on the Apache cluster.
- Tuned configuration parameters for YARN, MapReduce, HiveServer2, and the Hive metastore.
- Implemented automation scripts for cleaning up disk space and backing up databases (a sketch follows this section).
- Installed a MySQL RDBMS database replica.
- Worked closely with DevOps to specify file-system layout, OS requirements, and kernel-level parameter changes.
- Trained other team members by documenting the installation process.
Environment: Apache Hadoop 2.6.0, Ambari 2.1.1, HDP 2.3.0, MIT Kerberos, Ranger, MySQL, CentOS 6.6, MapReduce, HDFS, Pig, Hive 1.4, Sqoop, HBase 0.98, ZooKeeper, Oozie, Tez.
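A sketch of the manual benchmarks mentioned above, assuming the standard TeraSort and TestDFSIO jobs that ship with Hadoop; jar paths vary by distribution and the sizes are illustrative:

    # Generate 100 million 100-byte rows (~10 GB), then sort them
    hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar teragen 100000000 /bench/teragen
    hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar terasort /bench/teragen /bench/terasort
    # Measure raw HDFS write throughput with 10 files of 1000 MB each
    hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient-tests.jar TestDFSIO -write -nrFiles 10 -fileSize 1000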
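And a minimal sketch of the disk-cleanup and backup automation, with hypothetical paths, retention window, and database name:

    #!/bin/bash
    # Remove rotated Hadoop service logs older than 7 days
    for d in /var/log/hadoop-hdfs /var/log/hadoop-yarn; do
        find "$d" -type f -name "*.log.*" -mtime +7 -delete
    done
    # Nightly logical backup of the MySQL replica (database name is illustrative)
    mysqldump --single-transaction hive_metastore | gzip > /backup/hive_metastore_$(date +%F).sql.gz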
Confidential, NJ
Sr. Hadoop Consultant
Responsibilities:
- Managed and administered multiple clusters using Cloudera Manager.
- Performed upgrades between different Hadoop versions.
- Managed OS configuration with Puppet.
- Collected Hadoop performance metrics and tuned the clusters, using monitoring tools such as Check_MK and Icinga.
- Imported data frequently from multiple RDBMS sources into HDFS using Sqoop (see the import sketch after this section).
- Supported the operations team in Hadoop cluster maintenance activities, including commissioning and decommissioning nodes and upgrades.
- Added and removed cluster nodes as required.
- Monitored and troubleshot Hadoop clusters using Ganglia and Nagios.
- Managed and reviewed Hadoop log files.
- Supported MapReduce programs and distributed applications running on the Hadoop cluster.
- Loaded files into Hive and HDFS from LAMP servers.
- Prepared a multi-cluster test harness to exercise the system for performance, failover, and upgrades.
- Ensured data integrity by checking for block corruption with 'fsck' and other Hadoop admin tools.
- Benchmarked and tuned the Hadoop cluster for performance.
- Enabled resource management for various business units using the Capacity Scheduler.
- Provided security for the Hadoop cluster with Kerberos.
- Maintained user provisioning across clusters; built automated provisioning scripts to create users in LDAP and principals in the KDC (a sketch follows this section).
- Administered large-scale Cloudera Hadoop environments, including design, cluster setup, performance tuning, and monitoring in an enterprise environment.
- Applied system administration and programming skills such as storage capacity management and performance tuning.
- Set up, configured, and managed security.
- Automated cluster node provisioning and repetitive tasks.
Environment: Cloudera CDH 5.4.x, Hadoop 2.6, CentOS 6.6, MapReduce, HDFS, Pig, Hive 1.1, Sqoop, HBase, ZooKeeper, Oozie, Shell Script.
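A sketch of the recurring Sqoop imports noted above; the connection string, table, and check column are hypothetical:

    # Incremental append import: only rows with order_id above the saved last-value are pulled
    sqoop import \
      --connect jdbc:mysql://dbhost.example.com/sales \
      --username etl_user -P \
      --table orders \
      --target-dir /data/raw/orders \
      --num-mappers 4 \
      --incremental append --check-column order_id --last-value 0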
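And a minimal sketch of the automated user-provisioning script, assuming an MIT KDC and OpenLDAP; the admin principal, realm, bind DN, and LDIF path are hypothetical:

    #!/bin/bash
    USER="$1"
    REALM="EXAMPLE.COM"
    # Create the Kerberos principal with a random key in the KDC
    kadmin -p admin/admin -q "addprinc -randkey ${USER}@${REALM}"
    # Add the matching account entry to LDAP from a prepared LDIF file
    ldapadd -x -D "cn=admin,dc=example,dc=com" -W -f "/tmp/${USER}.ldif"
    # Give the new user an HDFS home directory
    hdfs dfs -mkdir -p "/user/${USER}"
    hdfs dfs -chown "${USER}:${USER}" "/user/${USER}"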
Confidential, NC
Hadoop Administrator
Responsibilities:
- Supported and maintained existing Hadoop clusters; fine-tuned configurations for optimal utilization of cluster resources.
- Built, installed, configured, and managed Hadoop clusters using the Cloudera distribution.
- Helped developers troubleshoot and resolve connectivity issues to the Hadoop clusters from tools such as Talend and Tableau.
- Auto-configured the cluster using cluster automation tools.
- Managed day-to-day Hadoop performance tuning and capacity analysis.
- Configured the Fair Scheduler to provide service-level agreements for multiple users of a cluster (see the allocation-file sketch after this section).
- Practiced disaster recovery and business continuity across the Hadoop stack.
- Applied typical system administration and programming skills such as storage capacity management and performance tuning.
- Proficient in shell, Perl, and Python scripting.
- Set up, configured, and managed security for Hadoop clusters.
- Managed and monitored Hadoop cluster and platform infrastructure.
- Automated cluster node provisioning and repetitive tasks.
- Maintained the Hadoop stack support runbook.
- Responsible for cluster availability; provided 24x7 on-call support.
- Supported development and production deployments.
Environment: Cloudera CDH 4.5.x, Hadoop 2.0, MapReduce, HDFS, Pig, Hive 0.10, Sqoop, HBase, ZooKeeper, Oozie, Shell Script.
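A sketch of the Fair Scheduler allocation file mentioned above, written here from a shell heredoc; the queue names, weights, and minimums are illustrative, and the path must match the yarn.scheduler.fair.allocation.file setting:

    cat > /etc/hadoop/conf/fair-scheduler.xml <<'EOF'
    <?xml version="1.0"?>
    <allocations>
      <queue name="etl">
        <weight>2.0</weight>
        <minResources>8192 mb,4 vcores</minResources>
      </queue>
      <queue name="adhoc">
        <weight>1.0</weight>
      </queue>
    </allocations>
    EOF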
Confidential, San Diego, CA
Linux Administrator
Responsibilities:
- Configured network equipment for newly built servers.
- Installed and configured Solaris Zones.
- Administered databases on platforms such as Linux and Windows Server.
- Administered VERITAS and Sun clusters.
- Added disks, created slices, and administered file systems.
- Configured and administered NFS.
- Reconfigured kernels and tuned parameters (see the sysctl sketch after this section).
- Managed upgrades of Linux and Solaris servers.
- Created and troubleshot basic shell scripts.
Environment: RHEL 4.x, 5.5, 6.2, Windows 2007/2008/2010, Veritas Volume Manager (VxVM), Puppet, VMware, IBM System x3400 servers, Apache 2.0, NAS, Oracle 10g, Oracle RAC, JBoss 5.x.
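A sketch of typical Linux kernel parameter tuning from this role; the values are illustrative only and would be persisted in /etc/sysctl.conf and reloaded with sysctl -p:

    # Prefer reclaiming page cache over swapping application memory
    sysctl -w vm.swappiness=10
    # Deeper TCP accept queue for busy network services
    sysctl -w net.core.somaxconn=1024
    # Raise the system-wide open-file limit
    sysctl -w fs.file-max=655360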
Confidential
Sr. DW/BI Developer
Responsibilities:
- Studied the existing business model and customer requirements.
- Created dimensional models (logical and physical).
- Designed the ETL flow, i.e. the architectural design.
- Translated business requirements into ETL parallel jobs that maximize object reuse, parallelism, and performance using DataStage.
- Implemented the auditing and logging mechanism.
- Created complex SQL queries.
- Developed complex stored procedures to create various reports.
- Performed performance tuning and testing of stored procedures, indexes, and triggers.
- Designed and coded reports using SSRS.
- Deployed reports and created report schedules and subscriptions.
- Managed and secured reports using SSRS.
Environment: InfoSphere DataStage 7.5, Unix, Shell Script, SQL, PL/SQL, Oracle, MS SQL Server 2008, SSRS.
Confidential
DW/ BI Developer
Responsibilities:
- Designed the target schema definition and ETL jobs using DataStage.
- Used DataStage Director to view and clear logs and to validate jobs.
- Mapped data items from source systems to the target system.
- Participated in project review meetings.
- Tuned the performance of ETL jobs.
- Created stored procedures, views, tables, and constraints.
- Generated reports from the cubes by connecting to the Analysis server from SSRS.
- Designed and created report templates, bar graphs, and pie charts.
- Modified and enhanced existing SSRS reports.
Environment: MS SQL Server 2000 Enterprise, InfoSphere DataStage 7.5, Oracle, XML, Unix Shell Script.