Cassandra / Hadoop Administrator Resume
Melville, NY
SUMMARY
- 7 years of professional experience, including around 3 years in Big Data analytics as a Hadoop Administrator and 4 years as a Unix Administrator.
- Experience in configuring, installing, benchmarking and managing Apache Hadoop, Cloudera Hadoop, Hortonworks and MapR distributions.
- Experience in deploying scalable Hadoop clusters on cloud environments such as Amazon AWS and Rackspace, with Amazon S3 and S3N as the underlying file system for Hadoop.
- Experience in installing, configuring, supporting and monitoring Hadoop clusters using HDFS.
- Experience in designing and implementing secure Hadoop clusters using Kerberos.
- Experience with Hadoop CDH3, CDH4 and CDH5, as well as MapR.
- Experience in minor and major upgrades of Hadoop and the Hadoop ecosystem.
- Experience in installing and monitoring Hadoop cluster resources using Ganglia and Nagios.
- Experience in managing multi-tenant Cassandra clusters on a public cloud environment: Amazon Web Services (AWS) EC2.
- Experience in deploying and managing Hadoop clusters using Cloudera Manager.
- Experience in developing MapReduce programs and custom UDFs for data processing using Java.
- Expertise in implementing enterprise-level security using AD/LDAP, Kerberos, Knox, Sentry and Ranger.
- Experience in Linux admin activities on RHEL and CentOS.
- Experience in improving Hadoop cluster performance by tuning the OS kernel, storage, networking, and Hadoop HDFS and MapReduce configuration parameters.
- Experience in setting up monitoring tools like Nagios and Ganglia for Hadoop.
- Experience in importing and exporting data between relational databases and HDFS using Sqoop, and in collecting web logs into HDFS using Flume.
- Experience in using automation tools like Puppet and Chef.
- Experience in deploying Hadoop 2.0 (YARN).
- Experience in importing and exporting logs using Flume, and in optimizing the performance of HBase, Hive and Pig jobs.
- Experience in setting up High-Availability Hadoop clusters and in upgrading existing Hadoop clusters to the latest releases.
- Experience in deploying Hadoop clusters on public and private cloud environments like Amazon AWS and Rackspace.
- Setting up and maintaining NoSQL databases like Cassandra.
- Experience in using Pig, Hive, Sqoop and Zookeeper, and in securing Hadoop clusters with Kerberos.
- Experience in deploying Cassandra clusters (Apache Cassandra, DataStax) and monitoring them using OpsCenter.
- Experience in installing Hive and configuring the Hive metastore. Excellent skills in writing Hive queries for MapReduce and writing Impala queries on top of Hive schemas.
- Installation, patching, upgrading, tuning, configuring and troubleshooting of Linux-based operating systems (Red Hat and CentOS) and virtualization across a large set of servers.
- Good experience in setting up Linux environments: passwordless SSH (see the sketch following this list), creating file systems, disabling firewalls, tuning swappiness, configuring SELinux and installing Java.
- Excellent in communicating with clients, customers, managers, and other teams in the enterprise at all levels.
- Experience in Software Development Lifecycle (SDLC), application design, functional and technical specs, and use case development using UML.
- Excellent programming skills in writing and maintaining Oracle PL/SQL procedures, triggers, cursors, views and complex SQL queries.
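A minimal sketch of the passwordless-SSH setup referenced above, assuming a hypothetical "hadoop" service user and placeholder node names; actual hosts and key policies varied by cluster:

    # Generate a key pair on the admin node (no passphrase).
    ssh-keygen -t rsa -b 4096 -N "" -f ~/.ssh/id_rsa

    # Push the public key to each worker node (placeholder hostnames).
    for host in node01 node02 node03; do
        ssh-copy-id -i ~/.ssh/id_rsa.pub "hadoop@${host}"
    done

    # Verify: should print the remote hostname without a password prompt.
    ssh hadoop@node01 hostname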
TECHNICAL SKILLS
- Big Data Ecosystem: HDFS, Hadoop (Cloudera), MapReduce, Hortonworks, Pig, Hive, HBase, Sqoop, Zookeeper, Oozie, Hue, HCatalog, Storm, Kafka, Key-Value Store Indexer, Flume
- NoSQL Databases: HBase, Cassandra, Cloudera Impala, MongoDB
- Relational Databases: MySQL, Oracle 8i/9i/10g, SQL Server, PL/SQL
- Cluster Management and Monitoring: HDP Ambari, Cloudera Manager, Hue, SolrCloud
- Scripting and Automation: Shell scripting, HTML scripting, Puppet, Ansible
- Web and Application Servers: Apache Tomcat, JBoss, Apache HTTP Server
- Development Tools: NetBeans, Eclipse, Visual Studio, Microsoft SQL Server, MS Office
- Security and Monitoring: Kerberos, Nagios, Ganglia
- Languages and Frameworks: Java, HTML, MVC, Struts, Hibernate, Servlets, Spring, Web services
- Operating Systems: Windows XP/7/8, UNIX, Mac OS, MS-DOS
- Productivity Tools: MS Office, MS Project, MS Visio, MS Visual Studio 2003/2005/2008
PROFESSIONAL EXPERIENCE
Cassandra / Hadoop Administrator
Confidential - Melville, NY
Responsibilities:
- Installed and configured DataStax OpsCenter for Cassandra cluster maintenance and alerts.
- Responsible for cluster configuration, maintenance, troubleshooting and tuning.
- Collaborated with application teams to install operating system and Hadoop updates, patches and version upgrades when required; point of contact for vendor escalation.
- Upgraded Cloudera Manager from version 5.3 to 5.5.
- Built Cassandra clusters on both physical machines and AWS.
- Monitored workload, job performance and capacity planning using Cloudera Manager.
- Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
- Involved in requirements gathering and capacity planning for a four-data-center Cassandra cluster.
- Imported logs from web servers with Flume to ingest the data into HDFS.
- Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.
- Commissioned and decommissioned nodes on the CDH5 Hadoop cluster on Red Hat Linux.
- Expertise in capacity planning, configuration and operationalizing of small to medium-sized Big Data Hadoop clusters.
- Responsible for building scalable distributed data solutions using Hadoop.
- Optimized the Cassandra cluster by making changes in the Cassandra configuration file and Linux OS configuration.
- Built and deployed Hadoop clusters with different Hadoop components (HDFS, YARN, HBase and Zookeeper).
- Used Flume extensively to gather and move log data files from application servers to a central location in the Hadoop Distributed File System (HDFS).
- Installed and configured other open-source software like Pig, Hive, HBase, Flume and Sqoop.
- Implemented custom Flume interceptors to filter data and defined channel selectors to multiplex the data into different sinks (a configuration sketch follows this list).
- Worked on Cassandra data modeling, NoSQL architecture and DSE Cassandra databases. Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Implemented the Capacity Scheduler to share cluster resources among users' MapReduce jobs.
- Working as Cassandra Admin (DataStax DSE, DevOps, NoSQL DB) on a 39-node cluster.
- Performed rack-aware configuration, configured client machines and set up monitoring and management tools.
- Knowledge of installing and configuring Cloudera Hadoop in single-node and cluster environments.
- Worked on NoSQL databases including HBase, MongoDB and Cassandra; implemented a multi-data-center, multi-rack Cassandra cluster.
- Partitioned and queried the data in Hive for further analysis by the BI team; extended the functionality of Hive and Pig with custom UDFs and UDAFs.
- Worked on analyzing the Hadoop cluster and different big data analytic tools including Pig, the HBase database and Sqoop.
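A minimal sketch of the Flume multiplexing setup described above, assuming a hypothetical agent name, header field, interceptor class and HDFS paths; the custom interceptor itself (a Java class) is not reproduced here:

    # Hypothetical Flume agent properties (e.g. flume.conf).
    cat > flume.conf <<'EOF'
    agent1.sources  = src1
    agent1.channels = ch-errors ch-default
    agent1.sinks    = sink-errors sink-default

    agent1.sources.src1.type = exec
    agent1.sources.src1.command = tail -F /var/log/app/app.log
    agent1.sources.src1.channels = ch-errors ch-default

    # A custom interceptor (placeholder class) stamps a 'severity' header.
    agent1.sources.src1.interceptors = i1
    agent1.sources.src1.interceptors.i1.type = com.example.flume.SeverityInterceptor$Builder

    # The multiplexing channel selector routes events by that header.
    agent1.sources.src1.selector.type = multiplexing
    agent1.sources.src1.selector.header = severity
    agent1.sources.src1.selector.mapping.ERROR = ch-errors
    agent1.sources.src1.selector.default = ch-default

    agent1.channels.ch-errors.type = memory
    agent1.channels.ch-default.type = memory

    # Both sinks write to HDFS, under different paths.
    agent1.sinks.sink-errors.type = hdfs
    agent1.sinks.sink-errors.channel = ch-errors
    agent1.sinks.sink-errors.hdfs.path = hdfs://namenode:8020/logs/errors
    agent1.sinks.sink-default.type = hdfs
    agent1.sinks.sink-default.channel = ch-default
    agent1.sinks.sink-default.hdfs.path = hdfs://namenode:8020/logs/default
    EOF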
Environment: - Cloudera 4.3.2, HDFS, Cassandra 2.1, AWS, Hive, Sqoop, Zookeeper, HBase, Windows 2000/2003, Unix, Linux, Java, Pig, Flume, NoSQL, Oracle 9i/10g/11g RAC on Solaris/Red Hat, Big Data Cloudera CDH Apache Hadoop, Toad, SQL*Plus, Oracle Enterprise Manager (OEM), RMAN, shell scripting, Red Hat/SUSE Linux
Hadoop Admin
Confidential - Somerset, NJ
Responsibilities:
- Working on multiple projects spanning architecting Hadoop clusters and the installation, configuration and management of Hadoop clusters.
- Integrated Oozie with the rest of the Hadoop stack, supporting several types of Hadoop jobs out of the box (MapReduce, Pig, Hive, Sqoop) as well as system-specific jobs.
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, managing and reviewing data backups, and managing and reviewing Hadoop log files.
- Implemented Hadoop NameNode HA services to make Hadoop services highly available.
- Extensively used Hive/HQL queries to query or search for particular strings in Hive tables in HDFS.
- Supported site migration activities for WAS, PHP and static sites with the LAMP stack, MySQL, Oracle and other databases.
- Exported data from RDBMS to Hive and HDFS, and from Hive and HDFS back to RDBMS, using Sqoop (example commands follow this list).
- Integrated the Hadoop cluster with Kerberos authentication for secured authentication and authorization of the Hadoop cluster, and monitored connectivity.
- Performed performance tuning of infrastructure and Hadoop settings for optimal job performance and throughput.
- Worked on Cassandra data modeling, NoSQL architecture and DSE Cassandra databases.
- Tested configuring the Kerberos KDC (Key Distribution Center) and slave KDC, and added multiple realms to distinguish each Hadoop cluster (a KDC command sketch follows this list).
- Involved in migrating data from one Hadoop cluster to another.
- Built and deployed Hadoop clusters with different Hadoop components (HDFS, YARN, HBase and Zookeeper).
- Installed, set up and administered IBM BigInsights Hadoop clusters with Apache components: Ambari 2.1, HDFS, YARN, Spark, Flume, Kafka, HBase, Hive, Knox, Oozie, Pig, Solr, Sqoop, Zookeeper, etc.
- Performed benchmarking and performance tuning on the Hadoop infrastructure.
- Experience in designing and implementing a secure Hadoop cluster using Kerberos.
- Collaborated with application teams to install operating system and Hadoop updates, patches and version upgrades when required; point of contact for vendor escalation.
- Experience with Cloudera and Hortonworks, comparing their functionality with IBM BigInsights.
- Used Flume extensively to gather and move log data files from application servers to a central location in the Hadoop Distributed File System (HDFS).
- Wrote shell scripts to monitor the health of Hadoop daemon services and respond accordingly to any warning or failure conditions.
- Experience in managing and reviewing Hadoop log files.
- Responsible for installing, setting up and configuring Apache Kafka and Apache Zookeeper. Worked on setting up the Hadoop cluster for the production environment.
- Scheduled data pipelines for automation of data ingestion in AWS. Utilized the AWS framework for content storage and Elasticsearch for document search.
- Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.
- Performed performance tuning of Hadoop clusters and Hadoop MapReduce routines; screened Hadoop cluster job performance and handled capacity planning.
- Involved in loading data from UNIX file system to HDFS, configuring Hive and writing Hive UDFs.
- Responsible for building scalable distributed data solutions using Hadoop.
- Worked with the systems engineering team on the deployment process and on expansion of the existing Hadoop environment.
- Used Talend for Big Data integration to generate native code to work with Hadoop and Spark.
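Example Sqoop commands of the kind described above; the JDBC URL, credentials, table names and HDFS paths are placeholders:

    # Import an RDBMS table into HDFS.
    sqoop import \
      --connect jdbc:mysql://dbhost:3306/sales \
      --username etl_user -P \
      --table orders \
      --target-dir /data/sales/orders \
      --num-mappers 4

    # Export analyzed results from HDFS back to the RDBMS.
    sqoop export \
      --connect jdbc:mysql://dbhost:3306/sales \
      --username etl_user -P \
      --table order_summary \
      --export-dir /data/sales/order_summary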
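A brief sketch of the KDC administration involved, assuming MIT Kerberos with a placeholder realm, host and keytab path:

    # On the KDC: create a service principal and export its keytab.
    kadmin.local -q "addprinc -randkey hdfs/node01.example.com@EXAMPLE.COM"
    kadmin.local -q "ktadd -k /etc/security/keytabs/hdfs.service.keytab hdfs/node01.example.com@EXAMPLE.COM"

    # Verify the keytab and obtain a ticket from it.
    klist -kt /etc/security/keytabs/hdfs.service.keytab
    kinit -kt /etc/security/keytabs/hdfs.service.keytab hdfs/node01.example.com@EXAMPLE.COM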
Environment: - HDFS, MapReduce, Hive 1.1.0, Hue 3.9.0, Pig, Flume, Salt, Puppet, Chef, Oozie, Sqoop, CDH5, Apache Hadoop 2.6, Spark, Solr, Storm, Knox, Impala, Red Hat, MySQL and Oracle.
Hadoop Administrator
Confidential - Michigan
Responsibilities:
- Responsible for implementation and ongoing administration of the Hadoop infrastructure.
- Installed, configured and maintained Apache Hadoop clusters for application development, along with Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
- Configured and installed Hadoop with NameNode High Availability and multi-node clusters.
- Responsible for cluster maintenance, commissioning and decommissioning data nodes, cluster monitoring and troubleshooting, managing and reviewing data backups, and managing and reviewing Hadoop log files.
- Set up automated processes to analyze system and Hadoop log files for predefined errors and send alerts to the appropriate groups.
- Involved in implementing security on the Hortonworks Hadoop cluster using Kerberos, working with the operations team to move the non-secured cluster to a secured cluster.
- Aligned with the systems engineering team to propose and deploy new hardware and software environments required for Hadoop and to expand existing environments.
- Monitored systems and services; handled architecture design and implementation of Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures.
- Worked with Hadoop developers and designers to troubleshoot MapReduce job failures.
- Installed and configured Hadoop MapReduce and HDFS.
- Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
- Set up automated processes to archive/clean unwanted data on the cluster, in particular on the NameNode and Secondary NameNode.
- Experienced with DevOps tools like UrbanCode Deploy, Puppet, Chef, AWS and Azure.
- Upgraded Hadoop 1.x to 2.x with no rollback (certified versions and patches) and performed software installation.
- Implemented commissioning and decommissioning of data nodes, killing unresponsive TaskTrackers and dealing with blacklisted TaskTrackers (a decommissioning sketch follows this list).
- Installed and configured Hadoop and ecosystem components in Cloudera and Hortonworks environments.
- Teamed up with the infrastructure, network, database, application and business intelligence teams to guarantee high data quality and availability.
- Worked with the systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
- Efficient in installing, configuring and implementing LVM and RAID technologies using tools like Veritas Volume Manager and Solaris Volume Manager.
- Configured and deployed the Hive metastore using MySQL and the Thrift server (a configuration sketch follows this list).
- Worked with the Linux administration team to prepare and configure the systems to support Hadoop deployment.
- Created volume groups, logical volumes and partitions on the Linux servers and mounted file systems on the created partitions.
- Supported technical team members in management and review of Hadoop log files and data backups.
- Involved in implementing High Availability and automatic failover infrastructure, utilizing Zookeeper services, to overcome the NameNode single point of failure.
- Worked with the Linux server admin team in administering the server hardware and operating system.
- Periodically reviewed Hadoop-related logs, fixed errors, and prevented errors by analyzing warnings.
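A minimal sketch of the DataNode decommissioning flow mentioned above, assuming an exclude file configured via dfs.hosts.exclude in hdfs-site.xml (the path and hostname here are placeholders):

    # Add the node being retired to the exclude file.
    echo "node07.example.com" >> /etc/hadoop/conf/dfs.exclude

    # Have the NameNode re-read the include/exclude lists; the node
    # shows 'Decommission in progress' while its blocks re-replicate.
    hdfs dfsadmin -refreshNodes

    # Watch until the node reports 'Decommissioned', then power it off.
    hdfs dfsadmin -report | grep -A 3 node07.example.com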
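A brief sketch of the MySQL-backed Hive metastore configuration mentioned above; the host, database and credentials are placeholders, and the MySQL JDBC driver jar must be on Hive's classpath:

    # Hypothetical hive-site.xml fragment for the metastore connection.
    cat > hive-site-fragment.xml <<'EOF'
    <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <value>jdbc:mysql://dbhost:3306/metastore?createDatabaseIfNotExist=true</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionDriverName</name>
      <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionUserName</name>
      <value>hive</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionPassword</name>
      <value>hive_password</value>
    </property>
    EOF

    # Start the metastore Thrift service.
    hive --service metastore &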
Environment: - Big Data, HDFS, Pig, DevOps, Hive, YARN, Storm, HBase, MapReduce, Sqoop, Flume, Zookeeper, Hortonworks, Eclipse, MySQL, UNIX shell scripting.
Hadoop Admin
Confidential - Herndon, VA
Responsibilities:
- Built and maintained scalable data pipelines using the Hadoop ecosystem and other open-source components like Hive and HBase.
- Worked with data delivery teams to set up new Hadoop users: setting up Linux users, setting up Kerberos principals, and testing HDFS, Hive, Pig, Spark and MapReduce access for the new users.
- Worked extensively with Amazon Web Services and created Amazon Elastic MapReduce clusters on both versions 1.0.3 and 2.2.
- Installed and configured Hadoop; responsible for maintaining the cluster and managing and reviewing Hadoop log files.
- Set up automated processes to analyze system and Hadoop log files for predefined errors and send alerts to the appropriate groups.
- Balanced HDFS manually to decrease network utilization and increase job performance (a balancer one-liner follows this list).
- Monitored the Hadoop cluster through Cloudera Manager and implemented alerts based on error messages. Provided reports to management on cluster usage metrics and charged back customers on their usage.
- Participated in development and execution of system and disaster recovery processes.
- Wrote shell scripts to monitor the health of Hadoop daemon services and respond accordingly to any warning or failure conditions (a health-check sketch follows this list).
- Installation, patching, upgrading, tuning, configuring and troubleshooting of Linux-based operating systems (Red Hat and CentOS) and virtualization across a large set of servers.
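The manual HDFS balancing mentioned above is typically a one-liner; the threshold (percent deviation from average utilization) is a per-cluster judgment call:

    # Rebalance until no DataNode deviates more than 5% from the
    # cluster-average disk utilization.
    hdfs balancer -threshold 5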
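A minimal sketch of the daemon health-check scripting described above, assuming init.d-style service names and a placeholder alert address:

    #!/bin/bash
    # Mail an alert for any core Hadoop daemon that is not running.
    ALERT="hadoop-ops@example.com"
    for svc in hadoop-hdfs-namenode hadoop-hdfs-datanode hadoop-yarn-resourcemanager; do
        if ! service "$svc" status >/dev/null 2>&1; then
            echo "$(date): $svc is DOWN on $(hostname)" \
                | mail -s "Hadoop daemon alert: $svc" "$ALERT"
        fi
    done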
Environment: - CDH 4/5, Hive, Sqoop, Pig, Flume, Zookeeper, Falcon, Oozie, Tez, RHEL 5.6/ Ubuntu 12.04/CentOS 6.4, MongoDB.
Linux Admin
Confidential
Responsibilities:
- Configuration and administration of DNS, LDAP, NFS, NIS, NIS+ and Sendmail on Red Hat Linux/Debian servers.
- Hands-on experience working with production servers at multiple data centers.
- Involved in writing Bash and Perl scripts to migrate consumer data from one production server to another over the network.
- Installed and configured the monitoring tools Munin and Nagios to monitor network bandwidth and hard drive status.
- Good experience in installing and upgrading VMware. Automated server builds using SystemImager, PXE, Kickstart and Jumpstart.
- Planned, documented and supported high availability, data replication, business persistence, failover and failback using Veritas Cluster Server on Solaris, Red Hat Cluster Server on Linux and HP ServiceGuard in HP environments.
- Configured the Global File System (GFS) and the Zettabyte File System (ZFS). Troubleshot production servers using ipmitool to connect over SOL (Serial over LAN).
- Established and maintained network users, user environment, directories, and security.
- Configured the system imaging tools Clonezilla and SystemImager for data center migration. Configured a YUM repository server for installing packages from a centralized server (a configuration sketch follows below).
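A minimal sketch of a centralized YUM repository of the kind described above; the paths, URL and repo id are placeholders:

    # On the repo server: stage RPMs and generate repository metadata.
    mkdir -p /var/www/html/repo/centos/6/x86_64
    cp /staging/*.rpm /var/www/html/repo/centos/6/x86_64/
    createrepo /var/www/html/repo/centos/6/x86_64

    # On each client: point yum at the central server.
    cat > /etc/yum.repos.d/internal.repo <<'EOF'
    [internal]
    name=Internal CentOS repository
    baseurl=http://repohost.example.com/repo/centos/6/x86_64
    enabled=1
    gpgcheck=0
    EOF
    yum clean all && yum repolist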
Environment: - RHEL 5.x/4.x, Solaris 8/9/10, Nagios, Sun Fire, IBM blade servers, WebSphere 5.x/6.x, iPlanet, Oracle 11g/10g/9i, Logical Volume Manager, Veritas NetBackup 5.x/6.0, SAN multipathing, VMware ESX 4.1.
Linux System Administrator
Confidential
Responsibilities:
- Installing, configuring and upgrading Linux (primarily Red Hat and Ubuntu) and Windows servers.
- Day-to-day user access and permissions; installing and maintaining Linux servers.
- Installing and partitioning disk drives. Creating, mounting and maintaining file systems to ensure access to system, application and user data.
- Coordinating with customers and vendors for any system upgrade and providing the exact procedure to follow.
- Creating users, assigning groups and home directories, setting quotas and permissions; administering file systems and resolving file access problems; allocated disk space using EMC disk storage.
- Responsible for maintaining RAID groups and LUN assignments per agreed design documents. Performed all system administration tasks like cron jobs, installing packages and patches.
- Performed configuration and troubleshooting of services like NFS, NIS, NIS+, DHCP, FTP, LDAP and Apache web servers.
- Updating the YUM repository and Red Hat Package Manager (RPM).
- Set up LANs and WLANs; troubleshot network problems; configured routers and access points.
- Troubleshooting and fixing issues at the user, system and network levels using various tools and utilities. Scheduled backup jobs by implementing cron schedules during non-business hours (a cron sketch follows below).
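A brief sketch of a non-business-hours backup schedule of the kind described above; the script path and timing are placeholders:

    # Append a cron entry running a backup script at 02:30 daily
    # (fields: minute hour day-of-month month day-of-week).
    (crontab -l 2>/dev/null; echo "30 2 * * * /usr/local/bin/backup.sh") | crontab -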
Environment: - Red Hat Linux 3.9/4.5-4.7, Windows 2000/NT 4.0, Apache 1.3.36/1.2/2.0, IIS 4.0, Oracle 8i, Bash shell, Samba, DNS, PuTTY, WinSCP.
System Administrator
Confidential
Responsibilities:
- Installing, configuring and upgrading Linux (primarily Red Hat and Ubuntu) and Windows servers.
- Coordinating with customers and vendors for any system upgrade and providing the exact procedure to follow.
- Creating users, assigning groups and home directories, setting quotas and permissions; administering file systems and resolving file access problems.
- Installed CentOS using Preboot Execution Environment (PXE) boot and the Kickstart method on multiple servers; performed remote installation of Linux using PXE boot.
- Installed and verified that all AIX/Linux patches are applied to the servers.
- Responsible for maintaining RAID groups and LUN assignments per agreed design documents. Performed all system administration tasks like cron jobs, installing packages and patches.
- Configured multipath, added SAN storage and created physical volumes, volume groups and logical volumes (a command sketch follows this list).
- Maintained and installed RPM and YUM packages and performed other server management.
- Developed and optimized the physical design of MySQL database systems.
- Implemented new releases to add more functionality as per the requirements.
- Managing disk file systems, server performance, user creation, granting file access permissions and RAID configurations.
- Involved in tracking and resolving Production issues.
- Performed various configurations including networking and iptables, resolving hostnames and SSH keyless login.
- Performed RPM and YUM package installations, patching and other server management.
- Conducted various systems administration work in CentOS and Red Hat Linux environments.
- Configured the Domain Name System (DNS) for hostname-to-IP resolution.
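A minimal sketch of the SAN-to-LVM flow described above, assuming a placeholder multipath device, volume group and mount point:

    # Initialize the new LUN for LVM, build a volume group and a
    # logical volume, then create and mount a filesystem on it.
    pvcreate /dev/mapper/mpatha
    vgcreate vg_data /dev/mapper/mpatha
    lvcreate -n lv_data -L 100G vg_data
    mkfs.ext3 /dev/vg_data/lv_data
    mkdir -p /data && mount /dev/vg_data/lv_data /data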
Environment: - YUM, RAID, MySQL 5.1.4, PHP, shell scripting, MySQL Workbench, Linux 5.0/5.1.