Hadoop Administrator Resume
Richmond, VA
SUMMARY:
- Experienced and responsible System Administrator and Hadoop Administrator with a strong ability to support servers, applications, and Hadoop ecosystem components in existing cluster projects, seeking a position that allows me to apply my professional experience to solve technical needs and deliver deadline-driven projects.
- 7 years of IT experience, including 5+ years of experience with Hadoop, HDFS, MapReduce, and the Hadoop ecosystem (Pig, Hive, HBase, Oozie, Sqoop).
- Experience in installation, configuration, testing, backup, recovery, customization, and maintenance of clusters using Apache Hadoop and Cloudera Hadoop.
- Experience using Splunk to load log files into HDFS, and experience with file conversion and compression formats.
- Experience in capacity planning and analysis for Hadoop infrastructures/clusters.
- Imported and exported data into HDFS and Hive using Sqoop (a brief command sketch follows this summary).
- Integrated Hadoop clusters with Nagios and Ganglia for monitoring.
- Strong experience with BI tools such as Cognos, MicroStrategy, and Tableau; relational database systems such as Oracle (PL/SQL); and Unix shell scripting.
- Expertise with tools in the Hadoop ecosystem, including Pig, Hive, HDFS, MapReduce, Sqoop, Spark, Kafka, YARN, Oozie, and ZooKeeper.
- Experience in commissioning, decommissioning, balancing, and managing nodes, and in tuning servers for optimal cluster performance.
- Experience in designing both time-driven and data-driven automated workflows using Oozie.
- Optimized the performance of HBase, Hive, and Pig jobs.
- Hands-on experience with ZooKeeper and ZKFC in managing and configuring NameNode failover scenarios.
- Experience with Hadoop's multiple data processing engines, such as interactive SQL, real-time streaming, data science, and batch processing, handling data stored in a single platform under YARN.
- Experience in adding and removing nodes in Hadoop clusters, and in managing Hadoop clusters with IBM BigInsights and HDP.
- Experience integrating various data sources such as Oracle, DB2, Sybase, SQL Server, and MS Access, as well as non-relational sources such as flat files, into a staging area.
- Experience in Data Analysis, Data Cleansing (Scrubbing), Data Validation and Verification, Data Conversion, Data Migrations and Data Mining.
- Strong experience in Linux/UNIX administration, with expertise in Red Hat Enterprise Linux 4, 5, and 6; familiar with Solaris 9 & 10 and IBM AIX 6.
- Strong experience in System Administration, Installation, Upgrading, Patches, Migration, Configuration, Troubleshooting, Security, Backup, Disaster Recovery, Performance Monitoring and Fine-tuning on Linux (RHEL) systems.
- Installed, upgraded, and configured Linux servers using Kickstart as well as manual installations, including root password recovery.
- Experience creating and managing user accounts, security, and rights, along with disk space and process monitoring in Red Hat Linux.
- Experience in Shell scripting (bash, ksh) to automate system administration jobs.
- Utilized industry-standard tools for system management, with emphasis on SSH/SCP/SFTP.
- User/file management: adding and removing users and granting access rights on a server; changing permissions and ownership of files and directories; assigning special privileges to selected users; and scheduling system-related cron jobs.
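A minimal sketch of the Sqoop import/export usage noted above (the database host, credentials, and table names are hypothetical placeholders):

    # Import a table from a hypothetical MySQL source into HDFS and Hive
    sqoop import \
      --connect jdbc:mysql://dbhost:3306/sales \
      --username etl_user -P \
      --table orders \
      --hive-import --hive-table staging.orders \
      --num-mappers 4

    # Export processed results from HDFS back to the relational database
    sqoop export \
      --connect jdbc:mysql://dbhost:3306/sales \
      --username etl_user -P \
      --table order_summary \
      --export-dir /user/hive/warehouse/staging.db/order_summary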
TECHNICAL SKILLS:
Languages: Java, shell, Python, PowerShell
Databases: MySQL, SQL, MongoDB, Cassandra, Oracle
Methodologies: Agile, Waterfall
Hadoop ecosystem: HDFS, MapReduce, Hive, Pig, Sqoop, HBase, Knox, Ranger, ZooKeeper, Kafka, Splunk, Flume, Oozie, Spark
Operating Systems: RHEL, Linux, Windows, CentOS, Ubuntu, SUSE, Solaris, Mac
Web/App Servers: WebLogic, WebSphere, JBoss, Microsoft Azure, Apache, Tomcat, TFS, IIS, Nginx
Networks: NIS, NIS+, DNS, DHCP, Telnet, FTP, rlogin
Network Protocols: TCP/IP, PPP, SNMP, SMTP, DNS, NFSv2, NFSv3
PROFESSIONAL EXPERIENCE:
Hadoop Administrator
Confidential -Richmond, VA
Responsibilities:
- Experienced as an administrator on the MapR distribution (v1.2.00) for 6 clusters ranging from POC to PROD.
- Implemented and configured a quorum-based High Availability Hadoop cluster.
- Involved in managing and reviewing Hadoop log files.
- Implemented the Fair Scheduler on the JobTracker to share cluster resources among users' MapReduce jobs.
- Used Sqoop to import and export data from HDFS to RDBMS and vice-versa.
- Hands-on experience working with Hadoop ecosystem components like HDFS, MapReduce, YARN, ZooKeeper, Pig, Hive, Sqoop, and Flume.
- Worked on setting up high availability for the major production cluster and designed automatic failover control using ZooKeeper and quorum journal nodes (a brief command sketch follows this section).
- Effectively used the Oozie workflow engine to run multiple Hive and Pig jobs.
- Implemented rack-aware topology on the Hadoop cluster.
- Experience in using Flume to stream data into HDFS from various sources.
- Responsible for troubleshooting issues in the execution of Map Reduce jobs by inspecting and reviewing log files.
- Implemented Kerberos for authenticating all the services in the Hadoop cluster.
- Experience configuring ZooKeeper to coordinate the servers in the cluster and maintain data consistency.
- Created HBase tables to store data in various formats coming from different portfolios.
- Used Cloudera Manager for installation and management of the Hadoop cluster.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
- Worked on the disaster recovery plan for the Hadoop cluster by implementing cluster data backups.
- Involved in Commissioning and Decommissioning of nodes depending upon the amount of data.
- Automated the workflow using shell scripts.
- Performed performance tuning of Hive queries written by other developers.
- Installed and maintained a Puppet-based configuration management system.
- Excellent troubleshooting skills for production-level issues in the cluster and its functionality.
- Monitored workload, job performance, and capacity planning using Cloudera Manager.
Environment: Hadoop, Map reduce, YARN, Pig, Hive, HBase, Cassandra, Oozie, Zookeeper, HDFS, Sqoop, Flume, Spark, Kafka, Cloudera, Linux.
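A minimal sketch of the NameNode high-availability checks referenced above, assuming hypothetical NameNode service IDs nn1 and nn2 as defined in hdfs-site.xml:

    # Check which NameNode is currently active
    hdfs haadmin -getServiceState nn1
    hdfs haadmin -getServiceState nn2

    # Manually fail over from nn1 to nn2 during planned maintenance
    hdfs haadmin -failover nn1 nn2

    # Sanity-check overall HDFS health afterwards
    hdfs dfsadmin -report | head -n 20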
Hadoop Administrator
Confidential - Chicago, IL
Responsibilities:
- Specified the cluster size, allocated resource pools, and monitored jobs.
- Configured the Hive setup.
- Exported result sets from SQL Server to MySQL using Sqoop.
- Helped analysts with their Hive queries.
- Helped the team grow the cluster from 25 nodes to 40 nodes; configuration for the additional DataNodes was managed through Serengeti.
- Maintained system integrity of all sub-components across the nodes in the cluster.
- Monitored cluster health and cleaned up logs when required.
- Performed upgrades and configuration changes.
- Upgraded the Hadoop cluster from CDH3 to CDH4, set up the High Availability cluster, and integrated Hive with existing applications.
- Commissioned/decommissioned nodes as needed (a brief command sketch follows this section).
- Managed resources in a multi-tenant environment.
- Configured ZooKeeper while setting up the HA cluster.
- Implemented the Fair Scheduler on the JobTracker to share cluster resources among users' MapReduce jobs.
- Set up the compression for different volumes in the cluster.
- Developed MapReduce programs to perform analysis, and identified and recommended technical and operational improvements, resulting in improved reliability and efficiency of the cluster.
- Wrote MapReduce jobs for benchmark tests and automated them in a script.
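A minimal sketch of the node decommissioning flow mentioned above (the exclude-file path and hostname are hypothetical and must match the dfs.hosts.exclude setting in hdfs-site.xml):

    # Add the host being retired to the exclude file
    echo "datanode07.example.com" >> /etc/hadoop/conf/dfs.exclude

    # Ask the NameNode to re-read the include/exclude lists and begin decommissioning
    hdfs dfsadmin -refreshNodes

    # Watch progress until the node reports "Decommissioned", then shut it down
    hdfs dfsadmin -report | grep -A 3 datanode07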
Hadoop Administrator
Confidential - Dallas, TX
Responsibilities:
- Responsible for loading the customer's data and event logs from Kafka into HBase using REST API.
- Responsible for Cluster maintenance, adding and removing cluster nodes, Cluster Monitoring and Troubleshooting, Manage and review data backups and log files.
- Worked on debugging, performance tuning, and analyzing data using the Hadoop components Hive and Pig.
- Created Hive tables from JSON data using data serialization frameworks such as Avro.
- Implemented generic export framework for moving data from HDFS to RDBMS and vice-versa.
- Worked on installing the cluster, commissioning and decommissioning DataNodes, NameNode recovery, capacity planning, and slots configuration.
- Created Hive external tables, loaded data into them, and queried the data using HQL.
- Wrote shell scripts for rolling day-to-day processes and automated them.
- Worked on loading data from the Linux file system to HDFS.
- Created HBase tables to store PII data in various formats coming from different portfolios; implemented MapReduce jobs for loading data from an Oracle database into a NoSQL database.
- Used Cloudera Manager for installation and management of the Hadoop cluster.
- Moved data from Hadoop to Cassandra using the BulkOutputFormat class.
- Imported and exported data into HDFS and Hive using Sqoop.
- Automated all jobs that pull data from the FTP server and load it into Hive tables using Oozie workflows (a brief example follows this section).
- Responsible for processing unstructured data using Pig and Hive.
- Added nodes to the clusters and decommissioned nodes for maintenance.
- Developed Pig Latin scripts for extracting data.
- Extensively used Pig for data cleansing and wrote Hive queries for the analysts.
- Created Pig script jobs while keeping queries optimized.
- Worked on various Business Objects reporting functionalities such as Slice and Dice, Master/Detail, User Response functions, and different formulas.
- Strong experience with Apache server configuration.
Environment: Hadoop, HDFS, HBase, Pig, Hive, Oozie, MapReduce, Sqoop, Cloudera, Cassandra, Kafka, LINUX, Java APIs, Java collection, Windows.
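A minimal sketch of submitting the Oozie workflow described above from the shell (the Oozie URL and properties path are hypothetical examples):

    # Submit and run the FTP-to-Hive workflow
    oozie job -oozie http://oozie-host:11000/oozie \
              -config /home/etl/ftp_to_hive/job.properties -run

    # Check the status of the workflow by its job id
    oozie job -oozie http://oozie-host:11000/oozie -info "$JOB_ID"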
Hadoop Administrator
Confidential - Seattle, WA
Responsibilities:
- Supported MapReduce programs running on the cluster and used Pig Latin to analyze large-scale data.
- Involved in loading data from the UNIX file system to HDFS.
- Interacted with business users on regular basis to consolidate and analyze the requirements and presented them with design results.
- Involved in data visualization; provided the files the team required by analyzing the data in Hive, and developed Pig scripts for advanced analytics on the data.
- Created many user-defined routines, functions, and before/after subroutines that facilitated implementing complex logical solutions.
- Monitored Hadoop scripts that take input from HDFS and load the data into Hive.
- Worked on improving the performance by using various performance tuning strategies.
- Managed the evaluation of ETL and OLAP tools and recommended the most suitable solutions depending on business needs.
- Migrated jobs from development to test and production environments.
- Created external tables with proper partitions for efficiency and loaded them with the structured data in HDFS produced by MapReduce jobs.
- Involved in moving all log files generated from various sources to HDFS for further processing (a brief example follows this section).
- Used Shell Scripts for loading, unloading, validating and records auditing purposes.
- Used Teradata Aster bulk load feature to bulk load flat files to Aster.
- Used Aster UDFs to unload data from staging tables and client data for SCD that resided in the Aster database.
- Extensively used SQL and PL/SQL for development of procedures, functions, packages, and triggers.
Environment: Java, SQL, PL/SQL, Unix Shell Scripting, XML, Teradata Aster, Hive, Pig, Hadoop, MapReduce, ClearCase, HP-UX, Windows XP Professional.
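A minimal sketch of moving daily log files into HDFS as described above (the local and HDFS paths are hypothetical examples):

    # Copy the day's application logs into a date-partitioned HDFS directory
    DT=$(date +%Y-%m-%d)
    hdfs dfs -mkdir -p /data/raw/app_logs/dt=$DT
    hdfs dfs -put /var/log/app/*.log /data/raw/app_logs/dt=$DT/

    # Basic record audit: compare local and HDFS line counts before cleanup
    wc -l /var/log/app/*.log
    hdfs dfs -cat /data/raw/app_logs/dt=$DT/*.log | wc -l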
Linux System Administrator
Confidential - CA
Responsibilities:
- Created and cloned Linux virtual machines and templates using VMware Virtual Client 3.5, and migrated servers between ESX hosts.
- Managed routine system backups, scheduled jobs, enabled cron jobs, and enabled system and network logging of servers for maintenance (a brief example follows this section).
- Performed RPM and YUM package installations, patch and other server management.
- Installed and configured Logical Volume Manager - LVM and RAID.
- Documented all setup procedures and system-related policies (SOPs).
- Provided 24/7 technical support to Production and development environments.
- Administered DHCP, DNS, and NFS services in Linux.
- Created and maintained user accounts, profiles, security, and rights, as well as disk space and process monitoring.
- Provided technical support by troubleshooting Day-to-Day issues with various Servers on different platforms.
- Diagnosed and solved hardware and OS issues and provided root cause analysis.
- Ran prtdiag -v to make sure all memory and boards were online and to check for failures.
- Supported Linux and Sun Solaris Veritas clusters.
- Notified the server owner if there was a failover or crash, and also notified Unix/Linux Server Support L3.
- Checked for core files and, if any existed, sent them to Unix/Linux Server Support for core file analysis.
- Monitored CPU loads, restarted processes, and checked file systems.
- Installing, Upgrading and applying patches for UNIX, Red Hat/ Linux, and Windows Servers in a clustered and non-clustered environment.
- Helped install systems using Kickstart.
- Installation & maintenance of Windows 2000 & XP Professional, DNS and DHCP and WINS for the Bear Stearns DOMAIN.
- Used LDAP to authenticate users in Apache and other user applications.
- Remote Administration using terminal service, VNC and PCAnywhere.
- Created/removed Windows accounts using Active Directory.
- Reset user passwords on Windows Server 2003 using the dsmod command-line tool.
- Provided end-user technical support for applications.
- Maintained, created, and updated documentation.
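A minimal sketch of the routine account and cron administration referenced above (the username, group, and backup script path are hypothetical examples):

    # Create an account with a home directory and add it to an existing group
    useradd -m -G webadmins jdoe
    passwd jdoe

    # Append a nightly backup job to the current user's crontab
    (crontab -l 2>/dev/null; \
     echo "30 2 * * * /usr/local/bin/backup_etc.sh >> /var/log/backup_etc.log 2>&1") | crontab -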