
Hadoop Kafka Admin Resume


Oak Brook, IL

SUMMARY

  • Around 7 years of hands-on experience in deploying and managing multi-node development, testing and production Hadoop clusters with different Hadoop components (HIVE, PIG, SQOOP, OOZIE, FLUME, HCATALOG, ZOOKEEPER, HBASE) using Cloudera Manager and Hortonworks Ambari.
  • In-depth knowledge of the Hadoop ecosystem - HDFS, Yarn, MapReduce, Hive, Hue, Sqoop, Flume, Kafka, Spark, Oozie, NiFi and Cassandra.
  • Experience with Ambari (Hortonworks) for management of the Hadoop ecosystem.
  • Experience in performing minor and major upgrades.
  • Experience in performing commissioning and decommissioning of data nodes on Hadoop cluster.
  • Strong knowledge in configuring NameNode High Availability and NameNode Federation.
  • Familiar with writing Oozie workflows and Job Controllers for job automation - shell, Hive and Sqoop automation.

TECHNICAL SKILLS

  • HDFS
  • Yarn
  • MapReduce
  • Hive
  • Hue
  • Sqoop
  • Flume
  • Kafka
  • Spark
  • Oozie
  • NiFi
  • Cassandra

PROFESSIONAL EXPERIENCE

Confidential

Hadoop Kafka Admin

Responsibilities:

  • Set up and configured the Kafka environment on Windows from scratch and monitored it.
  • Created a data pipeline through Kafka connecting two different client applications, SEQUENTRA and LEASE ACCELERATOR.
  • Implemented and managed the DevOps infrastructure architecture (Terraform, Jenkins, Puppet and Ansible); responsible for CI/CD infrastructure, process and deployment strategy.
  • Expertise in Ansible playbooks and AWX deployments.
  • Expert in using Ant scripts, Make and Maven for the build process. Experience implementing Continuous Integration through Jenkins and deployment using CI tools like Chef/Ansible.
  • Worked on creation of Ansible manifest files to install tomcat instances and to manage configuration files for multiple applications.
  • Worked on setting up 3 Instances in UAT/STAGING environment and 5 Instances in Production environment.
  • Responsible for building components to connect to other micro-services using Kafka, Elasticsearch and REST.
  • Developed plugins to de-serialize data in non-native Kafka environments.
  • Developed machine learning algorithms for the internal search engine.
  • Maintaining and troubleshooting network connectivity
  • Managed patch configuration, version control and service packs, and reviewed connectivity issues related to security problems.
  • Monitoring using the ELK stack, i.e. Elasticsearch, Logstash and Kibana.
  • Knowledge of working with on-premise servers as well as cloud-based servers.
  • Hands-on experience in standing up and administering an on-premise Kafka platform.
  • Created backups for all the instances in the Kafka environment.
  • Experience managing Kafka clusters in both Windows and Linux environments.
  • Involved in the maintenance of WebSphere MQ on different platforms and setting up the development, testing and production environments.
  • Participated in all MQ administration, managing clusters in both GUI and command mode.
  • Participated in SSL configuration on WebSphere MQ for security.
  • Experienced in upgrading WebSphere MQ on different platforms by applying Fix Packs and installing Support Packs.
  • Installed and configured WebSphere MQ Series on the UNIX platform.
  • Apache ActiveMQ is a popular and powerful open source messaging and Enterprise Integration Patterns provider. It is fast, supports many cross-language clients and protocols, comes with easy-to-use Enterprise Integration Patterns and many advanced features, and fully supports JMS 1.1 and J2EE 1.4.
  • RabbitMQ is a “traditional” message broker that implements a variety of messaging protocols. It was one of the first open source message brokers to achieve a reasonable level of features, client libraries, dev tools and quality documentation. RabbitMQ was originally developed to implement AMQP, an open wire protocol for messaging with powerful routing features. While Java has messaging standards like JMS, they are not helpful for non-Java applications that need distributed messaging, which is severely limiting to any integration scenario, microservice or monolithic. With the advent of AMQP, cross-language flexibility became real for open source message brokers.
  • Knowledge of Kafka API.
  • Designed and implemented topic configuration in the new Kafka cluster in all environments.
  • Exposure to and knowledge of managing streaming platforms on cloud providers (Azure, AWS & EMC).
  • Worked efficiently with tools/instances including, but not limited to: Kafka, ZooKeeper, Console Producer, Console Consumer, Kafka Tool, Filebeat, Metricbeat, Elasticsearch, Logstash, Kibana, Spring Tool Suite, Apache Tomcat Server, etc.
  • Operations - Worked on Enabling JMX metrics.
  • Operations - Involved with data cleanup for JSON and XML responses that were generated.
  • Successfully secured the Kafka cluster with Kerberos. Implemented Kafka security features using SSL both with and without Kerberos; for finer-grained security, set up Kerberos users and groups to enable more advanced security features.
  • Integrated Apache Kafka for data ingestion
  • Successfully generated consumer group lags from Kafka using its API; Kafka was used for building real-time data pipelines between clusters.
  • Created POCs for multiple use cases related to CBRE's home-built application SEQUENTRA and the client application LEASE ACCELERATOR.
  • Complete knowledge regarding Elasticsearch, Logstash and Kibana.
  • Installed a Hadoop cluster and worked with big data analysis tools including Hive.
  • Created and wrote shell scripts (ksh, Bash), Ruby, Python and PowerShell scripts for setting up baselines, branching, merging and automation processes across the environments, using SCM tools like Git, Subversion (SVN), Stash and TFS on Linux and Windows platforms.
  • Designed, built and managed the ELK (Elasticsearch, Logstash, Kibana) cluster for centralized logging and search functionality for the application; responsible for designing and deploying new ELK clusters (Elasticsearch, Logstash, Kibana, Beats, Kafka, ZooKeeper, etc.). Installed a Kerberos-secured Kafka cluster with no encryption on Dev and Prod, and set up Kafka ACLs on it.
  • Set up a no-authentication Kafka listener in parallel with the Kerberos (SASL) listener, and tested a non-authenticated (anonymous) user in parallel with a Kerberos user.
  • Installed Ranger in all environments for Second Level of security in Kafka Broker.
  • Involved in Data Ingestion Process to Production cluster.
  • Worked on Oozie Job Scheduler
  • Worked on the Spark transformation process, RDD operations and DataFrames, and validated the Spark plug-in for the Avro data format (receiving gzip-compressed data and producing Avro data into HDFS files).
  • Installed Docker for utilizing ELK, Influxdb, and Kerberos.
  • Involved in defining test automation strategy and test scenarios, created automated test cases, test plans and executed tests using Selenium WebDriver and JAVA. Architected Selenium framework which has integrations for API automation, database automation and mobile automation.
  • Executed and maintained Selenium test automation scripts.
  • Created a database in InfluxDB, worked on the interface created for Kafka, and checked the measurements in the databases.
  • Created Bash scripts with awk-formatted text to send metrics to InfluxDB (a small sketch appears after this list).
  • Enabled InfluxDB and configured the InfluxDB data source in the Grafana interface.
  • Deployed Elasticsearch 5.3.0 and InfluxDB 1.2 on the Prod machine in a Docker container.
  • Created a cron job that executes a program to start the ingestion process; the data is read in, converted to Avro, and written to HDFS files.
  • Upgraded HDP 2.5 to 2.6 in all environments; handled software patches and upgrades.
  • Worked on the Kafka backup index, minimized logs via the Log4j appender, and pointed Ambari server logs to NAS storage.
  • Deployed a data lake cluster with Hortonworks Ambari on AWS using EC2 and S3.
  • Installed the Apache Kafka cluster and Confluent Kafka open source in different environments.
  • Kafka open source or the Confluent version can be installed on both Windows and Linux/Unix systems.
  • Implemented a real-time log analytics pipeline using Confluent Kafka, Storm, Elasticsearch, Logstash, Kibana and Greenplum.
  • Installation requires JDK 1.8 or later, accessible to the entire box.
  • Download the Apache Kafka open source distribution and Apache ZooKeeper and configure them on the box where the cluster will run. Once both Kafka and ZooKeeper are up and running, topics can be created and data can be produced and consumed; a minimal bring-up sketch appears after this list. To secure the cluster, plug in the security configuration with SSL encryption, SASL authentication and ACLs.
  • Finally, create the backups, add clients and configs, patch and monitor.
  • The initial design can start with a single-node or three-node cluster, adding nodes wherever required.
  • Typical node sizing is 24 CPU cores, 32/64 GB RAM and 500 GB (at minimum) to 2 TB of storage.
  • The platform is used for functional flow of data in parallel processing and as a distributed streaming platform.
  • Kafka replaces the traditional pub-sub model with ease, fault tolerance, high throughput and low latency.
  • Installed and developed different POCs for different application/infrastructure teams in both Apache Kafka and Confluent open source for multiple clients.
  • Installing, monitoring and maintaining the clusters in all environments.
  • Installed single-node single-broker and multi-node multi-broker clusters, encrypted with SSL/TLS and authenticated with SASL/PLAINTEXT, SASL/SCRAM and SASL/GSSAPI (Kerberos); a broker listener sketch appears after this list.
  • Integrated topic-level security; the cluster is fully up and running 24/7.
  • Installed Confluent Enterprise in Docker and Kubernetes in an 18-node cluster.
  • Installed Confluent Kafka, applied security to it and monitored it with Confluent Control Center.
  • Involved in clustering with Cloudera and Hortonworks without exposing ZooKeeper; provided the cluster to end users, who communicate using Kafka Connect.
  • Set up redundancy for the cluster, used monitoring tools like Yahoo Kafka Manager, and performed performance tuning to deliver data in real time without latency issues.
  • Supported and worked with the Docker team to install the Apache Kafka cluster in multi-node mode and enabled security in the DEV environment.
  • Worked on disk space issues in the production environment by monitoring how fast space fills and reviewing what is being logged, and created a long-term fix for the issue (minimizing Info, Debug, Fatal and Audit logs).
  • Installed Kafka Manager for consumer lags and for monitoring Kafka metrics; it was also used for adding topics, partitions, etc.
  • Installed Confluent Kafka open source and enterprise editions on Kubernetes using Helm charts on a 10-node cluster, applied SASL/PLAIN and SASL/SCRAM security, and exposed the cluster for outside access.
  • Successfully secured the Kafka cluster with SASL/PLAINTEXT, SASL/SCRAM and SASL/GSSAPI (Kerberos).
  • Worked on analyzing the Hadoop cluster and different big data analytic tools including Pig, Hive and Sqoop. Created a POC on Hortonworks and suggested best practices in terms of the HDP and HDF platforms.
  • Set up Hortonworks infrastructure, from configuring clusters down to the node level.
  • Installed the Ambari server in the cloud.
  • Set up security using Kerberos and AD on Hortonworks clusters/Cloudera CDH.
  • Assigned access to users and managed multi-user logins.
  • Tested all services such as Hadoop, ZooKeeper, Spark, HiveServer and Hive Metastore.
  • Worked on SNMP trap issues in the production cluster. Worked on heap optimization and changed some of the configurations for hardware optimization.
  • Involved in working with production Ambari Views.
  • Implemented Rack Awareness in Production Environment.
  • Worked on Nagios Monitoring tool.
  • Worked with the Hortonworks support team on Grafana consumer lag issues (currently no consumer lags are being generated in the Grafana visualization within HDP).
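
A minimal bring-up sketch for the open-source Kafka installation outlined above, assuming a single-node cluster with ZooKeeper; the install path, version, topic and group names are illustrative, not values from the original engagement:

    # start ZooKeeper and a single Kafka broker (path/version are examples)
    cd /opt/kafka_2.12-2.3.0
    bin/zookeeper-server-start.sh -daemon config/zookeeper.properties
    bin/kafka-server-start.sh -daemon config/server.properties

    # create a topic, then produce and consume a few records
    bin/kafka-topics.sh --create --zookeeper localhost:2181 \
      --replication-factor 1 --partitions 3 --topic demo-topic
    bin/kafka-console-producer.sh --broker-list localhost:9092 --topic demo-topic
    bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
      --topic demo-topic --from-beginning

    # check consumer group lag (the consumer group lags mentioned above)
    bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
      --describe --group demo-group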
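
A hedged sketch of broker-side listener settings for the SSL/TLS plus SASL/SCRAM setup mentioned above; hostnames, ports, keystore paths and the user name are placeholders:

    # append a SASL_SSL listener to the broker config (placeholder values)
    printf '%s\n' \
      'listeners=SASL_SSL://broker1.example.com:9093' \
      'security.inter.broker.protocol=SASL_SSL' \
      'sasl.mechanism.inter.broker.protocol=SCRAM-SHA-256' \
      'sasl.enabled.mechanisms=SCRAM-SHA-256' \
      'ssl.keystore.location=/etc/kafka/ssl/kafka.broker.keystore.jks' \
      'ssl.keystore.password=changeit' \
      'ssl.truststore.location=/etc/kafka/ssl/kafka.broker.truststore.jks' \
      'ssl.truststore.password=changeit' \
      >> /opt/kafka/config/server.properties

    # create SCRAM credentials for a client user (stored in ZooKeeper)
    /opt/kafka/bin/kafka-configs.sh --zookeeper localhost:2181 --alter \
      --add-config 'SCRAM-SHA-256=[password=secret]' \
      --entity-type users --entity-name appuser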
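
A small illustrative sketch of the Bash/awk approach used to push metrics to InfluxDB over its HTTP line-protocol endpoint; the URL, database name, measurement and monitored path are hypothetical:

    #!/usr/bin/env bash
    # push broker disk usage to InfluxDB as line protocol (names are examples)
    INFLUX_URL="http://influxdb.example.com:8086/write?db=kafka_metrics"

    df -P /var/kafka-logs | awk -v host="$(hostname)" 'NR==2 {
      # line protocol: measurement,tag=value field=value
      printf "disk_usage,host=%s used_pct=%s\n", host, substr($5, 1, length($5)-1)
    }' | curl -s -XPOST "$INFLUX_URL" --data-binary @-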

Confidential - Oak Brook IL

Hadoop Admin

Responsibilities:

  • Created stored procedures in MySQL Server to perform result-oriented tasks
  • Managed and reviewed Hadoop Log files as a part of administration for troubleshooting purposes. Communicate and escalate issues appropriately.
  • Responsible for designing a highly scalable big data cluster to support various data storage and computation across varied big data clusters - Hadoop, Cassandra, MongoDB & Elasticsearch.
  • Responsible for installing, configuring, supporting and managing of Cloudera Hadoop Clusters.
  • Installed a Kerberos-secured Kafka cluster with no encryption on the POC and also set up Kafka ACLs.
  • Created a NoSQL solution for a legacy RDBMS using Kafka, Spark, Solr and HBase Indexer for ingestion, and Solr and HBase for real-time querying.
  • Experienced in administering, installing, upgrading and managing distributions of Hadoop clusters with MapR 5.1 on a cluster of 200+ nodes in different environments such as Development, Test and Production (Operational & Analytics).
  • Troubleshot issues in the execution of MapReduce jobs by inspecting and reviewing log files.
  • Worked on implementing a Cassandra NoSQL database cluster.
  • Extensively worked on Elasticsearch querying and indexing to retrieve documents at high speed.
  • Installed, configured, and maintained several Hadoop clusters which includes HDFS, YARN, Hive, HBase, Knox, Kafka, Oozie, Ranger, Atlas, Infra Solr, Zookeeper, and Nifi in Kerberized environments.
  • Installed and configured Hadoop, MapReduce and HDFS (Hadoop Distributed File System); developed multiple MapReduce jobs in Java for data cleaning.
  • Experience in managing the Hadoop cluster with IBM BigInsights and the Hortonworks Data Platform.
  • Regular maintenance of commissioning/decommissioning nodes as disk failures occur, using the MapR file system.
  • Worked on installing cluster, commissioning & decommissioning of Data Nodes, Name Node recovery, capacity planning, and slots configuration in MapR Control System (MCS).
  • Experience in innovative, and where possible, automated approaches for system administration tasks.
  • Worked on setting up high availability for the major production cluster and designed automatic failover control using ZooKeeper and quorum journal nodes.
  • Used Spark Streaming to divide streaming data into batches as input to the Spark engine for batch processing. Mentored the EQM team in creating Hive queries to test use cases.
  • Sqoop configuration of JDBC drivers for the respective relational databases, controlling parallelism, the distributed cache and the import process, compression codecs, importing data to Hive and HBase, incremental imports, configuring saved jobs and passwords, free-form query options and troubleshooting (an import sketch appears after this list).
  • Performed data blending of Cloudera Impala and TeraData ODBC data source in Tableau.
  • Collected and aggregated large amounts of streaming data into HDFS using Flume: configured multiple agents with Flume sources, sinks, channels and interceptors, defined channel selectors to multiplex data into different sinks, and set log4j properties (a sample agent configuration appears after this list).
  • Responsible for implementation and ongoing administration of MapR 4.0.1 infrastructure.
  • Maintained the operations, installation and configuration of a 150+ node cluster with the MapR distribution.
  • Monitored the health of the cluster and set up alert scripts for memory usage on the edge nodes.
  • Commissioned and decommissioned DataNodes in the cluster in case of problems.
  • Debugged and solved major issues with Cloudera Manager by interacting with the Cloudera team.
  • To analyze data migrated to HDFS, used Hive data warehouse tool and developed Hive queries.
  • Monitor Hadoop cluster job performance and capacity planning.
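
A sketch of the kind of Sqoop incremental import described above; the JDBC connection string, credentials, table, check column and target directory are placeholders:

    # incremental append import from an RDBMS into HDFS (placeholder values)
    sqoop import \
      --connect jdbc:mysql://db.example.com:3306/sales \
      --username etl_user --password-file /user/etl/.db_password \
      --table orders \
      --target-dir /data/raw/orders \
      --incremental append --check-column order_id --last-value 0 \
      --num-mappers 4 \
      --compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec

    # saved job, so Sqoop tracks the last-value between scheduled runs
    sqoop job --create orders_incremental -- import \
      --connect jdbc:mysql://db.example.com:3306/sales \
      --username etl_user --password-file /user/etl/.db_password \
      --table orders --target-dir /data/raw/orders \
      --incremental append --check-column order_id --last-value 0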
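
A hedged sample of a single Flume agent definition (spooling-directory source, memory channel, HDFS sink) of the kind described above; the agent name, directories and HDFS path are illustrative:

    # write a one-agent Flume configuration, then start the agent
    printf '%s\n' \
      'agent1.sources = src1' \
      'agent1.channels = ch1' \
      'agent1.sinks = hdfs1' \
      'agent1.sources.src1.type = spooldir' \
      'agent1.sources.src1.spoolDir = /var/log/incoming' \
      'agent1.sources.src1.channels = ch1' \
      'agent1.channels.ch1.type = memory' \
      'agent1.channels.ch1.capacity = 10000' \
      'agent1.sinks.hdfs1.type = hdfs' \
      'agent1.sinks.hdfs1.channel = ch1' \
      'agent1.sinks.hdfs1.hdfs.path = /data/flume/events/%Y-%m-%d' \
      'agent1.sinks.hdfs1.hdfs.useLocalTimeStamp = true' \
      'agent1.sinks.hdfs1.hdfs.fileType = DataStream' \
      'agent1.sinks.hdfs1.hdfs.rollInterval = 300' \
      > /etc/flume/conf/agent1.conf

    flume-ng agent --name agent1 --conf /etc/flume/conf \
      --conf-file /etc/flume/conf/agent1.conf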

Confidential - San Jose, CA

Hadoop Admin

Responsibilities:

  • Installation and configuration of Linux for new build environment.
  • Day-to-day user access and permissions; installing and maintaining Linux servers.
  • Created volume groups, logical volumes and partitions on the Linux servers and mounted the file systems.
  • Experienced in the installation and configuration of Cloudera CDH4 in the testing environment.
  • Resolved tickets submitted by users and P1 issues, troubleshooting and resolving the errors.
  • Validated web services manually and through groovy script automation using SOAP UI.
  • Implemented end-to-end automation tests by consuming the APIs of different layers.
  • Involved in using Postman tool to test SOA based architecture for testing SOAP services and REST API.
  • Used Maven to build and run the Selenium automation framework.
  • The framework was used to send the automation reports over email.
  • Balanced HDFS manually to decrease network utilization and increase job performance (see the balancing sketch after this list).
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Performed major and minor upgrades to the Hadoop cluster.
  • Upgraded the Cloudera Hadoop ecosystems in the cluster using Cloudera distribution packages.
  • Used Sqoop to import and export data between HDFS and RDBMS.
  • Performed stress and performance testing and benchmarking for the cluster.
  • Commissioned and decommissioned DataNodes in the cluster in case of problems.
  • Debugged and solved major issues with Cloudera Manager by interacting with the Cloudera team.
  • Involved in estimation and setting up the Hadoop cluster on Linux.
  • Prepared PIG scripts to validate Time Series Rollup Algorithm.
  • Responsible for support and troubleshooting of MapReduce jobs and Pig jobs, and maintaining incremental loads on a daily, weekly and monthly basis.
  • Implemented Oozie workflows for MapReduce, Hive and Sqoop actions.
  • Channeled MapReduce outputs based on requirements using partitioners.
  • Performed scheduled backup and necessary restoration.
  • Built and maintained scalable data solutions using the Hadoop ecosystem and other open source components like Hive and HBase.
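
A short sketch of the manual HDFS balancing and DataNode decommissioning steps mentioned above, assuming the exclude file is the one configured as dfs.hosts.exclude; host names and paths are examples:

    # cap balancer bandwidth so jobs are not starved, then rebalance until no
    # DataNode deviates more than 5% from average utilization
    hdfs dfsadmin -setBalancerBandwidth 104857600   # 100 MB/s per DataNode
    hdfs balancer -threshold 5

    # decommission a DataNode: add it to the exclude file, then refresh
    echo "dn07.example.com" >> /etc/hadoop/conf/dfs.exclude
    hdfs dfsadmin -refreshNodes
    hdfs dfsadmin -report   # wait for "Decommission Status : Decommissioned"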

Confidential -Glendale, CA

Linux/Unix Administrator

Responsibilities:

  • Experience installing, upgrading and configuring RedHat Linux 4.x, 5.x, 6.x using Kickstart Servers and Interactive Installation
  • Responsible for creating and managing user accounts, security, rights, disk space and process monitoring in Solaris, CentOS and Redhat Linux.
  • Experience in writing Scripts in Bash for performing automation of various tasks.
  • Experience in writing Shell scripts using bash for process automation of databases, applications, backup and scheduling to reduce both human intervention and man hours.
  • Remote system administration via tools like SSH and Telnet
  • Extensive use of crontab for job automation.
  • Installed & configured Selenium WebDriver, TestNG and the Maven tool, and created Selenium automation scripts in Java using TestNG prior to the next quarter release.
  • Developed Python Scripts (automation scripts) for stability testing.
  • Experience administering, installing, configuring and maintaining Linux
  • Created Linux virtual machines using VMware Virtual Center; administered VMware Infrastructure Client 3.5 and vSphere 4.1.
  • Installs Firmware Upgrades, kernel patches, systems configuration, performance tuning on Unix/Linux systems
  • Installing Red Hat Linux 5/6 using kickstart servers and interactive installation.
  • Supporting infrastructure environment comprising of RHEL and Solaris.
  • Installation, Configuration, and OS upgrades on RHEL 5.X/6.X/7.X, SUSE 11.X, 12.X.
  • Implemented and administered VMware ESX 4.x, 5.x and 6 for running the Windows, CentOS, SUSE and Red Hat Linux servers on development and test servers.
  • Created, extended, reduced and administered Logical Volume Manager (LVM) volumes in the RHEL environment (a sample workflow appears after this list).
  • Responsible for large-scale Puppet implementation and maintenance. Puppet manifests creation, testing and implementation.
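
A brief LVM workflow sketch matching the create/extend administration mentioned above; device, volume group and logical volume names and sizes are placeholders:

    # create a volume group and logical volume, then mount it
    pvcreate /dev/sdb1
    vgcreate vg_data /dev/sdb1
    lvcreate -L 50G -n lv_app vg_data
    mkfs.ext4 /dev/vg_data/lv_app
    mount /dev/vg_data/lv_app /opt/app

    # extend the logical volume and grow the ext4 filesystem online
    lvextend -L +20G /dev/vg_data/lv_app
    resize2fs /dev/vg_data/lv_app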

Confidential

Linux/Unix Administrator

Responsibilities:

  • Experience installing, upgrading and configuring RedHat Linux 4.x, 5.x, 6.x using Kickstart Servers and Interactive Installation
  • Responsible for creating and managing user accounts, security, rights, disk space and process monitoring in Solaris, CentOS and Redhat Linux
  • Performed administration and monitored job processes using the associated commands; managed routine system backups, scheduling jobs and enabling cron jobs (a cron sketch appears after this list).
  • Maintaining and troubleshooting network connectivity
  • Managed patch configuration, version control and service packs, and reviewed connectivity issues related to security problems.
  • Configured DNS, NFS, FTP, remote access, security management and server hardening.
  • Installs, upgrades and manages packages via RPM and YUM package management
  • Logical Volume Management maintenance
  • Experience administering, installing, configuring and maintaining Linux
  • Creates Linux Virtual Machines using VMware Virtual Center
  • Administers VMware Infrastructure Client 3.5 and vSphere 4.1
  • Installs Firmware Upgrades, kernel patches, systems configuration, performance tuning on Unix/Linux systems
  • Puppet implementation and maintenance. Puppet manifests creation, testing and implementation.
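
An illustrative cron-scheduled backup script of the kind used for the routine backups noted above; the destination mount, retention period and schedule are assumptions:

    #!/usr/bin/env bash
    # nightly_backup.sh: archive /etc and /home to a backup mount, keep 14 days
    set -euo pipefail
    DEST=/mnt/backups/$(hostname -s)
    mkdir -p "$DEST"
    tar -czf "$DEST/backup-$(date +%F).tar.gz" /etc /home
    find "$DEST" -name 'backup-*.tar.gz' -mtime +14 -delete

    # schedule from root's crontab (crontab -e):
    #   30 1 * * * /usr/local/sbin/nightly_backup.sh >> /var/log/nightly_backup.log 2>&1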
