ELK Stack SME Resume
Sunnyvale, CA
SUMMARY:
- 7+ years of professional experience in full life cycle system development and administration, including 3.5 years of experience with the ELK Stack.
- Experienced in working with Pig, Hive, Sqoop and MapReduce.
- Extensive experience with HBase and Flume.
- Experience installing and developing on ELK.
- Designed highly loaded, distributed, horizontally scalable, low-latency applications with cloud-based deployment and operations; implemented very large distributed systems and worked in related areas such as data storage, query optimization, JVM performance optimization, security, and machine learning.
- Worked on Spark Streaming using Scala.
- Knowledge of Jenkins and Git tools.
- Worked on Tableau dashboards.
- Integrated SQL with Tableau.
- Good experience working with Hortonworks Distribution, MapR, and Cloudera Distribution.
- Created a 360-degree view of customer data for a financial client in a Hadoop data lake.
- Implemented a Hadoop multi-node cluster on AWS storage.
- Worked on different ETL processes for the data ingestion module.
- Extensive experience with ETL technologies such as Informatica.
- Experience in end-to-end design, development, maintenance, and analysis of various types of applications using data science methodologies and Hadoop ecosystem tools.
- Experience providing solution architecture for Big Data projects using the Hadoop ecosystem.
- Have developed Cascading applications on Hadoop that integrate with Teradata.
- Experience in Linux shell scripting.
- Capable of processing large sets of structured, semi-structured and unstructured data and supporting systems application architecture.
- Able to assess business rules, collaborate with stakeholders and perform source-to-target data mapping, design and review.
- Familiar with data architecture including data ingestion pipeline design, Hadoop Information architecture, data modeling and data mining, machine learning and advanced data processing.
- Database/ETL performance tuning: broad experience in database development, including effective use of database objects, SQL Trace, Explain Plan, different types of optimizers, hints, indexes, table partitions, sub-partitions, materialized views, global temporary tables, autonomous transactions, bulk binds, and MS SQL built-in functions; coding of database objects such as triggers, procedures, functions, and views.
- Performance tuning of Informatica mappings and workflows.
- Exposure to T-SQL programming and architecture; translated complex legacy processes into T-SQL procedures, functions, and packages.
- Knowledge of working with star schema and snowflake schema.
- Excellent interpersonal skills and strong analytical and problem-solving skills with a customer-oriented attitude.
- A good team player, self-motivated and dedicated in any work environment.
TECHNICAL SKILLS:
Operating Systems: Linux Confidential, Linux CentOS, Ubuntu, Unix, Windows, AIX
Version Control Tools: SVN, GIT, TFS, CVS and IBM Rational Clear Case
Web/Application Servers: WebLogic, Apache Tomcat, WebSphere and JBoss
Automation Tools: Jenkins/Hudson, Build Forge, Bamboo, SaltStack, GitHub
Build Tools: Maven, Ant and MS Build
Configuration Tools: Chef, Puppet, SaltStack and Ansible
SQL/NoSQL Databases: Oracle, Teradata, MongoDB, DynamoDB
Bug Tracking Tools: JIRA, Remedy, HP Quality Center and IBM Clear Quest
Scripting: Shell, Ruby, Python and JavaScript
Virtualization Tools: Docker, VirtualBox and VMware
Monitoring Tools: Nagios, CloudWatch, Splunk
Cloud Platform: AWS EC2, VPC, EBS, CloudFormation, AWS Config and S3
Languages: C/C++, Java, Python and PL/SQL
PROFESSIONAL EXPERIENCE:
ELK STACK SME
Confidential, Sunnyvale, CA
Responsibilities:
- Migrated data from an Elasticsearch 1.4.3 cluster to Elasticsearch 5.6.4 using Logstash and Kafka for all environments (see the reindex sketch after this list).
- Infrastructure design for the ELK Clusters.
- Elasticsearch and Logstash performance and configuration tuning.
- Identify and remedy any indexing issues, crawl errors, SEO penalties, etc.
- Provided design recommendations and thought leadership, improved review processes, and resolved technical problems; worked with product managers to architect the next generation of Workday search.
- Benchmarked Elasticsearch 5.6.4 for the required scenarios.
- Used X-Pack for monitoring and security on the Elasticsearch 5.6.4 cluster.
- Provided Global Search with Elasticsearch.
- Wrote Watcher alerts based on required scenarios (see the Watcher sketch after this list).
- Developed APIs for integration with various data sources.
- Implemented cloud-based integrations with Elastic.
- Built visualizations and dashboards using Kibana.
- Managed regular changes in priorities driven by customer needs.
- Wrote Grok patterns in Logstash.
- Configured Logstash input, filter, and output plugins with database, JMS, and log file sources and Elasticsearch as the output, converting search indexes to Elastic with large amounts of data.
- Performed Elasticsearch capacity planning and cluster maintenance; continuously looked for ways to improve and set a very high bar for quality.
- Wrote custom plugins to enhance/customize open-source code as needed; wrote Salt automation scripts for managing, expanding, and replacing nodes in large clusters.
- Synced Elasticsearch data between data centers using Kafka and Logstash; managed the Kafka cluster and integrated Kafka with Elastic.
- Snapshotted Elasticsearch index data and archived it in the repository every 12 hours (see the snapshot/retention sketch after this list).
- Installed and configured Curator to delete indices older than 90 days.
- Maintained Elasticsearch 1.4.3 to support the SugarCRM database.
- Separated Java URL data from the Elasticsearch 1.4.3 cluster and transferred it to the Elasticsearch 5.6.4 cluster using Logstash and Kafka; worked with development and QA teams to design ingestion pipelines and integration APIs, and provided Elasticsearch tuning/optimization based on application needs.
- Used Elasticsearch not only to power search but also, with the full ELK stack and Beats, for end-to-end logging and monitoring of our systems.
- Responsible for designing and deploying new ELK clusters (Elasticsearch, Logstash, Kibana, Beats, Kafka, ZooKeeper, etc.).
- Proactively monitoring performance.
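The migration above ran through Logstash and Kafka in production; as a hedged illustration of the underlying reindex pattern, the Python sketch below scrolls documents out of the 1.4.3 cluster and bulk-indexes them into 5.6.4 directly over the REST API. The endpoints, index name, and batch size are placeholder assumptions.

```python
import json
import requests

OLD = "http://old-es:9200"   # assumed Elasticsearch 1.4.3 endpoint (placeholder)
NEW = "http://new-es:9200"   # assumed Elasticsearch 5.6.4 endpoint (placeholder)
INDEX = "customer-index"     # hypothetical index being migrated

def reindex(index=INDEX, batch=500):
    """Scroll documents out of the old cluster and bulk-index them into the new one."""
    resp = requests.post(
        f"{OLD}/{index}/_search",
        params={"scroll": "2m"},
        json={"size": batch, "query": {"match_all": {}}},
    ).json()

    while resp["hits"]["hits"]:
        # Build an NDJSON bulk body: one action line plus one source line per document.
        lines = []
        for doc in resp["hits"]["hits"]:
            lines.append(json.dumps({"index": {
                "_index": doc["_index"],
                "_type": doc.get("_type", "doc"),   # ES 5.x still expects a type
                "_id": doc["_id"],
            }}))
            lines.append(json.dumps(doc["_source"]))
        requests.post(
            f"{NEW}/_bulk",
            data="\n".join(lines) + "\n",
            headers={"Content-Type": "application/x-ndjson"},
        ).raise_for_status()

        # Fetch the next page; passing scroll_id as a parameter works on 1.x clusters.
        resp = requests.post(
            f"{OLD}/_search/scroll",
            params={"scroll": "2m", "scroll_id": resp["_scroll_id"]},
        ).json()

if __name__ == "__main__":
    reindex()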
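A hedged sketch of how one of the Watcher alerts above might be registered through the X-Pack Watcher API: the index pattern, threshold, field names, credentials, and email address below are hypothetical placeholders, not the actual watches, which were defined per the required scenarios.

```python
import requests

ES = "http://localhost:9200"   # assumed Elasticsearch 5.6.4 endpoint with X-Pack enabled

# A watch that runs every 5 minutes, counts recent error-level log events,
# and emails the on-call alias when the count exceeds a threshold.
watch = {
    "trigger": {"schedule": {"interval": "5m"}},
    "input": {
        "search": {
            "request": {
                "indices": ["logstash-*"],          # hypothetical daily log indices
                "body": {
                    "size": 0,
                    "query": {
                        "bool": {
                            "filter": [
                                {"term": {"level": "ERROR"}},
                                {"range": {"@timestamp": {"gte": "now-5m"}}},
                            ]
                        }
                    },
                },
            }
        }
    },
    "condition": {"compare": {"ctx.payload.hits.total": {"gt": 100}}},
    "actions": {
        "notify_oncall": {
            "email": {
                "to": ["oncall@example.com"],        # placeholder address; requires an email account in elasticsearch.yml
                "subject": "Error spike detected in logstash-* indices",
                "body": "More than 100 errors were logged in the last 5 minutes.",
            }
        }
    },
}

# Register (or overwrite) the watch; Watcher then evaluates it on the defined schedule.
resp = requests.put(
    f"{ES}/_xpack/watcher/watch/error_spike",
    json=watch,
    auth=("elastic", "changeme"),   # placeholder credentials
)
resp.raise_for_status()
```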
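The 12-hour snapshot and 90-day Curator retention bullets above can be illustrated with a minimal Python sketch against the snapshot and index APIs. The repository name and the logstash-YYYY.MM.DD index naming are assumptions; in production, Curator handled the retention.

```python
import datetime
import requests

ES = "http://localhost:9200"   # assumed cluster endpoint (placeholder)
REPO = "es_backups"            # assumed snapshot repository, already registered via PUT /_snapshot/<repo>

def take_snapshot():
    """Create a timestamped snapshot of all indices in the registered repository."""
    name = "snap-" + datetime.datetime.utcnow().strftime("%Y%m%d-%H%M")
    resp = requests.put(
        f"{ES}/_snapshot/{REPO}/{name}",
        params={"wait_for_completion": "true"},
        json={"indices": "*", "ignore_unavailable": True},
    )
    resp.raise_for_status()
    return name

def delete_old_indices(prefix="logstash-", days=90):
    """Delete time-based indices (e.g. logstash-YYYY.MM.DD) older than `days` days."""
    cutoff = datetime.datetime.utcnow() - datetime.timedelta(days=days)
    for row in requests.get(f"{ES}/_cat/indices/{prefix}*", params={"format": "json"}).json():
        try:
            day = datetime.datetime.strptime(row["index"][len(prefix):], "%Y.%m.%d")
        except ValueError:
            continue  # skip indices that do not follow the date naming convention
        if day < cutoff:
            requests.delete(f"{ES}/{row['index']}").raise_for_status()

if __name__ == "__main__":
    take_snapshot()
    delete_old_indices()
```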
Environment: Elasticsearch, Logstash, Kibana, Curator, X-Pack, Watcher, ZooKeeper, Accumulo, Kafka.
ELK Stack / Big Data Engineer
Confidential, American Fork, UT
Responsibilities:
- Used ELK (Elasticsearch, Logstash, and Kibana) for name-search patterns for a customer.
- Installed the Logstash RPM and started Logstash in our environment.
- Used Elasticsearch for name pattern matching, customized to the requirement.
- Installed logstash-forwarder and ran it to push data.
- Used the Kibana plugin to visualize Elasticsearch data.
- Created different dashboards based on user level, integrated with the customer care support UI.
- Created Ambari Views for Tez, Hive and HDFS.
- Completed a POC on Greenplum with Spark Streaming.
- Worked on Tableau dashboards.
- Integrated SQL with Tableau.
- Applied business concepts to design and maintain an internal database for the Advanced Analytics group with MySQL and MemSQL, including database backup, restore, and optimization.
- Used ETL to extract files for external vendors and coordinated that effort.
- Used Change Data Capture (CDC) to simplify ETL in data warehouse applications
- Wrote Hive queries for data analysis to meet business requirements.
- Developed and executed shell scripts to automate jobs.
- Supported MapReduce programs running on the cluster.
- Experienced in defining job flows.
- Developed shell scripts to automate routine DBA tasks (e.g., database refreshes, backups, monitoring).
- Worked on a real-time data analytics project at Confidential; the main aim of the project was to get real-time information about the clients watching our product, using big data technologies such as Kafka, ELK, and Greenplum.
- Developed ongoing test automation using an Ansible- and Python-based framework.
- Used Ansible to set up and tear down the ELK stack (Elasticsearch, Logstash, Kibana).
- Troubleshot ELK build issues and worked toward solutions.
- Scripted automated platform monitoring with Python and Ansible.
- Worked with the Kibana dashboard for overall build status with drill-down features.
- When a player watches Confidential, a lot of metadata is generated; a custom Java component called the collector pulls this data and produces it into a Kafka topic called PlayerStats.
- This is only part of the data we want: when data arrives on the PlayerStats topic, Kafka Streams is triggered and, based on a unique key in the data called Subscribed guid, pulls other related information (IP, geo, and so on) and merges the two data sets.
- Kafka Streams then splits the data into different output topics (cdn, error, session, transition, and so on) based on the "type" field in our JSON.
- For example, if the type is cdn, Kafka Streams sends that record to the cdn output topic. From here the data is split into three forks; one fork goes to ELK to get unique viewers, devices, bitrate, and how many people are watching a particular channel in real time (Top Channels), etc.
- Another fork goes to Greenplum via the Kafka JDBC connector; the Greenplum data feeds Tableau reports that give business users better insight into how to improve the business. I mainly worked on the ELK part of this project, sending data from the Kafka output topics to an Elasticsearch index using Logstash (the routing step is sketched after this list).
- Kibana hits this index and generates real-time dashboards with the information we need about our customers. Also developed Salt scripts for Kafka, Elasticsearch, and Logstash; Salt is an automated deployment tool we used to deploy Elasticsearch on 100 nodes with a single command.
- Moved files in various formats (TSV, CSV, etc.) from RDBMS to HDFS for further processing.
- Gathered the business requirements by coordinating and communicating with business team.
- Prepared the documents for the mapping design and production support.
- Wrote Apache Pig scripts to process HDFS data and send it to HBase.
- Implemented a real-time log analytics pipeline using Confluent Kafka, Storm, Elasticsearch, Logstash, Kibana, and Greenplum.
- Maintained the Elasticsearch cluster and Logstash nodes to process around 5 TB of data daily from sources such as Kafka and Kubernetes.
- Created and maintained real-time dashboards in Kibana (unique viewers, unique devices, click events, client errors, average bitrate, etc.); the underlying aggregations are sketched after this list.
- Designed, built, and managed the ELK (Elasticsearch, Logstash, Kibana) cluster for centralized logging and search functionality for the app.
- Responsible for designing and deploying new ELK clusters (Elasticsearch, Logstash, Kibana, Beats, Kafka, ZooKeeper, etc.).
- Wrote and maintained automated Salt scripts for Elasticsearch, Logstash, Kibana, and Beats.
- Performed installations using automated Salt scripts in all environments from Dev to PROD.
- Responsible for setting up infrastructure for different environments (Dev, QA, Pre-prod, and Production).
- SPOC for all ELK-related requests/issues.
- Participate in the development of strategic goals for leveraging big data.
- Install, configure, and maintain ELK stack systems.
- Work with engineering teams to optimize Elasticsearch data ingest and search.
- Collaborate with business intelligence teams to optimize visualizations.
- Architect horizontally scalable solutions (terabyte scale or larger).
- Elasticsearch and Logstash performance and configuration tuning.
- Respond to and resolve access and performance issues.
- Maintain change control and testing processes for all modifications and deployments.
- Conducted research and made recommendations on big data products, services, and standards.
- Participated in problem resolution, change, release, and event management for the ELK stack.
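The production splitter described in this list was Kafka Streams in Java; the following is a minimal Python sketch of the same enrich-and-route step using kafka-python. The topic names follow the description above (PlayerStats in; cdn/error/session/transition out), while the broker list, field names, and the enrichment lookup are hypothetical stubs.

```python
import json
from kafka import KafkaConsumer, KafkaProducer   # kafka-python client

BROKERS = ["kafka01:9092"]                        # placeholder broker list
OUTPUT_TOPICS = {"cdn", "error", "session", "transition"}

def lookup_subscriber(guid):
    """Stub for the enrichment lookup; production pulled IP/geo data keyed by the Subscribed guid."""
    return {"ip": "0.0.0.0", "geo": "unknown", "guid": guid}

def run():
    consumer = KafkaConsumer(
        "PlayerStats",
        bootstrap_servers=BROKERS,
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
        group_id="player-stats-router",
    )
    producer = KafkaProducer(
        bootstrap_servers=BROKERS,
        value_serializer=lambda d: json.dumps(d).encode("utf-8"),
    )

    for message in consumer:
        event = message.value
        # Merge the player event with the subscriber-level details (field name is a placeholder).
        event.update(lookup_subscriber(event.get("subscriber_guid", "")))
        # Route on the "type" field; unknown types fall through to a catch-all topic.
        topic = event["type"] if event.get("type") in OUTPUT_TOPICS else "other"
        producer.send(topic, value=event)

if __name__ == "__main__":
    run()
```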
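The real-time "unique viewers / unique devices / top channels" Kibana panels mentioned above sit on Elasticsearch aggregations. A hedged sketch of the kind of query involved is below; the index pattern and field names are hypothetical.

```python
import requests

ES = "http://localhost:9200"        # assumed Elasticsearch endpoint (placeholder)
INDEX = "playerstats-*"             # hypothetical index pattern fed by Logstash

# Approximate distinct counts over the last 5 minutes, the same style of
# aggregation a Kibana "unique count" metric issues under the hood.
query = {
    "size": 0,
    "query": {"range": {"@timestamp": {"gte": "now-5m"}}},
    "aggs": {
        "unique_viewers": {"cardinality": {"field": "subscriber_guid"}},
        "unique_devices": {"cardinality": {"field": "device_id"}},
        "top_channels": {"terms": {"field": "channel", "size": 10}},
    },
}

resp = requests.post(f"{ES}/{INDEX}/_search", json=query)
resp.raise_for_status()
aggs = resp.json()["aggregations"]
print("unique viewers:", aggs["unique_viewers"]["value"])
print("unique devices:", aggs["unique_devices"]["value"])
```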
AWS/Hadoop ELK
Confidential, SFO, CA
Responsibilities:
- Worked as an admin on the Cloudera (CDH 5.5.2) distribution for 4 clusters ranging from POC to PROD.
- Responsible for cluster maintenance, monitoring, commissioning and decommissioning data nodes, troubleshooting, managing and reviewing data backups, and managing and reviewing log files.
- Worked with release management technologies such as Jenkins, GitHub, GitLab, and Ansible.
- Worked in a DevOps model with Continuous Integration and Continuous Deployment (CI/CD); automated deployments using Jenkins and Ansible.
- Designed the architecture for ELK as the batch layer.
- Sent data to ELK using the Elasticsearch-Hadoop plugin.
- Wrote the Grok filters in Logstash to trim the data.
- Created dashboards and monitored data in Kibana.
- Installed, configured, and maintained Apache Hadoop and Cloudera Hadoop clusters for application development, along with Hadoop tools like Hive, Pig, HBase, ZooKeeper, and Sqoop.
- Managed and scheduled jobs on Hadoop clusters using Apache and Cloudera (CDH3, CDH4) distributions.
- Extensive experience in cluster planning, installing, configuring, and administering Hadoop clusters for major Hadoop distributions like Cloudera and Hortonworks.
- Installed, upgraded, and managed Hadoop clusters on Hortonworks.
- Hands on experience using Cloudera and Hortonworks Hadoop Distributions.
- Worked extensively on Hive.
- Expert in implementing in-memory computing capabilities such as Apache Spark, written in Scala.
- Processed real-time data using Spark Streaming.
- Used Spark SQL to query the Spark Streaming data directly (a PySpark sketch follows this list).
- Worked on the core and Spark SQL modules of Spark extensively.
- Used Kafka to transfer data from different RDBMS producers, with Greenplum as the consumer of the Kafka data.
- Loaded data back to Teradata for BASEL reporting and for business users to analyze and visualize with Datameer; streamed data in real time using Spark with Kafka.
- Worked with Kafka for the proof of concept for carrying out log processing on a distributed system.
- Installed Teradata 15.x on AWS EC2 and used all associated AWS services for a secure setup.
- Installed and configured the analytics tool Spotfire for use by data scientists.
- Installed, configured, and performed operating system upgrades on Confidential Linux AS 6.x/7.x, CentOS, and Ubuntu hosted on dedicated hardware and EC2 instances in AWS.
- Set up and administered AWS services: S3, EFS, EBS, IAM, CloudWatch, CloudTrail, Inspector, Trusted Advisor, Route 53, RDS, and SNS.
- Wrote custom code to implement the CIS Critical Security Controls (CSC) for the AWS environment's cyber defense.
- Set up a WordPress application on a LEMP stack running on Ubuntu in the AWS environment.
- Designed AWS CloudFormation templates to create custom-sized VPCs, subnets, and NAT to ensure successful deployment of web application and database templates.
- Designed roles and groups for users and resources using AWS Identity and Access Management (IAM).
- Plugged Elasticsearch into Cascading flows.
- Developed multiple POCs using Spark Streaming, deployed them on the YARN cluster, and compared the performance of Spark with Storm.
- Worked on analyzing Hadoop cluster and different Big Data analytic tools including Pig, Hive, HBase and Sqoop.
- Installed Hadoop, MapReduce, and HDFS, and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
- Coordinated with business customers to gather business requirements, also interacted with other technical peers to derive technical requirements and delivered the BRD and TDD documents.
- Extensively involved in Design phase and delivered Design documents.
- Involved in Testing and coordination with business in User testing.
- Importing and exporting data into HDFS and Hive using Sqoop.
- Wrote Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying of the log data.
- Involved in creating Hive tables, loading data, and writing Hive queries that run internally as MapReduce jobs.
- Experienced in defining job flows.
- Used Hive to analyze the partitioned and bucketed data to compute various metrics for reporting.
- Experienced in managing and reviewing the Hadoop log files.
- Used Pig as an ETL tool for transformations, event joins, and some pre-aggregations before storing the data on HDFS.
- Loaded and transformed large sets of structured and semi-structured data.
- Responsible to manage data coming from different sources.
- Involved in creating Hive Tables, loading data and writing Hive queries.
- Utilized Apache Hadoop environment by Cloudera.
- Created Data model for Hive tables.
- Involved in Unit testing and delivered Unit test plans and results documents.
- Exported data from HDFS environment into RDBMS using Sqoop for report generation and visualization purpose.
- Worked on Oozie workflow engine for job scheduling.
- Interacted with Cloudera support, logged issues in the Cloudera portal, and fixed them per the recommendations.
- Imported logs from web servers with Flume to ingest the data into HDFS.
- Performed release deployments to various QA and UAT Linux environments.
- Implemented AWS solutions using EC2, S3, RDS, EBS, Elastic Load Balancing, Auto Scaling groups, optimized volumes, and EC2 instances.
- Configured S3 versioning and lifecycle policies to back up files and archive them in Glacier (a boto3 sketch follows this list).
- Created monitors, alarms, and notifications for EC2 hosts using CloudWatch.
- Troubleshot build issues during the Jenkins build process.
- Set up various non-production environments for validating applications.
- Wrote multiple Python, Ruby, and shell scripts for various application-level tasks.
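The Spark SQL work in this list queried Spark Streaming data in Scala; below is a minimal PySpark (Spark 2.x API) sketch of the same idea run as a batch stand-in. The HDFS path and the event schema (type, bitrate) are hypothetical.

```python
from pyspark.sql import SparkSession

# Batch stand-in for the streaming job: read JSON events and query them with Spark SQL.
spark = SparkSession.builder.appName("events-sql-sketch").getOrCreate()

# Hypothetical path and schema; the streaming job received equivalent records from Kafka.
events = spark.read.json("hdfs:///data/events/")
events.createOrReplaceTempView("events")

# Same style of aggregation that would be run against each streaming micro-batch.
summary = spark.sql("""
    SELECT type, COUNT(*) AS event_count, AVG(bitrate) AS avg_bitrate
    FROM events
    GROUP BY type
    ORDER BY event_count DESC
""")
summary.show()

spark.stop()
```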
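Two of the AWS bullets above (S3 versioning/lifecycle archiving to Glacier, CloudWatch alarms for EC2 hosts) can be sketched with boto3; the bucket name, prefix, instance ID, and SNS topic ARN below are placeholders.

```python
import boto3

s3 = boto3.client("s3")
cloudwatch = boto3.client("cloudwatch")

# Versioning plus a lifecycle rule: transition backups to Glacier after 30 days
# and expire them after a year. Bucket name and prefix are placeholders.
s3.put_bucket_versioning(
    Bucket="example-backup-bucket",
    VersioningConfiguration={"Status": "Enabled"},
)
s3.put_bucket_lifecycle_configuration(
    Bucket="example-backup-bucket",
    LifecycleConfiguration={
        "Rules": [{
            "ID": "archive-backups",
            "Filter": {"Prefix": "backups/"},
            "Status": "Enabled",
            "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
            "Expiration": {"Days": 365},
        }]
    },
)

# CPU alarm for a single EC2 host, notifying a (placeholder) SNS topic.
cloudwatch.put_metric_alarm(
    AlarmName="ec2-high-cpu-example",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    Statistic="Average",
    Period=300,
    EvaluationPeriods=2,
    Threshold=80.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:ops-alerts"],
)
```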
Environment: Java/J2EE, Subversion, Ant, Maven, Jenkins, Git, SVN, Chef, Puppet, CloudWatch, AWS (EC2, VPC, ELB, S3, RDS, CloudTrail and Route 53), Python, shell scripting, Ruby, PuTTY, Confluence, SOA.
Linux Admin
Confidential
Responsibilities:
- Installed, configured, and administered all UNIX/Linux servers, including the design and selection of relevant hardware to support installations/upgrades of Confidential (5/6), CentOS 5/6, and Ubuntu operating systems.
- Responsible for managing Chef client nodes and uploading cookbooks to the Chef server from the workstation.
- Worked in an Agile/Scrum environment and used Jenkins and GitHub for continuous integration and deployment.
- Installed YUM repositories for HDP services, including third-party ones such as Ansible; designed and automated the installation and configuration of secure DataStax Enterprise Cassandra using Puppet.
- Installed and Configured HBase, Hive, Pig, Sqoop, Kafka, Oozie, Ansible, TLS and Flume on the HDP cluster.
- Made use of automation tools Ansible to push updates across the nodes.
- Configured Confidential Cluster Nodes for any legacy applications and verified the daily health check on the Cluster Nodes.
- Installed the HDP stack and Hadoop services and set configurations in Ambari for each service; installed and updated packages using YUM (custom YUM servers/repositories) and Confidential Satellite Server.
- Configured and administered Apache, vsftpd, MySQL, and Tomcat.
- Participated in 24x7 production on-call support for Linux and provided technical support to users.
- Implemented rapid provisioning and lifecycle management for Red Hat Linux using Kickstart.
- Expertise in security hardening (iptables/SELinux) of major production servers, and in compiling, building, and installing web-server-based Linux tools.
- Experience in uploading and upgrading new firmware on the interconnects and chassis.
- Implement and maintain internal systems key to DevOps operations such as database servers, continuous integration, and QA/Test servers
- Proficient in providing support on deployed Confidential Enterprise Linux and Sun Solaris servers at both the operating system and application levels.
- Expertise in configuring Confidential cluster nodes for legacy applications and verifying the daily health checks on the cluster nodes.
- Expertise in creating VM Templates, cloning and managing Snapshots.
- Troubleshooting performance or configuration issues with MySQL and Oracle.
- Expertise in hardening Linux servers and compiling, building, and installing Apache Server from source with minimal modules.
- Monitored and troubleshot backups and scheduled cron jobs (a Python monitoring sketch follows this list).
- Experience in scripting using Bash and Perl; worked with mail, Samba, and Apache servers.
- Worked on configuring NIS, NFS, DNS, DHCP, FTP, FSTP, Telnet and RAID levels.
- Experience in database replication using OCFS2 file system with oracle 10g and 11g database
- Experience in deploying several sets of Linux guest builds from VMware templates using PowerCLI as well as Confidential Satellite Server.
- Patch management of servers and maintaining server's environment in Development/QA/Staging /Production.
- Resolved assigned Remedy tickets using Remedy tools in Development/QA/Staging/Production.
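The backup monitoring and cron scheduling above were handled with shell scripts; below is a hedged Python equivalent of the kind of check that ran from cron. The backup path, age limit, and free-space threshold are placeholder assumptions.

```python
#!/usr/bin/env python
import os
import shutil
import sys
import time

BACKUP_DIR = "/var/backups/db"   # placeholder backup location
MAX_AGE_HOURS = 26               # flag backups older than roughly a day
MIN_FREE_PCT = 15                # flag filesystems with less than 15% free

def check_backup_freshness():
    """Return a warning if the newest file in BACKUP_DIR is older than MAX_AGE_HOURS."""
    files = [os.path.join(BACKUP_DIR, f) for f in os.listdir(BACKUP_DIR)]
    if not files:
        return "no backup files found in %s" % BACKUP_DIR
    newest = max(os.path.getmtime(f) for f in files)
    age_hours = (time.time() - newest) / 3600.0
    if age_hours > MAX_AGE_HOURS:
        return "latest backup is %.1f hours old" % age_hours
    return None

def check_disk_space(path="/"):
    """Return a warning if free space on the given filesystem drops below MIN_FREE_PCT."""
    usage = shutil.disk_usage(path)
    free_pct = 100.0 * usage.free / usage.total
    if free_pct < MIN_FREE_PCT:
        return "only %.1f%% free on %s" % (free_pct, path)
    return None

if __name__ == "__main__":
    problems = [p for p in (check_backup_freshness(), check_disk_space()) if p]
    for p in problems:
        print("WARNING: " + p)
    sys.exit(1 if problems else 0)   # non-zero exit lets cron's MAILTO report failures
```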
Environment: RHEL 4.x/5/6, Solaris 9/10/11, HP-UX, CentOS, SUSE 10/11, VERITAS Volume Manager 3.x/4.x, VERITAS Storage Foundation 5, Red Hat Cluster, VERITAS Cluster Server 4.1, Tripwire, NFS, DNS, SAN/NAS
Linux & Unix Admin
Confidential
Responsibilities:
- Responsible for handling tickets raised by end users, including package installation, login issues, and access issues.
- User management: adding, modifying, deleting, and grouping users.
- Responsible for monthly preventive maintenance of the servers and configuration of RAID for the servers.
- Resource management using disk quotas.
- Documented issues daily in the resolution portal.
- Responsible for change management of releases scheduled by service providers.
- Generated weekly and monthly reports for the tickets worked on and sent them to management.
- Managed systems operations with final accountability for smooth installation, networking, operation, and troubleshooting of hardware and software in a Linux environment.
- Identified operational needs of various departments and developed customized software to enhance system productivity.
- Ran a Linux Squid proxy server with access restrictions using ACLs and passwords.
- Established/implemented firewall rules and validated them with vulnerability scanning tools.
- Proactively detected computer security violations, collected evidence, and presented results to management.
- Implemented system/e-mail authentication using an enterprise LDAP database.
- Implemented a database-enabled intranet website using Linux, Apache, and a MySQL database backend.
- Installed CentOS on multiple servers using Preboot Execution Environment (PXE) boot and the Kickstart method.
- Monitored system metrics and logs for any problems.
- Ran crontab jobs to back up data.
- Applied Operating System updates, patches and configuration changes.
- Maintained the MySQL server and managed authentication for required database users.
- Appropriately documented various administrative and technical issues.
Environment: Linux/CentOS 4/5/6, Logical Volume Manager, VMware ESX 5.1/5.5, Apache and Tomcat web servers, Oracle 11/12, Oracle RAC 12c, HPSM, HPSA.