
Site Reliability Engineer Resume


SUMMARY

  • Solutions-oriented IT Systems/Software Engineer with a strong record of success implementing a broad range of IT initiatives.
  • Key player in organizational change with extensive experience translating business requirements into processes, designs and solutions.
  • Strong background in applying Site Reliability Engineering concepts and DevOps methodologies to production applications to meet the organization’s SLOs as measured by SLIs.
  • Strong background in Identity and Access Management. Excellent cloud infrastructure knowledge.
  • Over twenty years of success in meeting critical software engineering challenges.
  • Excellent hands-on development skills in Java, C/C++ and Python.

TECHNICAL SKILLS

O/S: UNIX (AIX), Linux (RHEL / CentOS / Ubuntu), Windows Server 2008 R2 / 2012

Programming Languages / tools: Java/Spring/Microservices, JEE6, Python, C/C++

DevOps/SRE: CI/CD (Jenkins), GCP (Kubernetes, GCS, Cloud Functions, Dataflow, Istio), Terraform, Git/Bitbucket, Splunk, Maven, Jira, Confluence, AppDynamics, Apica, Kiali, Grafana, Datadog, Jaeger, Prometheus, Twistlock, SIEM, Ansible/Chef/Puppet

Scripting: Shell (Ksh, Bash), Python/Jython, WLST, Windows PowerShell

Application/Web Server: Oracle WebLogic Server (12c/11/10), Tomcat

Security tools: CyberArk (CPM, PVWA, PSM, AIM), IAM (Aveksa, AWS), WhiteHat (DAST, SAST)

Database: Oracle 12, MySQL, SQL Server 6.5, MS Access

Cloud: GCP, AWS

PROFESSIONAL EXPERIENCE

Confidential

Site Reliability Engineer

Environment: AWS (EKS, Kubernetes), Monitoring (Prometheus, Alertmanager), Helm, EFK

Responsibilities:

  • Automated provisioning through Terraform and Kubernetes; maintained the reliability, scalability, availability and security of the platform and applications.
  • Automated configuration of the AWS EKS cluster using Terraform scripts. Monitored applications and services deployed to the cluster using the Prometheus Operator and routed alerts to a Slack channel through the Alertmanager service (see the sketch after this list).
  • Centralized log collection for the Kubernetes cluster and application components using the EFK stack. Developed Kibana dashboards for visualization of metrics collected via node exporter and other application metrics.
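
Illustrative sketch of the alerting flow above, assuming the Python requests library and placeholder Prometheus and Slack webhook URLs; in the actual setup Alertmanager routed alerts to Slack through its receiver configuration, so this standalone check only mirrors the idea.

```python
"""Minimal sketch (assumptions: requests library, placeholder URLs):
query Prometheus for firing alerts and post a summary to a Slack webhook."""
import requests

PROMETHEUS_URL = "http://prometheus.monitoring.svc:9090"    # placeholder
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/XXX"  # placeholder


def firing_alerts():
    # /api/v1/alerts lists the alerts currently known to Prometheus.
    resp = requests.get(f"{PROMETHEUS_URL}/api/v1/alerts", timeout=10)
    resp.raise_for_status()
    return [a for a in resp.json()["data"]["alerts"] if a["state"] == "firing"]


def notify_slack(alerts):
    if not alerts:
        return
    lines = [
        f"*{a['labels'].get('alertname', 'unknown')}* "
        f"severity={a['labels'].get('severity', 'n/a')}"
        for a in alerts
    ]
    requests.post(
        SLACK_WEBHOOK_URL,
        json={"text": "Firing alerts:\n" + "\n".join(lines)},
        timeout=10,
    )


if __name__ == "__main__":
    notify_slack(firing_alerts())
```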

Confidential

Site Reliability Engineer

Environment: GCP (Kubernetes/K8s, Terraform, GCS, Cloud Functions, Dataflow, BigQuery, BigTable), DevOps (Git, Bitbucket, Jira, Confluence, Jenkins), Splunk, Rapid7, AppDynamics, Datadog, Elasticsearch, Java / microservices, Python, Groovy, Selenium.

Responsibilities:

  • Implemented the SDLC process from inception and design through deployment, operations and refinement as part of the team process. Successfully deployed and monitored a Java Spring-based microservices application on Google Cloud.
  • Implemented the CI/CD process using Jenkins, Terraform and GKE on a GCP shared VPC environment. Compiled microservices Maven builds using the Jenkins CI process and deployed them with Docker and Kubernetes to GCP environment namespaces such as dev, QA, UAT and PROD. Configured and automated infrastructure components such as GCP’s Cloud Functions, Dataflow, GKE cluster, GCS, BigQuery and Pub/Sub using Terraform deployed via the CI/CD pipeline. Performed functional and regression testing of the microservices web application using Selenium, integrated with the Jenkins CI/CD pipeline.
  • Performed troubleshooting procedures to identify root causes of CI and CD issues in production. Used kubectl to identify bottlenecks in Kubernetes component deployments in production and took remedial measures to resolve them.
  • Developed and maintained Jenkins shared libraries in Groovy to automate all phases of the CI/CD process, such as configuring, testing and deploying to the GCR registry. Developed Python scripts to automate repeated tasks such as the build process and gcloud component installs (see the sketch after this list).
  • Remediated vulnerabilities in GCE and GKE instances using the Rapid7 and Twistlock monitoring tools. Remediated security, IAM and other compliance issues using the C3M cloud security posture and compliance assurance platform.
  • Monitored systems for availability, latency and health using the AppDynamics and Datadog monitoring tools. Used Prometheus to monitor the Kubernetes cluster and routed alerts to PagerDuty for the on-call team to troubleshoot issues. Used Apica synthetic monitoring to proactively monitor the application for availability and performance.
  • Implemented the change management process and handled troubleshooting and remediation of production incidents. Managed systems for reliability, latency and availability by measuring SLIs and taking the measures needed to meet the application’s SLOs.
  • Implemented security controls based on data classification for all deployed GCP components using customer-managed keys and KMS encryption.
  • Mentored junior staff and the offshore team on Google Cloud implementations, Jenkins CI/CD and other day-to-day activities.
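
Illustrative sketch of the repeated-task automation above (gcloud component installs plus the Maven build), using only the Python standard library; the component names and build command are assumptions, not the project’s actual pipeline steps.

```python
"""Minimal sketch (illustrative component names and build command):
install a fixed set of gcloud components and run the Maven build,
stopping on the first failure."""
import subprocess
import sys

GCLOUD_COMPONENTS = ["kubectl", "gke-gcloud-auth-plugin"]  # illustrative
BUILD_CMD = ["mvn", "-B", "clean", "package"]              # illustrative


def run(cmd):
    # Echo the command, run it, and abort the script if it fails.
    print("+", " ".join(cmd))
    result = subprocess.run(cmd)
    if result.returncode != 0:
        sys.exit(result.returncode)


def main():
    for component in GCLOUD_COMPONENTS:
        run(["gcloud", "components", "install", component, "--quiet"])
    run(BUILD_CMD)


if __name__ == "__main__":
    main()
```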

Confidential

Consultant

Environment: RSA Identity Governance and Lifecycle (Aveksa), CyberArk, Oracle 12c, DevOps, Python, PowerShell

Responsibilities:

  • Provided identity and access solutions to internal and external customers of Confidential using the RSA Identity Governance and Lifecycle product (formerly Aveksa). Configured account, role and entitlement collectors using shell, Python and PowerShell scripts. Gathered information on existing resources such as AD, Oracle, Postgres and Mainframe using collector technologies and configured connectors to provision access on endpoints. Integrated Active Directory with the IAM solution. Created/modified workflows, forms, rules, roles and workflow sub-processes. Provided solutions for access management use cases beyond out-of-the-box functionality by creating custom solutions and scripts. Worked with the vendor (RSA) on critical product issues. Configured the CyberArk Vault, Unix accounts and safes, and modified the default policy to authenticate to Unix servers via the PSMP jump server.
  • Implemented DevOps methodologies for automating provisioning of resources on end servers/applications for users of Confidential’s internal and external clients. Provided access to accounts such as Oracle/Postgres databases, Unix servers and Active Directory using Jenkins and Rundeck. Configured Git repositories. Developed Python, Unix shell and PowerShell scripts to provision access on server/application endpoints, automated via the Jenkins CI/CD pipeline (see the sketch after this list). Used Chef methodologies to configure servers and applications; configured recipes, templates and cookbooks to manage server and application configurations.
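
Illustrative sketch of the endpoint-provisioning scripts above, creating a read-only Postgres role as a Jenkins job step; it assumes the psycopg2 library, and the host, credentials, schema and grants are placeholders rather than the client’s actual configuration.

```python
"""Minimal sketch (assumes psycopg2; host, credentials, schema and grants
are placeholders): create a read-only role on a Postgres endpoint."""
import os

import psycopg2
from psycopg2 import sql


def provision_user(username, password):
    conn = psycopg2.connect(
        host=os.environ.get("PGHOST", "localhost"),
        dbname=os.environ.get("PGDATABASE", "appdb"),
        user=os.environ.get("PGADMIN_USER", "postgres"),
        password=os.environ["PGADMIN_PASSWORD"],
    )
    conn.autocommit = True
    with conn.cursor() as cur:
        # Create the login role, then grant read-only access to one schema.
        cur.execute(
            sql.SQL("CREATE ROLE {} LOGIN PASSWORD %s").format(sql.Identifier(username)),
            [password],
        )
        cur.execute(sql.SQL("GRANT USAGE ON SCHEMA app TO {}").format(sql.Identifier(username)))
        cur.execute(
            sql.SQL("GRANT SELECT ON ALL TABLES IN SCHEMA app TO {}").format(sql.Identifier(username))
        )
    conn.close()


if __name__ == "__main__":
    provision_user(os.environ["NEW_DB_USER"], os.environ["NEW_DB_PASSWORD"])
```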

Confidential

Senior Systems Engineer

Responsibilities:

  • Configured the CyberArk Privileged Identity Management solution to onboard and manage Windows domain controller accounts, Unix root/user accounts and database system accounts in the CyberArk Vault. Created CyberArk components such as safes, accounts, platforms and application IDs for authentication using PVWA. Modified the master policy to enforce check-in/check-out exclusive access and to force password rotation after each use for critical accounts. Configured WebLogic and Tomcat application servers to pull passwords stored in the CyberArk Vault using the CyberArk Application Identity Manager (AIM) product. Created CLI scripts to pull passwords using the CyberArk API. Performed automatic failover and failback of vault services between the primary and DR vaults. Established standards and procedures for implementation of the CyberArk PIM solution.
  • Configured WhiteHat Security’s DAST and SAST tools to mitigate vulnerabilities in source code repositories and in dynamic applications running in the production environment. Onboarded applications for static code and dynamic site testing by the WhiteHat scanner. Collaborated with application teams and WhiteHat Security support to address scanning issues and several false positives in scan results. Contributed to product development and POC testing to bring in additional languages for scanning.
  • Implemented and customized Enterprise Holdings’ AWS private cloud environment using Amazon EC2 and S3. Configured Puppet and Python scripts to automate deployment and changes to Tomcat, Apache and Splunk instances. Configured auto scaling groups for applications to dynamically scale web and application instances. Applied changes to the base AMI configuration of Apache, Tomcat, Splunk and RHEL Linux and generated new AMIs for use by application teams. Used Git and private GitHub repositories to post code/script changes and merge branches into the master branch for application team use.
  • Designed and architected WebLogic components for scalability and performance in a clustered multi-tier environment. Designed WebLogic components for failover between servers using distributed JMS, session replication strategies and multi-pooling strategies across multiple data centers. Provided architectural recommendations for automatic failover of applications/components across data centers based on database availability. Provided infrastructure solutions for distributed computing needs using SOA.
  • Successfully completed several J2EE projects using WebLogic 12c/11/10 and Tomcat on AIX and Linux environments, encompassing all aspects of the project life cycle including planning, design and implementation. Provided architectural recommendations for improving performance, scalability and failover of J2EE components. Performed capacity analysis of applications based on current traffic and future growth, and recommended appropriate configurations such as memory and server instances to accommodate current and future traffic. Benchmarked applications from an infrastructure/middleware standpoint with production use cases for optimum performance.
  • Determined root causes of failing J2EE components in production and benchmark environments and implemented corrective measures to rectify issues. Applied troubleshooting methodologies and failure patterns in the production environment to resolve application issues. Fine-tuned JVMs and recommended optimum heap and garbage collection JVM options for better performance. Configured WebLogic JDBC connection pools for optimum performance.
  • Configured the WebLogic Diagnostic Framework and collected JEE performance metrics from running JVMs to analyze and identify bottlenecks in the application.
  • Developed and documented standards for administering, scripting and deployment of applications. Developed a script-based solution for automatic failover of application instances for an active/passive DB cluster.
  • Developed scripts/utilities for basic administration and automation tasks using Python, Jython, WLST and UNIX shell, such as start/stop of WebLogic domains, deployment, log rotation and collecting performance metrics from running WebLogic/Tomcat domains (see the WLST sketch after this list).
  • Developed Velocity templates to automate generation of WLST/Python scripts from application configuration files for deployment to various environments.
  • Developed internal web sites for monitoring server performance and displaying server configuration, using HTML/jQuery for presentation logic and Java/JSP/servlets/JDBC for backend processing.
  • Provided subject matter expertise for securing/encrypting WebLogic and Tomcat applications using Java keytool/OpenSSL, including generating certificates, signing them and importing them into keystores.
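
Illustrative WLST (Jython) sketch of the start/status automation described in the list above, run with java weblogic.WLST; the admin URL, credentials and reliance on Node Manager for start() are assumptions, not the actual environment.

```python
# Minimal WLST (Jython) sketch, run as: java weblogic.WLST check_servers.py
# The admin URL and credentials are placeholders, and start() assumes Node
# Manager is configured; in practice credentials came from a secure store.
ADMIN_URL = 't3://adminhost:7001'   # placeholder
ADMIN_USER = 'weblogic'             # placeholder
ADMIN_PASSWORD = 'changeit'         # placeholder

connect(ADMIN_USER, ADMIN_PASSWORD, ADMIN_URL)
domainRuntime()

# Walk the server lifecycle runtimes and start anything that is not RUNNING.
for server in cmo.getServerLifeCycleRuntimes():
    name = server.getName()
    state = server.getState()
    print '%s: %s' % (name, state)
    if name != 'AdminServer' and state != 'RUNNING':
        start(name, 'Server')       # starts the managed server via Node Manager

disconnect()
exit()
```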

Confidential

Consultant

Environment: J2EE, WebLogic 6.1, XML Spy, Xerces API.

Responsibilities:

  • Developed entity beans and session beans to support various business processes accessing rail car equipment information through Data Access Objects (DAO). Used the Xerces XML parser to create XML documents from DAO result sets. Developed message-driven beans to process messages via JMS. Used log4j for logging, MVC design patterns and the Struts framework to design the application flow, and JSP for presentation logic. Tested the application for various business scenarios and production load.
  • Developed a real-time rail car events Oracle database fed from TCS (mainframe) data to enable migration of inquiries against critical TCS data away from the TCS mainframe to the next-generation Oracle database. The database was used to populate web-based applications such as chargeable events and demurrage with car event data for billing and reporting purposes. The business process was built in C++ using object-oriented techniques to ensure scalability. Used Rogue Wave Tools.h++ value- and pointer-based collection classes for efficient data manipulation and to cache data in memory for repeated use. Used the HP-UX WDB debugger.
  • Developed Tuxedo-based services for the business process tier and data management tier. Used built-in MSGAPI functionality for handling request, response and transaction processing. Used MSGAPIStruct as a base class for messages passed using the MSGAPI.
  • Developed Pro*C programs to insert, update and delete car events from the database. These Pro*C programs were integrated as Tuxedo services for the data management tier. Improved performance of data inserts and updates using host arrays (see the sketch after this list).
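
The host-array optimization above is a Pro*C feature; as an illustration of the same array-binding idea in Python rather than the original code, the sketch below bulk-inserts car events with cx_Oracle’s executemany. Connection details, table and column names are assumptions.

```python
"""Python analogue (not the original Pro*C) of host arrays: bind an array
of rows and insert them in one round trip. Assumes the cx_Oracle library;
connection details, table and column names are placeholders."""
import cx_Oracle

# Placeholder connection details.
conn = cx_Oracle.connect("scott", "tiger", "dbhost/ORCLPDB1")
cur = conn.cursor()

# A batch of car events; executemany sends the whole list as one
# array-bound insert instead of one round trip per row.
rows = [
    (1001, "ARRIVAL", "2003-05-01 10:15:00"),
    (1002, "RELEASE", "2003-05-01 11:40:00"),
    (1003, "PLACEMENT", "2003-05-01 12:05:00"),
]

cur.executemany(
    "INSERT INTO car_events (car_id, event_code, event_time) "
    "VALUES (:1, :2, TO_DATE(:3, 'YYYY-MM-DD HH24:MI:SS'))",
    rows,
)
conn.commit()
cur.close()
conn.close()
```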

Confidential

Senior Software Engineer

Responsibilities:

  • Completed several e-commerce initiatives using BroadVision, ATG Dynamo and iPlanet servers, such as the Coke e-portal platform, Archway Bank, MySkinMD.com, HomeDepot.com and Ballarddesigns.com.
  • Configured the BroadVision Dynamic Command Center (DCC) to serve dynamic content such as product prices, editorial content and health news based on user criteria. Configured the Epicentric portal server to manage and control web content, and provided options for clients to configure portlets to suit their profiles. Developed servlets for posting data from forms and for serving dynamic content based on database queries. Developed JSP pages to render the dynamic content on HTML pages. Developed servlets to post user input to a back-end Pro*C bank application via the BEA Jolt interface over a Tuxedo middle layer.
  • Configured and administered application servers such as BroadVision, ATG Dynamo and iPlanet Application Server for scalability and high availability. Configured web servers such as IIS, Apache and iPlanet to serve static content and provide connectivity to the application servers.
