WebOps / Site Reliability Engineer Resume
Park Ridge, New Jersey
SUMMARY:
- 11+ years of experience in the Server/Client architecture and Infrastructure Environment.
- 4+ years of experience in Application Performance Management
- 4+ years of experience in cloud technologies.
- 3+ years of experience in Build/Release with DevOps.
- 3+ years of experience with bug/issue tracking tools.
- 2+ years of experience in Jenkins, GitHub, Agile methodology and ITIL.
- Provide operations support for the Java applications Hertz, Hertz247 and Confidential Car Sales across all domains, including 24x7 production support.
- Experience in assisting business teams to establish policies, alerts, dashboards and custom configuration for visibility into the environment.
- Experience in application technologies and frameworks: Java, .NET, PHP, client-server apps, etc.
- Good experience with APM: Real End User Experience, Code Level Diagnostics, DB Diagnostics, Synthetic Monitoring.
- Identify and solve critical problems and build automation to prevent their recurrence. Drive blameless post-mortems with the product team and use the Error Budget to establish priorities for any necessary changes.
- Experience in analyzing Code Profilers, Java Debuggers, CPU Utilization, Memory usage, Garbage Collection and JMX Measures to verify the performance of the Application using Dynatrace.
- Experience working with the content distribution team to improve website performance (Akamai and Imperva).
- Troubleshoot problems that span systems, databases, storage, network (TCP/IP) and code.
- Hands-on experience leveraging CI/CD tools such as Jenkins to automate testing and deployment.
- Experience monitoring and tuning infrastructure performance with Dynatrace OneAgent.
- Experience installing and configuring Dynatrace.
- Implement and enforce best practices for code promotion across the various environments (builds, approvals, release). Strong SDLC background with Scaled Agile practices
- Experience installing, configuring and maintaining the Hygieia DevOps dashboard tool.
- Experience in defining the service level indicators (SLIs), objectives (SLOs), and agreements (SLAs)
- Good experience in software development technologies: design patterns, application servers, RDBMS, cloud computing, microservices architecture, APIs, web services, etc.
- Experience with container management technology (Docker).
- Good experience with Linux environments (RHEL 6/7).
- Expertise in Deployment and Operation of Application Performance Management using Compuware Tools - Dynatrace, Gomez APMaaS and Data Center - Real User Monitoring
- Experience with application performance monitoring and logging tools such as Dynatrace and AppDynamics.
- Work closely with developers, engineers, product and site managers, and other members of the DevOps team to meet the organization's needs and keep it competitive, from the infrastructure up to the highest level of applications.
TECHNICAL SKILLS:
Cloud Technologies: Google Cloud, Amazon Web Services
Automation/Orchestration Tools: Jenkins, Docker, Kubernetes
Operating/Ticketing Systems: Windows, RHEL, BMC Remedy and ServiceNow
Monitoring Tools: Splunk, Dynatrace, Gomez, Nagios and AppDynamics
Database: SQL Server 2005/2008 R2
Web Servers: Apache Tomcat
Methodologies: SAFe Agile and ITIL
Tracking Tools: JIRA, HP ALM
Languages/Scripting: Shell, Python
Repository Tools: GitHub, Bitbucket and Nexus
PROFESSIONAL EXPERIENCE:
Confidential, Park Ridge, New Jersey
WebOps / Site Reliability Engineer
Responsibilities:
- Involved in requirement gathering and implementing the new case process redesign and enforcing best practices like refactoring the existing code base and providing inputs.
- Analyze the CPU Utilization, Memory usage, Garbage Collection and JMX Measures to verify the performance of the Application.
- Worked with content distribution team to improve performance of website (Akamai).
- Research application logs for outage duration, provide application logs to Dev architects and prepare Root Cause Analysis reports.
- Part of a team that manages APM & UEM solutions including configuration and infrastructure support of monitoring/alerting/reporting tools
- Identify and manage asset reliability risks that could adversely affect business operations
- Take part in the change management process for changes and any new services that will be deployed
- Make technical recommendations on application performance improvement initiatives and other monitoring
- Identify root causes for performance bottlenecks in web applications and provide solutions.
- Establish Error Budgets for the products by monitoring SLIs, measuring SLOs and publishing them to a dashboard
- Partner with application development, application support, and other IT infrastructure resources to define measurement frameworks and performance dashboards
- Works closely with application development teams during release rollout to analyze application performance impact and troubleshoot issues
- Provide long-term performance improvement recommendations.
- Recommend solutions to resolve complex performance issues (response time, throughput, etc.).
- Provide input and insight on software design, architecture and build process.
- Develop technical proofs of concept to test and validate performance optimizations.
- Work with cross-functional project teams to define performance metrics and acceptance criteria.
- Performs installation and upgrade of Dynatrace and other monitoring tools.
- Proficient in platform internals, advanced debugging, and root cause analysis
- Knowledge of multiple scripting languages, synthetic transaction coding and dashboard creation
- Knowledge of multiple specific platforms and the technology to support them (e.g., Windows, Linux)
- Previous experience with Splunk, ELK, Logstash and Kibana
- Experience designing or architecting large-scale web / e-commerce applications a plus.
- Worked in an Agile environment. Participated in Scrum meetings and updated Rally with tasks and the time spent on each task.
Confidential, Chicago, Illinois
DevOps Engineer
Responsibilities:
- Served as DevOps Engineer responsible for pushing artifacts from SIT to Pre-Prod regions.
- Wrote Groovy scripts in Jenkins jobs to automate sanity checks.
- Configured Jenkins Pipeline jobs to automate continuous deployment, replacing the manual triggering of deployment jobs.
- Involved in requirement gathering and implementing the new case process redesign and enforcing best practices like refactoring the existing code base and providing inputs.
- Analyze the CPU Utilization, Memory usage, Garbage Collection and JMX Measures to verify the performance of the Application.
- Monitor the Third Party Web Services for APM
- Created dashlets and dashboards in AppDynamics for ongoing applications.
- Agent installation and instrumentation for different KPIs; instrumented UEM for web applications (expertise in handling URL transitions so that no page hits are missed by UEM).
- Experience designing, configuring, and implementing complex AppDynamics deep diagnostics solutions (Load Tester, Insights, AppMon) across complex Java enterprise platforms.
- Wrote Linux scripts for deployment and integrated them into the deployment pipeline.
- Used the Nexus repository for artifact management.
- Implemented a Spring Boot application called Hygieia (DevOps dashboard) for eWallets project members, giving them a consolidated view of the entire CI/CD pipeline.
- Integrated CI tools such as Jenkins, Sonar and GitHub within the Hygieia dashboard.
- Configured the dashboard's pipeline view, which is unique for each scrum team, by modifying a couple of controller classes.
- Worked on card eWallet services.
- Used build tools such as Gradle and Maven with Jenkins jobs.
- Handled Java development, including design and troubleshooting of applications, and conducted gap analysis, including validation of needs, in conjunction with onsite and offshore teams.
- Implemented a Continuous Delivery pipeline with Docker, Jenkins, GitHub and AWS AMIs: whenever a new GitHub branch is created, Jenkins, the CI server, automatically attempts to build a new Docker container from it.
- Converted staging and production environments from a handful of AMIs to a single bare-metal host running Docker.
- Created and deployed Jenkins pipeline jobs for deploying the builds in multiple environments
- Created scripts for automation jobs, including health check jobs.
- Created automation plans for different environments, including UAT, External QA and Pre-Prod.
- Regular deployments and monitoring
- Jenkins job creation, deployments and rollbacks.
- Planned and worked according to sprints and completed the PI within the allotted time.
- Configure widgets for Hygieia dashboard
- Build/Support development infrastructure and manage environments
- Assist with test automation strategies and adoption.
- Provide and support full system integration
- Stage and support system sprint demo
- Set up and attached EBS volumes to EC2 instances
- Code integration strategy and plan
- Utilized CloudWatch to monitor resources such as EC2, CPU, memory and EBS volumes, to set alarms for notification or automated actions, and to monitor logs for a better understanding and operation of the system.
- Monitored alarms and thresholds; developed new monitoring requirements and automated recovery procedures for monitored conditions with scripts and workflows.
Environment: AWS Cloud, Unix, Git, Tomcat, Jenkins, Splunk, Windows and RHEL 6.0/7.0, Workflow & Approvals, ITSM Remedy, Network Protocols, SQL Database, NGINX and Proxy.
Confidential
Cloud Operations Analyst
Responsibilities:
- Involved in requirement gathering and implementing the new case process redesign and enforcing best practices like refactoring the existing code base and providing inputs.
- Maintained and administered GIT source code tool.
- Work on query execution by managing parameters and monitoring performance of the database in a client-server environment.
- Work on performance by new indexing and data retrieval or storage methods.
- Used configuration management and orchestration tools for system and application configuration
- Created Branches, Labels and performed Merges in Stash and GIT.
- Developed processes, tools and automation for the Jenkins-based build system and for delivering software builds.
- Used Splunk and CloudWatch in the Amazon Web Services (AWS) environment.
- Managed build results in Jenkins and deployed using workflows.
- Worked on Docker container snapshots, attaching to a running container, removing images, managing directory structures and managing containers.
- Establish efficient operational and escalation procedures
- Extend and improve hardware and network infrastructure availability by utilizing proven hosting technologies and generating plans for future capacity and growth
- Followed ITIL processes when pushing builds and deployments to prod and pre-prod environments.
- Familiar and experienced with Agile Scrum development.
- Participate in 24x7 on-call incident escalation rotations
- Execute and maintain internal and external SLAs developed with business stakeholders
Environment: AWS Cloud, Unix, Git, Tomcat, Jenkins, SAN, Splunk, Virtualization, Windows and Linux Operating Systems, Workflow & Approvals, ITSM Remedy, Reports, Network Protocols, SQL Database and Monitoring Tools.
Confidential
IT Operations Analyst
Responsibilities:
- Involved in requirement gathering and implementing the new case process redesign and enforcing best practices like refactoring the existing code base and providing inputs.
- Responsible for day-to-day management of all Development, Test, Stage, and Production service/application infrastructure
- Used configuration management and orchestration tools for system and application configuration
- Estimating AWS usage costs and identifying operational cost control mechanisms
- Experience building large infrastructure for disaster recovery and multi-data-center strategy.
- Set up various Jenkins jobs for build and test automation; created branches and labels and performed merges in Stash and Git.
- Managed the source code for various applications using version control tools such as SVN and Git.
- Implemented Infrastructure automation through Jenkins, for auto provisioning, code deployments, software installation and configuration updates.
- Working closely with Development Managers to fine tune the release process and provide feedback on process improvements.
- Strong project management experience performing ITIL RM/SCM activities.
- Grew the DevOps organization year over year, both in the number of applications supported and in culture, influencing the direction of the application development teams as a whole.
- Knowledge of agile development methodologies like Scrum, Sprints model.
Environment: Jenkins, SAN, Virtualization, Windows, ServiceNow, Nagios, Reports, Network Protocols, SQL Database and Monitoring Tools.
Confidential
System Engineer
Responsibilities:
- Interacted with various business team members to gather and document requirements.
- Provided Level 2 support for applications running on production servers, ensuring high availability to internal and external users/clients by responding to application-related issues or requests.
- Providing production critical Application support for banking application (coded in Java) which runs on Solaris, AIX and Windows servers.
- Checking system and application logs for errors in various services and troubleshooting them.
- Monitoring the server for CPU usage, Memory usage, Network usage and disk utilization using netstat utilities.
- Running SQL update queries to fix production issues.
- Interacting with Application developers while working on a production fix.
- Testing new applications in the pipeline that will replace existing applications, creating documentation and establishing a process for future troubleshooting.
- Participate in Enterprise Change Management process (ECM) to install/upgrade patches for different applications running on production and test servers.
Environment: Windows and Linux Operating Systems, SQL 2000/2005 Database Servers, POP3 Email Services, Network Protocols, Web Servers
Confidential
System Administrator
Responsibilities:
- Checking the maintenance plans/jobs on the server for which a ticket was raised in Onguard. For failed jobs, identifying the cause of the failure from various logs (Windows, SQL Server) and fixing the problem.
- Regular daily DBA tasks, including managing servers and security, monitoring connections and monitoring scheduled tasks/jobs.
- Installing SQL Server 2005 and service packs.
- Checking DBCC output, database and transactional space, event logs and SQL Server error logs.
- Troubleshooting performance issues and fine-tuning queries and stored procedures.
- Logical Backup and Recovery using export/import, BCP from one instance to another.
- Plan, design and maintain a database; move a database or database file to another server or disk using attach and detach. If databases or transaction logs are larger than necessary, shrink those files to free up disk space.
- Monitoring SQL Server performance, such as memory use and physical disk.
- Worked on performing database consistency checks (DBCC).
- Work closely with project managers, programmers, financial analysts and other team members; communicate regularly with technical, application and operational staff to ensure database integrity and security.