IT - Technical Test Lead | Performance Testing | Performance Testing - ALL - Hire IT People

Job seekers, please send resumes to resumes@hireitpeople.com

Detailed Job Description:

Troubleshoot performance and availability issues in mission-critical full-stack applications, microservices, infrastructure, and legacy business applications and websites
Work with DevOps Architects to implement fault tolerance, back-up, and disaster recovery solutions.
Lead root cause analyses and investigations by identifying, analyzing, and remediating service performance and availability issues to ensure maximum service uptime and availability
Proactively look for system faults throughout the application lifecycle, both before and after release.
Manage the incident lifecycle through to resolution and conduct blameless post-incident reviews
Work with the QA Lead to establish best practices for measuring and monitoring availability, latency, and overall system health. You're expected to be on call, have strong written communication skills, and be able to develop working relationships with coworkers.
Experience in balancing service reliability, metrics, sustainability, technical debt, and operational toil for live services running at scale
Implement Chaos Engineering concepts such as the Simian Army.
Work across multiple project teams simultaneously to support rapid development efforts
Solve complex, business-critical issues that impact bottom-line financial numbers and customer loyalty/experience
Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity
Contribute positively to open source projects and join their existing communities
Bring experience, pragmatism, empathy, and composure to interactions with teams outside of the RE organization
Work frequently with Product teams on shared goals and cross-team projects
Balance planned and reactive work using basic project planning techniques and technical roadmaps
Collaborate across teams such as Application Services, Capacity Planning, Hardware, Network, and Datacenter Operations
Participate in building advanced tooling for testing, monitoring, administration, and operations of multiple clusters across multiple environments
Experience negotiating SLIs, SLOs, and SLAs with product owners

General/Minimum Qualifications:

3-5+ years of applying reliability and chaos engineering principles with distributed cloud services
Strong knowledge of and comfort with GNU/Linux and Windows operating systems
Proficiency in high-level languages such as Ruby, Python, PowerShell, and Bash
Exposure to system-level languages such as Go, C/C++/C#
Familiarity with configuration management software such as Puppet, Chef, Ansible, or Salt
Source control, branching and merging, and packaging (Git, GitHub, NuGet, npm)
Networking basics: TCP vs UDP, basic troubleshooting, HTTP load balancing, firewall, private networks, multi-tier design, scale-out
Databases: RDBMS, NoSQL, SQL, analytics, persistent data
Familiarity with standard infrastructure concepts like load balancers, firewalls, object storage and where/when they might be used
Service management: incident response, change management, and problem management.
Experience with Kubernetes, Docker, Helm, and Virtual Machines
Cloud computing concepts (one or more public cloud providers): VMs vs Docker containers, block storage vs object storage, infra automation vs install automation
Experience operating a platform, software as a service, or shipping software
Experience as an open-source contributor
