This job has expired

Senior Resilience Engineer, Incident Management

Employer: Procore
Location: Carpinteria, California
Closing date: Oct 25, 2021

Position Type: Other
Hours: Full Time
Organization Type: Business

We're looking for a Senior Resilience Engineer to join Procore's Incident Management and Resilience Engineering (IMRE) team within the Cloud Platform Engineering department. In this role, you'll work directly with Engineering, Customer Success, and Product teams to help them better understand their technology, people, processes, and organization through the lens of incidents. You'll drive the adoption of modern reliability practices like Service Level Objectives (SLOs), error budgets, fault-tolerant design patterns, incident retrospectives, chaos testing, and end-to-end ownership. If you're interested in an exciting opportunity to have a significant impact on our internal systems, join us!

This position will report to the Manager, Cloud Platform Incident Management, and can be based at our Carpinteria, CA headquarters or in our New York City or Austin, TX offices. Remote candidates will be considered based on experience, with the expectation of occasional travel to these offices. We're looking for someone to join our team immediately.

What you will do:

-Own Procore's full incident response lifecycle, from defining and updating processes to coaching teams on incident management

-Evolve Procore toward a "learn and adapt" approach

-Drive post-incident investigations and analysis by conducting interviews, identifying contributing factors, and reviewing incident response

-Lead initiatives focused on improving processes and the customer experience

-Identify and promote the disciplines that help Procore evolve as a learning organization

-Stay up to date on topics like cognitive systems engineering, safety science, resilience engineering, UX research, human-computer interaction, organizational psychology, or cultural anthropology

What we are looking for:

-BS or MS degree; technical certifications are a plus

-5+ years of combined experience as a Software Engineer and DevOps Engineer, with coding knowledge in an object-oriented language

-Strong experience documenting and driving process improvements

-Experience in engineering or operations, specifically participating in incidents

-Experience with version control systems, CI/CD, distributed applications, and service-oriented architectures

-Strong technical writing skills, code literacy, and cross-functional communication skills

-Experience working with observability teams and operations teams is preferred
