Talk details

How To Measure MTTx Values in SRE
Topics:
Software Delivery Craft Matters
Level: Beginner+

Measuring the return on investment of SRE efforts seems like a straightforward exercise, and many incident management tools offer some view of this data. However, those views can mask a lot of important detail. In this talk, I will discuss how to measure Mean Time To Fail (MTTF), Mean Time To Detect (MTTD), and Mean Time To Repair/Restore (MTTR) so you can see clearly whether your team is improving over time. I will also discuss how to track these values by core business experience, not just by individual service.
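
As a rough illustration of the idea in the abstract, the sketch below computes the three averages per business experience from a list of incident records. It is not taken from the talk: the `Incident` fields, the `experience` labels, and the function name `mttx_by_experience` are hypothetical, and the definitions are assumptions (time to detect and time to repair are both measured from the start of the incident, and time to fail is the gap between one incident's resolution and the next incident's start within the same experience).

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from statistics import mean


@dataclass(frozen=True)
class Incident:
    experience: str     # hypothetical business experience label, e.g. "checkout"
    started: datetime   # when the impact began
    detected: datetime  # when the team became aware of it
    resolved: datetime  # when service was restored


def _mean_minutes(deltas):
    """Average a list of timedeltas in minutes; None when there is no data."""
    if not deltas:
        return None
    return mean(d.total_seconds() for d in deltas) / 60


def mttx_by_experience(incidents):
    """Group incidents by business experience and compute MTTD, MTTR, and MTTF.

    Assumed definitions (not from the talk):
      MTTD = mean(detected - started)
      MTTR = mean(resolved - started)
      MTTF = mean(next.started - previous.resolved) within one experience
    """
    grouped = defaultdict(list)
    for incident in incidents:
        grouped[incident.experience].append(incident)

    report = {}
    for experience, group in grouped.items():
        group.sort(key=lambda i: i.started)
        # Healthy time between consecutive incidents of the same experience.
        uptime = [nxt.started - prev.resolved for prev, nxt in zip(group, group[1:])]
        report[experience] = {
            "mttd_min": _mean_minutes([i.detected - i.started for i in group]),
            "mttr_min": _mean_minutes([i.resolved - i.started for i in group]),
            "mttf_min": _mean_minutes(uptime),
        }
    return report
```

Grouping by an experience label rather than a service name is the one design choice here: an incident in a shared dependency can be recorded once per affected experience, so the averages reflect what customers saw rather than which component failed. An experience with a single incident reports `None` for MTTF, since there is no gap between failures to average.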

Speaker
Craft 2024 - Jamie Allen
Jamie Allen
CTO, AWS and SRE at EPAM Systems

Jamie Allen is EPAM's Chief Technologist for AWS and Site Reliability, based in Seattle, USA. Prior to joining EPAM in 2019, Jamie was the global head of consulting and training for Typesafe/Lightbend (creators of Scala and Akka), and the Director of Engineering at Starbucks, responsible for reimplementing the Starbucks Rewards and Mobile Order/Pay backend systems from scratch in the cloud with microser...