Overview
This SREcon25 Americas talk explores how to scale incident learning within an organization. Vanessa Huerta Granda of Enova shares her team's approach to cross-incident analysis: moving beyond individual incident investigations to analyze patterns across multiple incidents. Her resiliency engineering team has structured its incident review program to examine incidents across timeframes (quarters, years), products, and technologies. The talk covers practical methods for scalable learning from incidents, shows how the resulting insights can lead to meaningful improvements in sociotechnical systems, and addresses the challenge of scaling in-depth incident investigations while preserving their value for organizational learning.
Syllabus
SREcon25 Americas - Learning from Incidents at Scale; Actually Doing Cross-Incident Analysis
Taught by
USENIX