Overview
This course teaches how to implement an effective symptom-based alerting strategy in complex distributed systems. It covers Adaptive Paging, which uses causality from tracing and semantic conventions to page the team closest to the problem. By applying heuristics to identify the most probable cause of an alert, this approach reduces alert fatigue and lets teams focus on end-user pain points. The intended audience is professionals who work with distributed systems and alert management.
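As a rough illustration of the Adaptive Paging idea described in the overview, the sketch below walks a trace from the failing entry point down through erroring child spans and returns the team that owns the deepest one. It is not the talk's implementation: the `Span` class, its `team` attribute, the `team_to_page` function, and the "follow the first erroring child" heuristic are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """Hypothetical, simplified span; real tracing data carries far more."""
    span_id: str
    parent_id: Optional[str]
    service: str
    team: str    # assumed ownership attribute, e.g. set via a naming convention
    error: bool


def team_to_page(spans: list[Span]) -> Optional[str]:
    """Return the owner of the deepest erroring span reachable from the root.

    Heuristic (an assumption for this sketch): starting at the root span,
    keep descending into the first child that also reports an error. The
    span where the error chain ends is treated as the most probable cause.
    """
    children: dict[str, list[Span]] = {}
    root: Optional[Span] = None
    for span in spans:
        if span.parent_id is None:
            root = span
        else:
            children.setdefault(span.parent_id, []).append(span)

    # No symptom at the entry point means nobody gets paged.
    if root is None or not root.error:
        return None

    current = root
    while True:
        failing = [c for c in children.get(current.span_id, []) if c.error]
        if not failing:
            return current.team
        current = failing[0]


# Example trace: the frontend fails because checkout fails because the
# database fails; the search call succeeded.
trace = [
    Span("a", None, "frontend", "web", True),
    Span("b", "a", "checkout", "payments", True),
    Span("c", "b", "orders-db", "storage", True),
    Span("d", "a", "search", "search", False),
]
print(team_to_page(trace))
```

In this example the symptom is observed at the frontend, but the page would go to the hypothetical `storage` team rather than to every team whose service reported an error along the path.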
Syllabus
SREcon23 Asia/Pacific - Are We All on the Same Page? Let's Fix That
Taught by
USENIX