Overview
This 15-minute talk from SREcon25 Americas explains how Netflix's Content Delivery Network SREs automatically detect broken gameplay sessions and game-breaking issues in cloud gaming. Presented by Ian Neidel of Netflix, it covers how the team applies statistics and machine learning to the large volume of logs and sessions produced each day. Topics include the metrics used to infer brokenness, accessible methods for vectorizing and clustering exception messages, and statistical approaches for identifying broken sessions, detecting game-breaking issues, and assessing their impact with confidence. The talk is aimed at SREs who are expanding beyond traditional content delivery metrics such as latency, bitrate, and dropped packets to ensuring a quality gameplay experience.
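For readers unfamiliar with the idea of vectorizing and clustering exception messages, here is a minimal sketch of one accessible approach. It is an illustration only and is not the method presented in the talk: the masking rules, the bag-of-words vectors, the greedy clustering pass, the 0.8 similarity threshold, and all function names are assumptions chosen for brevity.

```python
import math
import re
from collections import Counter


def normalize(message: str) -> str:
    """Mask volatile tokens so messages that differ only in IDs look alike.

    Assumption: hex addresses and decimal numbers are the only volatile parts.
    """
    message = re.sub(r"0x[0-9a-fA-F]+", "<hex>", message)
    message = re.sub(r"\d+", "<num>", message)
    return message.lower()


def vectorize(message: str) -> Counter:
    """Turn a message into a bag-of-words vector of token counts."""
    return Counter(re.findall(r"[a-z_<>]+", normalize(message)))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(messages: list[str], threshold: float = 0.8) -> list[list[str]]:
    """Greedy single-pass clustering (hypothetical, illustrative only).

    Each message joins the first cluster whose representative vector is at
    least `threshold` similar; otherwise it starts a new cluster.
    """
    clusters: list[tuple[Counter, list[str]]] = []
    for msg in messages:
        vec = vectorize(msg)
        for representative, members in clusters:
            if cosine(representative, vec) >= threshold:
                members.append(msg)
                break
        else:
            clusters.append((vec, [msg]))
    return [members for _, members in clusters]


if __name__ == "__main__":
    # Made-up example messages, not real logs.
    sample = [
        "NullPointerException at frame 1042 in renderer",
        "Timeout connecting to 10.0.0.5 after 3000 ms",
        "NullPointerException at frame 77 in renderer",
        "Timeout connecting to 10.0.0.9 after 5000 ms",
    ]
    for group in cluster(sample):
        print(len(group), group[0])
```

At real log volumes, a production system would more likely use TF-IDF or embedding vectors and a library clustering algorithm; the greedy pass here only shows the shape of the idea.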
Syllabus
SREcon25 Americas - Using Statistical Techniques to Automatically Detect Game-Breaking Issues
Taught by
USENIX