Overview
This conference talk explores how Netflix manages its complex architecture to handle significant demand shifts across its global streaming platform. Learn about the four-region full-active architecture of Netflix's streaming control plane and the techniques used to shape and prioritize traffic as load fluctuates. Discover how Netflix balances (and sometimes intentionally unbalances) load, performs partial or complete failovers, and uses traffic shifting to mitigate demand changes. Examine its approach to capacity management through intelligent pre-scaling, automated service buffer management, load shedding, and rapid autoscaling. The presentation covers the difficult tradeoffs between system stability and user experience, explaining how Netflix degrades service gracefully while maintaining the highest possible quality of experience. The talk concludes with insights into Netflix's underlying data architecture, including resilience techniques for stateful systems such as data gateways, capacity planning, sharding, and strategic caching.
Syllabus
SREcon25 Americas - Techniques Netflix Uses to Weather Significant Demand Shifts
Taught by
USENIX