Serving DNNs like Clockwork - Performance Predictability from the Bottom Up

Overview

This course teaches how to achieve performance predictability in model serving systems for Deep Neural Networks (DNNs). Learners will come away understanding why low latency matters in model serving, recognizing that DNN inference execution times are predictable, and knowing a principled design methodology for building a distributed model serving system with predictable end-to-end performance.

The skills taught center on designing and implementing a system like Clockwork, which supports multiple models while meeting latency targets and request-level service-level objectives. The material is presented through a structured syllabus covering High Tail Latencies, Predictable Worker, and Clockwork.

The intended audience is professionals who work with machine learning inference or model serving systems, and anyone interested in improving performance predictability in DNN applications.

Syllabus

Introduction
High Tail Latencies
Predictable Worker
Clockwork
Clockwork Example
Conclusion

Taught by

USENIX
