Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads

USENIX via YouTube

Overview

This USENIX presentation covers heterogeneity-aware cluster scheduling policies for deep learning workloads, centered on Gavel, a heterogeneity-aware cluster scheduler. It explains how to allocate heterogeneous hardware resources among training jobs when performance varies across hardware and when scheduling objectives are diverse. Topics include expressing scheduling policies as optimization problems, representing allocations as fractions of time, using effective throughput, and realizing allocations with a round-based scheduling mechanism. It is intended for anyone interested in cluster scheduling, deep learning workloads, or resource allocation in heterogeneous environments.
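To make the idea of effective throughput concrete, here is a minimal Python sketch. It assumes effective throughput means the time-weighted average of a job's throughput over the accelerator types it runs on, with the allocation given as fractions of time. The function name and all numbers are hypothetical and chosen only for illustration; this is not code from the talk.

```python
def effective_throughput(throughputs, allocation):
    """Time-weighted average throughput for each job.

    throughputs[m][j]: steps/sec of job m on accelerator type j (made-up values).
    allocation[m][j]:  fraction of wall-clock time job m spends on type j.
    """
    result = []
    for t_row, x_row in zip(throughputs, allocation):
        # A job cannot be scheduled for more than 100% of the time.
        if sum(x_row) > 1.0 + 1e-9:
            raise ValueError("a job's time fractions cannot exceed 1")
        result.append(sum(t * x for t, x in zip(t_row, x_row)))
    return result


# Hypothetical example: two jobs, two accelerator types (fast, slow).
throughputs = [[40.0, 10.0], [12.0, 8.0]]
allocation = [[0.5, 0.5], [0.25, 0.75]]
rates = effective_throughput(throughputs, allocation)
```

Under this assumption, a scheduling policy can be phrased as choosing the allocation that optimizes some function of these per-job rates, which is one way to read the "policies as optimization problems" topic.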

Syllabus

Intro
Hardware for ML training is becoming highly specialized and heterogeneous!
How should we allocate heterogeneous resources?
Challenge 1: Heterogeneous performance
Challenge 2: Diverse scheduling objectives
Related work
Gavel: A new heterogeneity-aware cluster scheduler
Scheduling policies to be made heterogeneity-aware
Policies as optimization problems
Allocations (x) as time fractions
Effective throughput
Performance optimizations: space sharing and placement
How do we realize an optimal allocation?
Gavel's round-based scheduling
Main questions
Gavel improves objectives on a heterogeneous cluster
Gavel can enable the same heterogeneous cluster to support higher input load
Gavel can support hierarchical policies
Gavel scales to clusters with hundreds of active jobs
Conclusion
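
The round-based scheduling item above can be illustrated with a simplified Python sketch. It assumes a single accelerator type and a priority rule in which each job's priority is its target time fraction divided by the fraction of rounds it has received so far, so jobs that are behind their target are scheduled first. The function `run_rounds`, the job names, and the target fractions are hypothetical; the mechanism presented in the talk may differ in its details.

```python
def run_rounds(targets, num_workers, num_rounds):
    """Return how many rounds each job received.

    targets: job name -> target fraction of time on the single accelerator type.
    num_workers: number of jobs that can run in each round.
    """
    received = {job: 0 for job in targets}

    def priority(job, rnd):
        if targets[job] == 0:
            return 0.0  # job is not entitled to any time
        if received[job] == 0:
            return float("inf")  # never scheduled yet: run it first
        # Target fraction divided by the fraction of rounds received so far.
        return targets[job] / (received[job] / rnd)

    for rnd in range(num_rounds):
        prio = {job: priority(job, rnd) for job in targets}
        # Highest priority first; ties are broken by job name for determinism.
        ranked = sorted(
            (job for job in targets if prio[job] > 0),
            key=lambda job: (-prio[job], job),
        )
        for job in ranked[:num_workers]:
            received[job] += 1
    return received


# Hypothetical targets: job "a" should get half of the time.
counts = run_rounds({"a": 0.5, "b": 0.25, "c": 0.25}, num_workers=1, num_rounds=8)
```

Over enough rounds, the share of rounds each job receives tracks its target fraction, which is the property a round-based mechanism needs in order to realize a fractional allocation on discrete hardware.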

Taught by

USENIX
