CNCF [Cloud Native Computing Foundation]

AI Workload Preemption in a Multi-Cluster Scheduling System at Bloomberg

CNCF [Cloud Native Computing Foundation] via YouTube

Overview

Explore how Bloomberg implemented Karmada's Priority and Preemption feature to efficiently manage machine learning workloads across multiple clusters in this 28-minute conference talk from CNCF. Leon Zhou and Wei-Cheng Lai from Bloomberg discuss their approach to ensuring high-impact AI workloads receive priority access to GPU resources as the company's AI usage grows rapidly. Learn about the challenges of balancing resource allocation between high-priority and lower-priority ML batch jobs, and how Karmada helps prevent business-critical workloads from being starved of resources during high contention periods. Gain practical insights into configuring and managing multi-cluster environments while maintaining efficient execution of ML jobs. This presentation is particularly valuable for Kubernetes administrators and engineers responsible for managing large-scale machine learning workloads.

Syllabus

AI Workload Preemption in a Multi-Cluster Scheduling System at Bloomberg - Leon Zhou & Wei-Cheng Lai

Taught by

CNCF [Cloud Native Computing Foundation]
