Unlocking the Future of GPU Scheduling in Kubernetes with Reinforcement Learning
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Explore a conference talk on using Reinforcement Learning (RL) for GPU scheduling in Kubernetes. Learn how to address common challenges in multi-GPU setups, including resource fragmentation and low utilization, both of which hurt performance and raise costs. Discover why RL is well suited to this problem: it can adapt to dynamic environments and handle complex, multi-dimensional objectives within Kubernetes clusters. Gain insights into state-of-the-art RL algorithms for scheduling, their implementation in Kubernetes, and the potential applications of Reinforcement Learning from Human Feedback (RLHF). The talk is aimed at AI and cloud computing professionals interested in optimizing GPU resource management in large-scale machine learning projects.
Syllabus
Unlocking the Future of GPU Scheduling in Kubernetes with Reinforcement Learning - N. Goyal, A. Gupta
Taught by
CNCF [Cloud Native Computing Foundation]