Speed up Your ML Workloads With Kubernetes Powered In-memory Data Caching
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
This conference talk explores how to accelerate machine learning workloads through Kubernetes-powered in-memory data caching. Presented by Rasik Pandey and Akshay Chitneni from Apple, the 26-minute session addresses the challenges of repetitive data access during model training, which can be both slow and costly in cloud environments, often leaving GPU resources idle. Learn about an optimized data caching solution that combines Apache Iceberg, Apache Arrow Flight, and Kubernetes to improve training efficiency. The presenters demonstrate a distributed in-memory cache implementation across Kubernetes pods that enhances data loading for Kubeflow training workloads. This solution is particularly relevant in the GenAI era where datasets and training workloads continue to grow in size and complexity, making efficient data access increasingly critical.
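The listing doesn't include implementation details, but the core idea the talk describes, keeping dataset partitions in memory so repeated training epochs reuse data instead of re-reading it from remote storage, can be sketched in a few lines of Python. The class name, loader function, and partition IDs below are illustrative assumptions, not the presenters' implementation:

```python
class InMemoryPartitionCache:
    """Illustrative sketch: hold dataset partitions in memory so that
    repeated reads across training epochs skip slow remote storage."""

    def __init__(self, loader):
        self._loader = loader   # function: partition_id -> rows (hypothetical)
        self._cache = {}        # partition_id -> cached rows
        self.hits = 0
        self.misses = 0

    def get(self, partition_id):
        if partition_id in self._cache:
            self.hits += 1      # served from memory, no remote read
            return self._cache[partition_id]
        self.misses += 1
        data = self._loader(partition_id)   # e.g. fetch a file from object storage
        self._cache[partition_id] = data
        return data

# Stand-in for a slow remote read (placeholder data, not real I/O).
def slow_loader(partition_id):
    return [partition_id] * 3

cache = InMemoryPartitionCache(slow_loader)
for epoch in range(3):                  # three "epochs" touch the same partitions
    for pid in ("part-0", "part-1"):
        cache.get(pid)

print(cache.misses, cache.hits)         # only the first epoch misses
```

In the system the talk presents, a cache like this runs distributed across Kubernetes pods and serves Apache Iceberg table data to Kubeflow training workers over Apache Arrow Flight, rather than living in the training process itself.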
Syllabus
Speed up Your ML Workloads With Kubernetes Powered In-memory Data... Rasik Pandey & Akshay Chitneni
Taught by
CNCF [Cloud Native Computing Foundation]