Overview
This conference talk presents GeminiFS, a companion file system designed for GPUs that enables direct, file-based access to NVMe storage while remaining managed by the host file system. Learn how it addresses the limitations of both CPU-centric and current GPU-centric storage approaches by providing the file abstraction and management functionality needed by GPU-accelerated machine learning applications such as graph neural networks (GNNs) and large language models (LLMs). Discover how GeminiFS synchronizes metadata between the host and GPU file systems, extends the NVMe driver so that control planes can be set up in parallel, and implements a GPU-friendly, software-defined page cache to make full use of the GPU's internal bandwidth. The presentation also covers libGemini, a library that simplifies GPU programming by hiding these underlying complexities. See evaluation results comparing GeminiFS with existing storage solutions on large-scale machine learning workloads, where it delivers higher performance.
Syllabus
FAST '25 - GeminiFS: A Companion File System for GPUs
Taught by
USENIX