Overview
Explore a 16-minute conference talk from USENIX FAST '23 that introduces HadaFS, a file system designed to bridge local and shared burst buffers for exascale supercomputers. Delve into the Localized Triage Architecture (LTA), which addresses the challenges of ultra-scale expansion and data sharing. Learn how the full-path indexing approach and metadata synchronization strategies tackle the complex metadata management issues found in traditional file systems. Discover how HadaFS integrates Hadash, a data management tool that improves data query efficiency and accelerates data migration between burst buffers and traditional HPC storage. Gain insights into the deployment of HadaFS on the Sunway New-generation Supercomputer (SNS), where it serves hundreds of applications and scales to as many as 600,000 clients. Examine the performance evaluations, which cover metadata performance, data performance, data migration, and interference avoidance.
Syllabus
Intro
Burst Buffer in Typical HPC Storage Systems
Challenges with Burst Buffer
Sample BB System: Sunway New-generation Supercomputer
Localized Triage Architecture
Namespace and Metadata Handling
HadaFS I/O Control and Data Flow
Data Forwarding Overhead
Metadata Performance Evaluation
Data Performance Evaluation
Data Migration Evaluation
Interference Avoidance
Taught by
USENIX