Overview
Watch a PLDI 2024 video presentation that addresses a shortcoming in sparse tensor algebra compilation: generating code that efficiently scatters values into sparse result tensors. Learn how researchers from Stanford University designed a compiler that handles this case with sparse intermediate tensors (sparse workspaces). Discover the algorithm template for workspace insertion that forms the backbone of their code generation algorithm and is modular enough to support user-defined workspace implementations. Examine the evaluation results, in which sparse workspaces run up to 27.12× faster than dense workspaces in some scenarios. Understand the trade-offs between sparse and dense workspaces, and why the compiler supports both so that each can be used where it performs best. Gain insight into how this approach produces competitive sequential code that generalizes to any expression, and how sparse workspaces can asymptotically decrease memory usage for tensor computations on large datasets.
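The contrast between dense and sparse workspaces mentioned in this overview can be illustrated with a small hand-written sketch. It is not the code the presented compiler generates. It computes one row of a sparse matrix product twice: once scattering into a dense array and once into a hash map standing in for a sparse workspace. The CSR-style `(pos, crd, vals)` layout, the function names, and the choice of a Python `dict` as the sparse workspace are all assumptions made for illustration.

```python
# Illustrative sketch only; function names and data layout are hypothetical.
# A sparse matrix is a CSR-style triple (pos, crd, vals):
#   row i owns entries pos[i] .. pos[i+1]-1 of crd (column ids) and vals.

def row_with_dense_workspace(i, A, B, ncols):
    """Row i of A @ B, scattering into a dense workspace of length ncols."""
    a_pos, a_crd, a_vals = A
    b_pos, b_crd, b_vals = B
    ws = [0.0] * ncols            # memory grows with the full row width
    touched = [False] * ncols
    coords = []
    for pa in range(a_pos[i], a_pos[i + 1]):
        k = a_crd[pa]
        for pb in range(b_pos[k], b_pos[k + 1]):
            j = b_crd[pb]
            if not touched[j]:    # remember which columns were written
                touched[j] = True
                coords.append(j)
            ws[j] += a_vals[pa] * b_vals[pb]
    coords.sort()                 # the sparse result needs ordered coordinates
    return coords, [ws[j] for j in coords]


def row_with_sparse_workspace(i, A, B):
    """Row i of A @ B, scattering into a hash map (a stand-in sparse workspace)."""
    a_pos, a_crd, a_vals = A
    b_pos, b_crd, b_vals = B
    ws = {}                       # memory grows only with the nonzeros produced
    for pa in range(a_pos[i], a_pos[i + 1]):
        k = a_crd[pa]
        for pb in range(b_pos[k], b_pos[k + 1]):
            j = b_crd[pb]
            ws[j] = ws.get(j, 0.0) + a_vals[pa] * b_vals[pb]
    coords = sorted(ws)
    return coords, [ws[j] for j in coords]


# A = [[1, 0, 2],        B = [[0, 4, 0, 0],
#      [0, 3, 0]]             [5, 0, 0, 0],
#                             [0, 6, 0, 7]]
A = ([0, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0])
B = ([0, 1, 2, 4], [1, 0, 1, 3], [4.0, 5.0, 6.0, 7.0])

dense_row0 = row_with_dense_workspace(0, A, B, ncols=4)
sparse_row0 = row_with_sparse_workspace(0, A, B)
print(dense_row0)   # ([1, 3], [16.0, 14.0])
print(sparse_row0)  # ([1, 3], [16.0, 14.0])
```

Both versions return the same row. The dense version pays for an array as wide as the result on every row, while the hash-map version pays per nonzero but adds lookup overhead, which is one plausible reading of why neither kind of workspace wins in every situation.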
Syllabus
[PLDI24] Compilation of Modular and General Sparse Workspaces
Taught by
ACM SIGPLAN