Overview
Explore the evolution of Alibaba's Pangu storage system in this 15-minute conference talk from USENIX FAST '23. Discover how Pangu adapted to hardware advances and changing business models to deliver high-performance, reliable storage services with I/O latency on the order of 100 microseconds. Learn about the system's two-phase evolution: first, the adoption of SSD storage and RDMA networking; second, the shift from a volume-oriented to a performance-oriented storage service. Gain insights into key design innovations such as traffic amplification reduction, remote direct cache access, and CPU computation offloading. Understand how these designs allowed Pangu to take full advantage of hardware upgrades, including storage servers with higher SSD capacity and higher RDMA bandwidth. Benefit from the operational experience and lessons learned while developing and running Pangu in Alibaba's cloud infrastructure.
Syllabus
Intro
Pangu: The Unified Storage Platform in Alibaba Cloud
Storage Architecture Based on Pangu 2.0
Performance Design Challenges for Pangu 2.0
Full Stack Optimization for Latency
Methods to Guarantee Performance SLA
Latency and SLA Evaluation
Higher I/O Efficiency, Better Performance
Breaking Network Bottleneck to Improve I/O Efficiency
Breaking Memory Bottleneck to Improve I/O Efficiency
Taught by
USENIX