Overview
This course explains how to enable on-device learning by applying 8-bit fixed-point (INT8) quantization to both the forward and backward passes of a deep model. Key skills include implementing Loss-aware Compensation (LAC) and Parameterized Range Clipping (PRC) to optimize computation, building a compensation layer that counteracts quantization error, and using the Octo system for on-device learning. The material walks through the INT8 training method in detail: the co-design of network and training engine, loss-aware compensation, backward quantization, and the evaluation setup. It is intended for anyone interested in on-device learning, edge intelligence, or optimizing deep learning models for resource-constrained environments.
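As a rough illustration of the quantization error that a compensation layer is meant to counteract, the sketch below applies generic symmetric per-tensor INT8 quantization to a small array. It is not the Octo implementation: the function names `quantize_int8` and `dequantize`, and the optional `clip` argument (a fixed clipping range, not the learned PRC parameter), are assumptions made for this example.

```python
import numpy as np

def quantize_int8(x, clip=None):
    """Symmetric per-tensor quantization to INT8 (illustrative only).

    clip is a hypothetical fixed clipping range; when omitted,
    the largest absolute value in x is used instead.
    """
    x = np.asarray(x, dtype=np.float32)
    max_abs = float(np.max(np.abs(x))) if clip is None else float(clip)
    # Map [-max_abs, max_abs] onto the integer range [-127, 127].
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map INT8 values back to approximate float32 values."""
    return q.astype(np.float32) * scale

x = np.array([-1.0, -0.5, 0.0, 0.25, 1.0], dtype=np.float32)
q, scale = quantize_int8(x)
x_hat = dequantize(q, scale)

# The residual is the information lost by rounding to 8 bits.
error = x - x_hat
print(q, scale, error)
```

Passing a smaller `clip` value saturates outliers at ±127 and gives finer resolution to the remaining values, which is the basic trade-off that any range-clipping scheme has to balance.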
Syllabus
Intro
Rise of On-device Learning
Common Compression Methods
The Workflow of DNN Training
Bridge the Gap: Data Quantization
Why We Need Quantization?
Potential Gains
Co-design of Network and Training Engine
Our System: Octo
Loss-aware Compensation
Backward Quantization
Evaluation Setup
Convergence Results
Ablation Study: Impact of LAC and PRC
Image Processing Throughput
Deep Insight of Feature Distribution
Visualization of Intermediate Feature Distribution
System Overhead
Conclusion
Taught by
USENIX