This course teaches learners how to accelerate Deep Neural Network (DNN) training by jointly optimizing algebraic transformations and parallelization, the approach taken by Unity. Students will learn how both parallelization strategies and algebraic transformations can be represented as substitutions on a unified parallel computation graph (PCG), enabling a single optimizer to improve distributed DNN training procedures. The course covers techniques for automatically generating graph substitutions, hierarchical search algorithms for applying them at scale, and formal verification of their correctness using an automated theorem prover. The teaching method combines theoretical concepts, practical demonstrations, and real-world evaluations. The course is intended for anyone interested in optimizing distributed DNN training, including researchers, data scientists, and machine learning engineers.
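To make the central idea concrete, here is a minimal, hypothetical sketch of what "substitutions on a parallel computation graph" can look like. The `Op`, `substitute`, `fused_matmul`, `partition`, and `combine` names are illustrative assumptions for this example, not Unity's actual API; the point is that an algebraic rewrite and a parallelization decision are both expressed as the same kind of graph substitution.

```python
# Illustrative sketch only -- names and structure are assumptions,
# not Unity's real implementation. A graph is modeled as a simple
# chain of ops; a substitution replaces a matched pattern of op
# names with a new sequence of ops.

from dataclasses import dataclass


@dataclass
class Op:
    name: str  # operator kind, e.g. "matmul", "partition"


def substitute(graph, source, target):
    """Replace the first occurrence of the `source` pattern
    (a run of op names) with freshly built `target` ops."""
    names = [op.name for op in graph]
    for i in range(len(graph) - len(source) + 1):
        if names[i:i + len(source)] == source:
            return graph[:i] + [Op(n) for n in target] + graph[i + len(source):]
    return graph  # no match: graph is unchanged


# Algebraic transformation as a substitution: two chained matmuls
# fuse into one, by associativity (x @ A) @ B == x @ (A @ B).
g = [Op("matmul"), Op("matmul"), Op("relu")]
g = substitute(g, ["matmul", "matmul"], ["fused_matmul"])

# Parallelization as a substitution on the *same* graph: one op
# becomes partition -> op -> combine (data parallelism).
g = substitute(g, ["fused_matmul"], ["partition", "fused_matmul", "combine"])
print([op.name for op in g])
# -> ['partition', 'fused_matmul', 'combine', 'relu']
```

Because both kinds of rewrites share one representation, a search procedure can interleave them freely instead of optimizing algebra first and parallelism second.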
Overview
Syllabus
Introduction
Unity's Goal
Parallelization
Parallel Computation Graph
Data Parallelization
PCG Advantages
Techniques
Results
Conclusion
Taught by
USENIX