Class Central Classrooms beta
YouTube videos curated by Class Central.
Classroom Contents
Visually Explaining Mixture of Experts LLMs like DeepSeek and Mixtral - How to Code
- 1 0:00 - Intro
- 2 1:52 - Mixture of Experts Intuition
- 3 4:53 - Transformers 101
- 4 9:20 - Dense MOEs
- 5 14:50 - Sparse MOEs
- 6 16:34 - Router Collapse and Top-K Routing
- 7 19:20 - Noisy Top-K, Load Balancing
- 8 20:56 - Routing Analysis by Mixtral
- 9 22:30 - Auxiliary Losses & DeepSeek
- 10 24:05 - Expert Capacity
- 11 26:07 - 6 Points to Remember