Visually Explaining Mixture of Experts LLMs like DeepSeek and Mixtral - How to Code

Neural Breakdown with AVB via YouTube

Classroom Contents

  1. 0:00 - Intro
  2. 1:52 - Mixture of Experts Intuition
  3. 4:53 - Transformers 101
  4. 9:20 - Dense MOEs
  5. 14:50 - Sparse MOEs
  6. 16:34 - Router Collapse and Top-K Routing
  7. 19:20 - Noisy TopK, Load Balancing
  8. 20:56 - Routing Analysis by Mixtral
  9. 22:30 - Auxiliary Losses & DeepSeek
  10. 24:05 - Expert Capacity
  11. 26:07 - 6 Points to Remember
