
Training Overparametrized Neural Networks: Early Alignment Phenomenon and Simplicity Bias
Institut des Hautes Etudes Scientifiques (IHES) via YouTube
Overview

This lecture explores the theoretical understanding of neural network training with first-order methods, focusing on the early alignment phenomenon in two-layer networks with small initializations. Discover how neurons align toward key directions during early training phases, creating a form of sparsity that can lead to local minima with surprisingly good generalization properties. Learn why these "spurious" local minima can actually yield lower excess risks than global minimizers of the training loss, demonstrating how early alignment contributes to a beneficial simplicity bias that improves test performance. Presented by Etienne Boursier from Inria and Université Paris-Saclay, this 57-minute talk provides insights into the implicit bias of optimization algorithms and their impact on neural network convergence.
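The early alignment effect described above can be illustrated with a small experiment that is not part of the talk itself: the toy teacher direction, network width, and hyperparameters below are arbitrary choices made for demonstration only. This sketch trains a two-layer ReLU network from a small initialization with plain gradient descent and prints how the hidden-neuron directions correlate with the teacher direction as training proceeds.

```python
import numpy as np

# Toy illustration (not from the talk): a two-layer ReLU network with small
# initialization, trained by plain gradient descent on a simple teacher task.
# We track the directions of the hidden neurons to observe them concentrating
# on a few key directions early in training.

rng = np.random.default_rng(0)

# Assumed toy teacher: y = relu(x . u) for a fixed direction u.
u = np.array([1.0, 0.0])
X = rng.standard_normal((200, 2))
y = np.maximum(X @ u, 0.0)

# Student network: m hidden neurons, weights initialized at a very small scale.
m, scale, lr, steps = 50, 1e-4, 0.05, 5000
W = scale * rng.standard_normal((m, 2))   # input weights w_j
a = scale * rng.standard_normal(m)        # output weights a_j

def forward(X, W, a):
    H = np.maximum(X @ W.T, 0.0)          # hidden activations, shape (n, m)
    return H, H @ a                        # network predictions f(x)

for t in range(steps):
    H, pred = forward(X, W, a)
    err = pred - y                         # residual of the squared loss
    # Gradients of the mean squared error with respect to a and W.
    grad_a = H.T @ err / len(X)
    grad_W = ((err[:, None] * (H > 0)) * a[None, :]).T @ X / len(X)
    a -= lr * grad_a
    W -= lr * grad_W

    if t % 1000 == 0:
        # Cosine similarity between each neuron direction and the teacher u:
        # with small initialization, the neuron directions typically cluster
        # around a few directions before the training loss moves much.
        dirs = W / (np.linalg.norm(W, axis=1, keepdims=True) + 1e-12)
        cos = dirs @ u
        loss = 0.5 * np.mean(err ** 2)
        print(f"step {t:4d}  loss {loss:.4f}  "
              f"median |cos(w_j, u)| {np.median(np.abs(cos)):.3f}")
```

The initialization scale, learning rate, and step count here are illustrative; the qualitative behavior (directional alignment preceding loss decrease) is what the lecture's small-initialization regime refers to.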
Syllabus
Etienne Boursier - Training Overparametrized Neural Networks: Early Alignment Phenomenon and Simplicity Bias
Taught by
Institut des Hautes Etudes Scientifiques (IHES)