Building Makemore - Activations & Gradients, BatchNorm

Andrej Karpathy via YouTube

Overview

The course delves into the internals of Multilayer Perceptrons (MLPs), focusing on the statistics of forward-pass activations and backward-pass gradients and on why properly scaling the weights at initialization matters. It covers diagnostic tools and visualizations for judging the health of a deep network, introduces Batch Normalization as a key innovation that makes deep networks easier to train, and previews upcoming topics such as residual connections and the Adam optimizer. Practical exercises are included to deepen understanding, and the teaching mixes theoretical explanation, code demonstrations, and visualizations. The course is intended for anyone interested in deep learning, neural networks, and improving how deep networks are trained.
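
To give a flavor of the two core ideas, scaling the initial weights by roughly gain / sqrt(fan_in) ("Kaiming init") and normalizing hidden pre-activations with Batch Normalization, here is a minimal sketch in the spirit of the lecture. It is not the lecture's actual code: the layer sizes, the random minibatch of character indices, and the 5/3 tanh gain are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(42)

# Assumed sizes for illustration (not necessarily the lecture's exact values)
vocab_size, block_size, n_embd, n_hidden = 27, 3, 10, 200
fan_in = n_embd * block_size

# "Kaiming init": scale hidden weights by gain / sqrt(fan_in) so pre-activations
# keep roughly unit variance; the gain is about 5/3 for tanh.
C  = torch.randn((vocab_size, n_embd))
W1 = torch.randn((fan_in, n_hidden)) * (5/3) / fan_in**0.5
W2 = torch.randn((n_hidden, vocab_size)) * 0.01   # small output weights -> sensible initial loss
b2 = torch.zeros(vocab_size)

# Batch-norm scale and shift parameters
bngain = torch.ones((1, n_hidden))
bnbias = torch.zeros((1, n_hidden))

# A fake minibatch of character-index contexts (assumed data for this sketch)
Xb = torch.randint(0, vocab_size, (32, block_size))
Yb = torch.randint(0, vocab_size, (32,))

emb = C[Xb].view(Xb.shape[0], -1)   # embed characters and flatten the context
hpreact = emb @ W1                  # linear layer (a bias here is redundant before batch norm)
# Batch normalization: standardize over the batch dimension, then scale and shift
hpreact = bngain * (hpreact - hpreact.mean(0, keepdim=True)) / (hpreact.std(0, keepdim=True) + 1e-5) + bnbias
h = torch.tanh(hpreact)             # well-scaled inputs keep tanh out of its flat, saturated region
logits = h @ W2 + b2
loss = F.cross_entropy(logits, Yb)
print(loss.item())                  # close to -log(1/27) ~ 3.3 at initialization, as expected
```

With this kind of initialization the starting loss matches what a uniform guess over the vocabulary would give, and the tanh units receive inputs that are neither collapsed to zero nor pushed into saturation, which is exactly what the lecture's visualizations are designed to check.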

Syllabus

intro
starter code
fixing the initial loss
fixing the saturated tanh
calculating the init scale: “Kaiming init”
batch normalization
batch normalization: summary
real example: resnet50 walkthrough
summary of the lecture
just kidding: part2: PyTorch-ifying the code
viz #1: forward pass activations statistics
viz #2: backward pass gradient statistics
the fully linear case of no non-linearities
viz #3: parameter activation and gradient statistics
viz #4: update:data ratio over time (see the sketch after this syllabus)
bringing back batchnorm, looking at the visualizations
summary of the lecture for real this time
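
As a concrete example of the diagnostics listed above, the update:data ratio (viz #4) compares the size of each gradient step to the size of the parameters it updates. The following is a hypothetical sketch using plain SGD on made-up tensors, not the lecture's code; the tensor shapes and toy loss are assumptions for illustration.

```python
import torch

# Hypothetical toy setup: two parameter tensors trained with plain SGD
params = [torch.randn(30, 200, requires_grad=True),
          torch.randn(200, 27, requires_grad=True)]
lr = 0.1
ud = []   # per step, a list of log10(update/data) ratios, one per parameter

for step in range(100):
    loss = sum((p ** 2).sum() for p in params)   # stand-in loss just to produce gradients
    for p in params:
        p.grad = None
    loss.backward()
    with torch.no_grad():
        for p in params:
            p -= lr * p.grad
        # Ratio of the update's scale to the parameter's scale, on a log10 axis;
        # a common rule of thumb is that around 1e-3 (i.e. -3) is healthy.
        ud.append([(lr * p.grad.std() / p.data.std()).log10().item() for p in params])
```

Plotting each column of ud over the training steps reproduces, in spirit, the "update:data ratio over time" chart used in the lecture to judge whether the learning rate is too small or too large.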

Taught by

Andrej Karpathy
