Class Central Classrooms (beta)
YouTube videos curated by Class Central.
Classroom Contents
Create a Large Language Model from Scratch with Python – Tutorial
- 1 Intro
- 2 Install Libraries
- 3 Pylzma build tools
- 4 Jupyter Notebook
- 5 Download Wizard of Oz
- 6 Experimenting with text file
- 7 Character-level tokenizer (sketched after this list)
- 8 Types of tokenizers
- 9 Tensors instead of Arrays
- 10 Linear Algebra heads up
- 11 Train and validation splits
- 12 Premise of Bigram Model
- 13 Inputs and Targets
- 14 Inputs and Targets Implementation
- 15 Batch size hyperparameter
- 16 Switching from CPU to CUDA
- 17 PyTorch Overview
- 18 CPU vs GPU performance in PyTorch
- 19 More PyTorch Functions
- 20 Embedding Vectors
- 21 Embedding Implementation
- 22 Dot Product and Matrix Multiplication
- 23 Matmul Implementation
- 24 Int vs Float
- 25 Recap and get_batch (sketched after this list)
- 26 nn.Module subclass
- 27 Gradient Descent
- 28 Logits and Reshaping
- 29 Generate function and giving the model some context
- 30 Logits Dimensionality
- 31 Training loop + Optimizer + zero_grad explanation (sketched after this list)
- 32 Optimizers Overview
- 33 Applications of Optimizers
- 34 Loss reporting + Train vs Eval mode
- 35 Normalization Overview
- 36 ReLU, Sigmoid, Tanh Activations
- 37 Transformer and Self-Attention
- 38 Transformer Architecture
- 39 Building a GPT, not a full Transformer model
- 40 Self-Attention Deep Dive
- 41 GPT architecture
- 42 Switching to MacBook
- 43 Implementing Positional Encoding
- 44 GPTLanguageModel initialization
- 45 GPTLanguageModel forward pass
- 46 Standard Deviation for model parameters
- 47 Transformer Blocks
- 48 FeedForward network
- 49 Multi-head Attention
- 50 Dot product attention
- 51 Why we scale by 1/sqrt(d_k) (see the attention sketch after this list)
- 52 Sequential vs ModuleList Processing
- 53 Overview Hyperparameters
- 54 Fixing errors, refining
- 55 Begin training
- 56 OpenWebText download and Survey of LLMs paper
- 57 How the dataloader/batch getter will have to change
- 58 Extract corpus with WinRAR
- 59 Python data extractor
- 60 Adjusting for train and val splits
- 61 Adding dataloader
- 62 Training on OpenWebText
- 63 Training works well, model loading/saving
- 64 Pickling (sketched after this list)
- 65 Fixing errors + GPU Memory in task manager
- 66 Command line argument parsing (sketched after this list)
- 67 Porting code to script
- 68 Prompt: Completion feature + more errors
- 69 nn.Module inheritance + generation cropping
- 70 Pretraining vs Finetuning
- 71 R&D pointers
- 72 Outro
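
A minimal sketch of the character-level tokenizer covered in chapter 7, assuming a plain-text corpus; the names `stoi`, `itos`, `encode`, and `decode` are illustrative, not necessarily the course's exact code:

```python
# Character-level tokenizer: every unique character maps to one integer id.
text = "hello wizard of oz"                    # stand-in for the downloaded corpus
chars = sorted(set(text))                      # vocabulary = the unique characters
stoi = {ch: i for i, ch in enumerate(chars)}   # char -> id
itos = {i: ch for i, ch in enumerate(chars)}   # id -> char

encode = lambda s: [stoi[c] for c in s]              # text -> list of ids
decode = lambda ids: "".join(itos[i] for i in ids)   # ids -> text

assert decode(encode("wizard")) == "wizard"
```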
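
A minimal sketch of the inputs/targets batch getter from chapters 13-15 and 25, assuming `data` is a 1-D tensor of token ids; `block_size` and `batch_size` are the hyperparameters named in the chapter list:

```python
import torch

block_size = 8   # context length per training example
batch_size = 4   # examples per batch

def get_batch(data: torch.Tensor):
    # Random starting offsets, leaving room for the one-token target shift.
    ix = torch.randint(len(data) - block_size, (batch_size,))
    x = torch.stack([data[i : i + block_size] for i in ix])          # inputs
    y = torch.stack([data[i + 1 : i + block_size + 1] for i in ix])  # targets = inputs shifted by one
    return x, y

data = torch.randint(0, 100, (1000,))  # stand-in for an encoded corpus
xb, yb = get_batch(data)               # both of shape (batch_size, block_size)
```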
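
A minimal sketch of the training loop from chapter 31, reusing `get_batch` and `data` from the sketch above with a bigram-style stand-in model (an embedding table whose rows are next-token logits). The point of `optimizer.zero_grad()` is that `backward()` accumulates gradients, so stale gradients from the previous step must be cleared first:

```python
import torch
import torch.nn.functional as F

vocab_size = 100                                    # matches the stand-in data above
model = torch.nn.Embedding(vocab_size, vocab_size)  # bigram logit table
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

for step in range(100):
    xb, yb = get_batch(data)                        # (batch_size, block_size) token ids
    logits = model(xb)                              # (batch_size, block_size, vocab_size)
    loss = F.cross_entropy(logits.view(-1, vocab_size), yb.view(-1))
    optimizer.zero_grad(set_to_none=True)           # reset stale gradients
    loss.backward()                                 # backpropagate the new loss
    optimizer.step()                                # gradient descent update
```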
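
A minimal sketch of causal scaled dot-product attention for chapters 50-51; the tensor names are illustrative. Dot products of d_k-dimensional vectors have variance on the order of d_k, so dividing by sqrt(d_k) keeps the scores near unit variance and stops softmax from saturating into near-one-hot peaks:

```python
import torch
import torch.nn.functional as F

B, T, d_k = 4, 8, 64          # batch size, sequence length, head size
q = torch.randn(B, T, d_k)    # queries
k = torch.randn(B, T, d_k)    # keys
v = torch.randn(B, T, d_k)    # values

# Scaled scores: divide by sqrt(d_k) to keep softmax in its sensitive range.
scores = q @ k.transpose(-2, -1) / d_k ** 0.5   # (B, T, T)

# Causal mask: each position attends only to itself and earlier positions.
tril = torch.tril(torch.ones(T, T))
scores = scores.masked_fill(tril == 0, float("-inf"))

weights = F.softmax(scores, dim=-1)   # rows sum to 1
out = weights @ v                     # (B, T, d_k) weighted sum of values
```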
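
A minimal sketch of model saving and loading with pickle for chapters 63-64, reusing `model` from the training-loop sketch; the filename is illustrative:

```python
import pickle

# Save the whole model object after training...
with open("model-01.pkl", "wb") as f:
    pickle.dump(model, f)

# ...and reload it later for generation or further training.
with open("model-01.pkl", "rb") as f:
    model = pickle.load(f)
```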
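
A minimal sketch of command-line argument parsing for chapter 66, so hyperparameters can be passed to the training script instead of being hard-coded; the flag name and default are illustrative:

```python
import argparse

parser = argparse.ArgumentParser(description="Train the GPT model")
parser.add_argument("-batch_size", type=int, default=32,
                    help="number of sequences per training batch")
args = parser.parse_args()

print(f"Using batch size {args.batch_size}")
```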