
YouTube

How RWKV-7 "Goose" and Its Linear Inference Work

Oxen via YouTube

Overview

This video features a deep dive into RWKV-7 "Goose" with paper author Eugene Cheah, exploring how the model combines an RNN architecture with transformer-like capabilities. Learn why RWKV-7 is generating excitement in the AI community, how to run it efficiently, and how its architecture works at a fundamental level. The discussion covers how RWKV addresses traditional RNN limitations, examines the paper "Reinventing RNNs for the Transformer Era," and features direct insights from Eugene Cheah, who explains the intuition behind each model layer. Discover the parallelization techniques used during training, review benchmark performance, see live evaluations, get fine-tuning tips, and learn the reasoning behind the World Tokenizer. Ideal for AI researchers, developers, and enthusiasts interested in state-of-the-art language models that offer linear inference.
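To make the "linear inference" claim concrete, here is a minimal sketch of a generic linear recurrence. This is illustrative only and is not RWKV-7's actual update rule (the names `W_in` and `decay` and all sizes are hypothetical): it simply shows why RNN-style inference is linear in sequence length, since each token updates a fixed-size state in constant time, whereas attention revisits all previous tokens.

```python
# Illustrative sketch, NOT RWKV-7's real equations: a generic linear
# recurrence with a fixed-size hidden state. Per-token cost is O(1),
# so total inference cost grows linearly with sequence length.
import numpy as np

d = 4                            # toy hidden size (hypothetical)
rng = np.random.default_rng(0)
W_in = rng.normal(size=(d, d))   # hypothetical input projection
decay = 0.9                      # hypothetical scalar state decay

def step(state, x):
    """One token: constant-time update, independent of history length."""
    return decay * state + W_in @ x

state = np.zeros(d)
tokens = rng.normal(size=(10, d))  # 10 toy token embeddings
for x in tokens:                   # one O(1) step per token
    state = step(state, x)

print(state.shape)  # state stays fixed-size no matter how long the sequence
```

The key contrast with attention: here memory and per-token compute do not grow with context length, which is the property the video's "linear inference" discussion centers on.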

Syllabus

0:00 Why is RWKV-7 Goose interesting
2:53 How to quickly run RWKV-7 Goose
4:04 What is RWKV-7
10:20 RNNs forget things
12:33 First paper: Reinventing RNNs for the Transformer Era
24:22 Paper author Eugene Cheah joins the dive
36:43 The intuition behind each model layer
47:57 Parallelization during training
53:01 How well did RWKV-7 do on benchmarks?
56:50 Live evals on RWKV-7 and fine-tuning tips
1:00:38 Why they made the World Tokenizer

Taught by

Oxen
