This conference talk explores the mysterious inner workings of Chain-of-Thought (CoT) reasoning in advanced AI systems. It delves into the limits of our current understanding of how Large Reasoning Models (LRMs) such as OpenAI o1 and DeepSeek-R1 achieve their impressive reasoning performance through scaled test-time compute and extended Chain-of-Thought generation, and it examines what researchers know—and don't know—about CoT as of March 2025, including the "black box" nature of AI reasoning systems. The presentation references recent research, including "Transformers Learn to Implement Multi-step Gradient Descent with Chain of Thought" (2025) by Jianhao Huang, Zixuan Wang, and Jason D. Lee of Shanghai Jiao Tong University and Princeton University, as well as "How Transformers Learn Causal Structure with Gradient Descent" (2024) from Princeton University researchers.
Syllabus
The CoT Black Box: Beyond our Knowledge
Taught by
Discover AI