Overview

In this lecture from the Simons Institute, Boaz Barak of Harvard University explores AI safety through inference-time compute limitations, examining how computational constraints imposed during model inference can provide safety guarantees for large language models (LLMs). The hour-long presentation covers theoretical frameworks and practical approaches to safety measures that do not rely solely on training-time interventions, instead leveraging bounds on computation during model execution to prevent harmful outputs or behaviors.
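As a rough illustration of what an inference-time compute bound could look like in practice (this sketch is not taken from the lecture; the toy_next_token stand-in and the budget values are assumptions), a decode loop might enforce both a token budget and a wall-clock budget, halting generation once either is exhausted:

```python
import time

# Illustrative only: a hard cap on inference-time compute, enforced as both
# a token budget and a wall-clock budget. The model call is a toy stand-in;
# names and budget values here are assumptions, not from the lecture.

MAX_NEW_TOKENS = 64   # assumed per-request token budget
MAX_SECONDS = 2.0     # assumed per-request wall-clock budget

def toy_next_token(prompt_tokens, generated):
    """Stand-in for one next-token step of a real model."""
    return f"tok{len(generated)}"

def generate_with_budget(prompt_tokens):
    """Decode until either compute budget is exhausted, whichever hits first."""
    start = time.monotonic()
    generated = []
    for _ in range(MAX_NEW_TOKENS):           # token budget
        if time.monotonic() - start > MAX_SECONDS:
            break                              # wall-clock budget
        generated.append(toy_next_token(prompt_tokens, generated))
    return generated

if __name__ == "__main__":
    print(generate_with_budget(["example", "prompt"]))
```

The point of such a cap is that a request can never consume more computation than the stated budget, regardless of what the model or prompt tries to do.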
Syllabus
AI Safety Via Inference-Time Compute
Taught by
Simons Institute