Probabilistic Safety Guarantees Using Model Internals

Simons Institute via YouTube

Overview

In this 46-minute talk, Jacob Hilton of the Alignment Research Center explores how to establish probabilistic safety guarantees for large language models by examining their internal mechanisms. The talk was presented at the Simons Institute's Safety-Guaranteed LLMs event and covers technical methods for making safety assurances for advanced AI systems more reliable.

Syllabus

Probabilistic Safety Guarantees Using Model Internals

Taught by

Simons Institute
