Safe Evaluation and Rollout of AI Models

USENIX via YouTube

Overview

This SREcon25 Americas conference talk covers how to evaluate and deploy AI models safely in production. Brendan Burns of Microsoft draws on the development of Azure Copilot to describe the reliability challenges specific to AI systems. A new model or prompt is rarely just "working" or "broken"; its quality has to be assessed probabilistically across many user interactions, and the talk explains how to build evaluation frameworks for that setting. It also covers how to decide whether a model change is an improvement or a regression that needs a fix or a rollback. The approaches presented are currently used in production systems to maintain reliability where AI models are core components of the user experience.
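
As a rough illustration of what a probabilistic comparison between two model versions can look like, the sketch below applies a two-proportion z-test to the pass rates of a baseline and a candidate. It is not taken from the talk: the function names, the pass/fail framing of each interaction, and the 1.96 threshold (about 95% confidence, two-sided) are all assumptions chosen for the example.

```python
import math


def two_proportion_z(base_pass, base_total, cand_pass, cand_total):
    """Return the z-score for the difference in pass rates (candidate - baseline).

    Hypothetical helper: assumes each evaluated interaction is scored pass/fail.
    """
    p_base = base_pass / base_total
    p_cand = cand_pass / cand_total
    # Pooled pass rate under the null hypothesis that both models perform the same.
    pooled = (base_pass + cand_pass) / (base_total + cand_total)
    se = math.sqrt(pooled * (1 - pooled) * (1 / base_total + 1 / cand_total))
    if se == 0:
        # Both models passed (or failed) every case; no measurable difference.
        return 0.0
    return (p_cand - p_base) / se


def rollout_decision(base_pass, base_total, cand_pass, cand_total, z_threshold=1.96):
    """Classify a candidate model as an improvement, a regression, or inconclusive.

    The threshold is an assumed default, not a recommendation from the talk.
    """
    z = two_proportion_z(base_pass, base_total, cand_pass, cand_total)
    if z >= z_threshold:
        return "improvement"
    if z <= -z_threshold:
        return "regression"
    return "inconclusive"


if __name__ == "__main__":
    # Baseline passes 800 of 1000 evaluated interactions, candidate passes 850 of 1000.
    print(rollout_decision(800, 1000, 850, 1000))
```

With these made-up counts the z-score is about 2.94, so the candidate is classified as an improvement; a gap of 800 versus 805 passes out of 1000 would come back as inconclusive, which is the case where more evaluation data is needed before rolling out or rolling back.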

Syllabus

SREcon25 Americas - Safe Evaluation and Rollout of AI Models

Taught by

USENIX
