

How to Evaluate and Improve Your LLM Apps

Shaw Talebi via YouTube

Overview

This 27-minute tutorial explores three essential evaluation methods for Large Language Model (LLM) applications and demonstrates practical implementation strategies. Learn how to run quick "vibe checks" before moving to formal evaluation techniques, including code-based, human-based, and LLM-based approaches. The video walks through each method with practical examples, including a detailed case study on improving an application with an LLM judge (LLM-as-a-judge). Follow along with the provided GitHub code examples and supplementary resources to apply these evaluation frameworks in your own AI projects. Suited to developers and AI enthusiasts looking to systematically assess and improve their LLM application's performance.
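The video's own code lives in the linked GitHub repository; as a rough illustration of the code-based evaluation style it describes, here is a minimal Python sketch (the summarize function and test cases are hypothetical placeholders, not taken from the video):

```python
# Minimal code-based eval sketch: run the app over fixed test cases
# and score each output with a deterministic check.

def summarize(text: str) -> str:
    """Placeholder for the LLM app under test (hypothetical stand-in)."""
    return text[:100]  # a real app would call an LLM here

test_cases = [
    {"input": "LLM evals measure output quality against expectations.",
     "must_contain": "evals"},
    {"input": "Vibe checks are quick, informal spot checks of outputs.",
     "must_contain": "vibe"},
]

def score(output: str, must_contain: str) -> bool:
    # Code-based check: a simple keyword assertion on the output
    return must_contain.lower() in output.lower()

results = [score(summarize(c["input"]), c["must_contain"]) for c in test_cases]
print(f"Passed {sum(results)}/{len(results)} cases")
```

Human-based and LLM-based evaluation follow the same loop, but replace the deterministic score function with a human rating or a judge prompt sent to another model.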

Syllabus

Intro - 0:00
Vibe Checks - 0:27
Evals - 3:26
Type 1: Code-based - 5:58
Type 2: Human-based - 9:34
Type 3: LLM-based - 13:34
Example: Improving y2b with LLM Judge - 15:28

Taught by

Shaw Talebi
