When Machine Learning Isn't Private

USENIX Enigma Conference via YouTube

Overview

This course explores the privacy problems of machine learning models, focusing on the leakage of training data. Learning outcomes include understanding how adversaries can extract personally identifiable information from models such as GPT-2 and why such leakage is difficult to prevent. The course presents differential privacy as a provably secure defense, albeit one that costs model utility. The teaching method is a conference lecture covering the privacy problem and potential solutions. The intended audience includes researchers looking to address privacy concerns in machine learning models and practitioners seeking practical techniques for testing whether a model has memorized its training data.
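The memorization test mentioned above can be prototyped in a few lines. Below is a minimal sketch, assuming the Hugging Face transformers library and the public gpt2 checkpoint (the talk does not prescribe any particular toolkit): sample many completions from the model, then rank them by perplexity, since verbatim-memorized text tends to score far lower (less surprising to the model) than genuinely novel text.

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    # Mean per-token cross-entropy, exponentiated; lower means the
    # model finds the text less surprising.
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

# Step 1: generate a lot of candidate samples from an empty prompt.
start = torch.tensor([[tokenizer.bos_token_id]])
samples = model.generate(
    start, do_sample=True, top_k=40, max_length=64,
    num_return_sequences=100, pad_token_id=tokenizer.eos_token_id,
)
texts = [tokenizer.decode(s, skip_special_tokens=True) for s in samples]

# Step 2: predict membership by ranking; the lowest-perplexity
# samples are the best candidates for memorized training data.
for text in sorted(texts, key=perplexity)[:10]:
    print(repr(text))

Published variants of this attack sharpen the ranking by comparing each sample's perplexity against a second, smaller model or a compression baseline, so that merely common boilerplate is not flagged; the single-model ranking above is the simplest version.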

Syllabus

Do models leak training data?
Act I: Extracting Training Data
A New Attack: Training Data Extraction
1. Generate a lot of data
2. Predict membership
Evaluation
Up to 5% of the output of language models is verbatim copied from the training dataset
Case study: GPT-2
Act II: Ad-hoc privacy isn't
Act III: Whatever can we do?
3. Use differential privacy (see the DP-SGD sketch after this syllabus)
Questions?
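For the third remedy in the syllabus, the standard construction is differentially private SGD (DP-SGD): bound each training example's influence by clipping its per-example gradient, then add Gaussian noise calibrated to that bound, so that the training procedure satisfies (epsilon, delta)-differential privacy, i.e. Pr[M(D) in S] <= e^epsilon * Pr[M(D') in S] + delta for any two datasets D, D' differing in one example. The following is a minimal sketch in plain PyTorch, with illustrative clip_norm and noise_multiplier values and not the talk's own code; a real deployment would use a maintained library and account for the cumulative privacy budget across steps.

import torch

def dp_sgd_step(model, loss_fn, batch, optimizer,
                clip_norm=1.0, noise_multiplier=1.0):
    """One DP-SGD step: per-example gradient clipping plus Gaussian noise."""
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]
    for x, y in batch:  # batch is a list of (input, target) pairs
        optimizer.zero_grad()
        loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        # Clip this example's gradient so its influence is bounded.
        norm = torch.sqrt(sum(p.grad.norm() ** 2 for p in params))
        scale = min(1.0, clip_norm / (norm.item() + 1e-6))
        for s, p in zip(summed, params):
            s.add_(p.grad, alpha=scale)
    # Add Gaussian noise scaled to the clipping bound, then average.
    for p, s in zip(params, summed):
        noise = torch.randn_like(s) * noise_multiplier * clip_norm
        p.grad = (s + noise) / len(batch)
    optimizer.step()

The noise_multiplier knob is the utility trade-off the overview mentions: more noise yields a stronger privacy guarantee but slower learning and lower final accuracy.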

Taught by

USENIX Enigma Conference
