
Can Diffusion Model Disentangle? A Theoretical Perspective

Massachusetts Institute of Technology via YouTube

Overview

This 22-minute talk by Liming Wang of MIT explores the theoretical foundations of how diffusion models can learn disentangled representations. It presents a theoretical framework that establishes identifiability conditions for general disentangled latent variable models, analyzes training dynamics, and derives sample complexity bounds for disentangled latent subspace models. The talk also covers experimental validation across diverse tasks and modalities, including subspace recovery in latent subspace Gaussian mixture models, image colorization, image denoising, and voice conversion for speech classification, and shows how training strategies inspired by the theory, such as style guidance regularization, consistently improve disentanglement performance. Wang, a postdoctoral associate in the Spoken Language Systems Group at MIT CSAIL, focuses his research on practical and theoretical aspects of self-supervised speech processing and multimodal learning, with the goal of making speech and language technology more accessible and inclusive.

Syllabus

Liming Wang, Can Diffusion Model Disentangle? A Theoretical Perspective

Taught by

MIT Embodied Intelligence
