Failure of AI Visual Reasoning in Vision Language Models

Discover AI via YouTube

Overview

This video examines visual reasoning in AI systems, covering both recent research algorithms and real-world applications. It analyzes "Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning," a paper published by researchers from Peking University, the Beijing Academy of Artificial Intelligence, the Chinese Academy of Sciences, and the University of Chinese Academy of Sciences. It then illustrates the current limitations of visual reasoning in Vision Language Models (VLMs) through the presenter's personal experiences with commercial AI systems. The 22-minute presentation highlights the gap between research claims and the practical performance of visual reasoning in AI.

Syllabus

Failure of AI "Visual Reasoning" in VLMs

Taught by

Discover AI
