

Image Captioning and Question Answering Using BLIP-2 Model

Eran Feit via YouTube

Overview

This tutorial shows how to use the BLIP-2 vision-language model from Hugging Face to generate image captions and answer questions about image content. You build a system that first describes an image and then responds to specific queries about the objects and colors in it. The 21-minute video covers installation and a coding walkthrough, with timestamps for navigation (introduction at 00:00, installation at 01:37, coding at 09:41). The complete code is available through the provided Ko-fi link, and more computer vision tutorials are on the creator's blog and YouTube playlist. You can connect with Eran Feit on social platforms or support his work through Ko-fi or Patreon.
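As a rough illustration of the caption-then-question workflow described above, here is a minimal sketch using the Hugging Face `transformers` BLIP-2 classes. It is not the code from the video. The checkpoint name `Salesforce/blip2-opt-2.7b`, the helper names `build_vqa_prompt` and `describe_image`, and the "Question: ... Answer:" prompt template are assumptions chosen for illustration; the sketch also assumes `torch`, `transformers`, and `Pillow` are installed.

```python
from typing import Optional


def build_vqa_prompt(question: str) -> str:
    """Wrap a question in a 'Question: ... Answer:' template (assumed prompt format)."""
    return f"Question: {question.strip()} Answer:"


def describe_image(image_path: str,
                   question: Optional[str] = None,
                   checkpoint: str = "Salesforce/blip2-opt-2.7b") -> str:
    """Caption an image, or answer a question about it when one is given."""
    # Heavy imports are deferred so the prompt helper works without them installed.
    import torch
    from PIL import Image
    from transformers import Blip2ForConditionalGeneration, Blip2Processor

    # Use half precision on a GPU when available, full precision on CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    # Assumed checkpoint; the first call downloads several gigabytes of weights.
    processor = Blip2Processor.from_pretrained(checkpoint)
    model = Blip2ForConditionalGeneration.from_pretrained(
        checkpoint, torch_dtype=dtype
    ).to(device)

    image = Image.open(image_path).convert("RGB")

    if question is None:
        # No text prompt: the model generates a free-form caption.
        inputs = processor(images=image, return_tensors="pt")
    else:
        # With a text prompt: the model continues the prompt with an answer.
        inputs = processor(images=image, text=build_vqa_prompt(question),
                           return_tensors="pt")
    inputs = inputs.to(device, dtype)

    generated_ids = model.generate(**inputs, max_new_tokens=30)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip()
```

With a local image (the file name `photo.jpg` is hypothetical), `describe_image("photo.jpg")` would return a caption and `describe_image("photo.jpg", "What color is the car?")` an answer. This sketch reloads the model on every call for brevity; in practice you would likely load the processor and model once and reuse them.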

Syllabus

00:00 Introduction
01:37 Installation
09:41 Let's start coding ...

Taught by

Eran Feit
