Overview
In this 29-minute DevConf.IN 2025 talk, Kedar Basutkar presents an approach to video search that lets users query video content in natural language, much as they would query a database. The aim is to eliminate manual scrubbing and return precise results quickly. The presentation opens with the challenges of traditional video search, then moves to a live demonstration of natural language video querying with timestamp retrieval. The technical portion covers how vision models generate descriptive captions for video frames, how embeddings represent both video content and user queries in a shared vector space, and how vector databases support efficient retrieval. Practical applications include surveillance footage analysis, media curation, and content-aware video search platforms. The talk is aimed at developers, AI practitioners, and technology enthusiasts interested in the convergence of vision models, natural language processing, and media technologies.
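The caption, embed, and retrieve pipeline described above can be illustrated with a minimal sketch. It is not the speaker's implementation: the captions are hard-coded stand-ins for what a vision model might produce, `embed` is a toy bag-of-words function standing in for a learned embedding model, and the in-memory `INDEX` list stands in for a vector database. All names here are hypothetical.

```python
import math
import re
from collections import Counter

# Assumed input: (timestamp in seconds, caption) pairs, as a vision model
# might emit for sampled frames. These values are made up for illustration.
CAPTIONS = [
    (0.0, "a man walks into an empty parking lot"),
    (12.5, "a red car enters the parking lot"),
    (31.0, "two people talk near a building entrance"),
    (47.5, "a dog runs across the street"),
]


def embed(text):
    """Toy embedding: word counts. A real system would use a learned model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Stand-in for a vector database: captions embedded once, stored with timestamps.
INDEX = [(ts, caption, embed(caption)) for ts, caption in CAPTIONS]


def search(query, top_k=1):
    """Embed the query in the same space as the captions and rank by similarity."""
    query_vec = embed(query)
    scored = sorted(
        ((cosine(query_vec, vec), ts, caption) for ts, caption, vec in INDEX),
        reverse=True,
    )
    return [(ts, caption) for _score, ts, caption in scored[:top_k]]


if __name__ == "__main__":
    for ts, caption in search("red car"):
        print(f"{ts:.1f}s: {caption}")
```

The key design point is that queries and captions pass through the same `embed` function, so nearest-neighbor search in that shared space maps a text query back to a timestamp. Swapping in a real embedding model and a vector store would change only `embed` and `INDEX`.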
Syllabus
Natural Language Meets Video: Querying Videos Like a Database - DevConf.IN 2025
Taught by
DevConf