Few Shot Code Generation to Autonomous Software Engineering Agents - From Benchmarks to Multimodal Systems

MLOps.community via YouTube

Overview

Explore research on autonomous software engineering agents in this MLOps.community conference talk. Delve into three key works that demonstrate the potential of software engineering as a testing ground for next-generation language models. Learn about SWE-bench, a benchmark that evaluates whether AI systems can resolve real GitHub issues, built from 2,294 tasks drawn from Python repositories, and discover SWE-agent, an autonomous system that achieves a 12.5% resolved rate on the SWE-bench test set. Examine the findings of SWE-bench Multimodal, built from 617 tasks drawn from JavaScript repositories, which highlight the importance of generalizability in AI systems and reveal potential Python-specific biases in existing coding agents. Presented by John Yang, a PhD student at Stanford University whose research focuses on language agents, language model evaluation, and software engineering.

Syllabus

Few Shot Code Generation to Autonomous Software Engineering Agents // John Yang

Taught by

MLOps.community
