Overview
Join this Allen School Colloquia Series talk, in which Simran Arora of Stanford University presents "Pareto-efficient AI systems: Expanding the quality and efficiency frontier of AI." The talk explores how to build language model architectures that balance quality against throughput efficiency. It is structured in three parts: first, understanding the fundamental tradeoffs between different architecture classes; second, evaluating existing architecture candidates with the ThunderKittens programming library; and third, expanding the Pareto frontier with the BASED architecture, which has led to state-of-the-art Transformer-free language models developed on an academic budget. Arora, a PhD student advised by Chris Ré, blends AI and systems research to maximize AI capabilities while maintaining efficiency. Her work has been recognized at major conferences including NeurIPS, ICML, and ICLR.
Syllabus
Expanding the quality and efficiency frontier of AI—Simran Arora (Stanford)
Taught by
Paul G. Allen School