
CNCF [Cloud Native Computing Foundation]

Envoy Proxy: Evolved for Serving LLMs

CNCF [Cloud Native Computing Foundation] via YouTube

Overview

This conference talk explores how Envoy Proxy has been enhanced to address the challenges of deploying and scaling Large Language Models (LLMs) in production, and how its latest features optimize LLM serving, improve performance, and simplify integration into Kubernetes-native architectures. The speakers, from Google, cover three areas: advanced load balancing for LLM inference that routes requests to optimize resource utilization and minimize latency; instrumenting Envoy for compatibility with popular LLM serving specifications such as the OpenAI API; and security considerations for LLMs, including how to attach AI safety frameworks in the Envoy data plane.

Syllabus

Envoy Proxy: Evolved for Serving LLMs - Vaibhav Katkade & Andres Guedez, Google

Taught by

CNCF [Cloud Native Computing Foundation]

