Google ML and AI: What is Context Caching - Use in Vertex AI
The Machine Learning Engineer via YouTube
Overview
Learn what Context Caching is in this 20-minute tutorial that demonstrates how developers can store frequently used input tokens in a dedicated cache. Discover how this feature reduces the number of tokens sent to models, resulting in lower costs and faster request processing by eliminating the need to repeatedly process the same content. Follow along with a practical example showing how to implement Context Caching with PDF documents stored in Google Cloud Storage buckets and integrated with the Gemini model, comparing response times with and without caching enabled. Note that the notebook and code examples are available only to paying subscribers by contacting [email protected].
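As a rough illustration of the workflow the tutorial describes, the sketch below uses the Vertex AI Python SDK's `vertexai.preview.caching` module to cache a PDF from a Cloud Storage bucket and query Gemini against it. The project ID, bucket path, and model version are placeholders, and running it requires GCP credentials; the tutorial's own notebook may differ.

```python
import datetime

import vertexai
from vertexai.preview import caching
from vertexai.preview.generative_models import GenerativeModel, Part

# Placeholder project and region -- replace with your own.
vertexai.init(project="your-project-id", location="us-central1")

# Create a cache holding the PDF's tokens; the model processes the
# document once, and later requests reuse the cached tokens.
cached_content = caching.CachedContent.create(
    model_name="gemini-1.5-pro-001",  # caching requires a pinned model version
    contents=[
        Part.from_uri(
            "gs://your-bucket/your-document.pdf",  # placeholder GCS path
            mime_type="application/pdf",
        )
    ],
    ttl=datetime.timedelta(minutes=60),  # cache expires after one hour
)

# Build a model bound to the cache, so prompts are answered against
# the cached PDF without resending its tokens.
model = GenerativeModel.from_cached_content(cached_content=cached_content)
response = model.generate_content("Summarize the main points of this document.")
print(response.text)
```

Because the cached tokens are billed at a reduced rate and skip reprocessing, repeated questions against the same document are both cheaper and faster than sending the PDF with every request, which is the comparison the video walks through.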
Syllabus
Google ML and AI: What is Context Caching - Use in Vertex AI #datascience #machinelearning
Taught by
The Machine Learning Engineer