Building an Auto-scaling AI Inference Service - From Setup to Deployment

Trelis Research via YouTube

Classroom Contents

  1. Introduction to AI Inference Scaling
  2. Video Agenda Overview
  3. Different Inference Approaches
  4. Understanding GPU Utilization
  5. Setting Up One-Click Templates
  6. Docker Image Configuration
  7. Building Auto-Scaling Service
  8. Model Configuration Settings
  9. Load Testing and Metrics
  10. Scaling Manager Implementation
  11. Setting Up API Endpoint
  12. Conclusion and Future Topics
