Overview
Explore a case study of Amazon's migration of core exabyte-scale data catalog management jobs from Spark to Ray. The talk covers the key milestones, challenges, and concessions of the migration, along with the future vision for incorporating Ray into critical batch and streaming business intelligence pipelines at Amazon. Learn about techniques for developing large-scale serverless Ray job management infrastructure, build and deploy management strategies, the risk mitigation approaches used to facilitate the migration, and the operational excellence methods employed to ensure a smooth production rollout. Gain insight into scaling AI workloads and managing distributed systems in a production environment. An accompanying slide deck provides visual references and additional details on the migration project.
Syllabus
From Spark to Ray: An Exabyte-Scale Production Migration Case Study
Taught by
Anyscale