
Overview

In this conference talk, Dmitry Zhuk of Twitter explains how to improve Mesos cluster utilization for non-production jobs. Idle tasks hold on to cluster capacity they do not use; Zhuk describes an approach that detects these idle resources and reallocates them through the Mesos oversubscription model. He also covers techniques for isolating non-production jobs from production workloads, the issues with Mesos' oversubscription approach, and potential solutions and workarounds. The 41-minute presentation covers cluster workload analysis, idle task examples, CPU utilization in non-production environments, idleness detection, estimated savings, oversubscribable resources, resource estimation, QoS controllers and correction strategies, and the isolation and scale issues that arise in large clusters.
Syllabus
Intro
Cluster workload
Idle task example
Non-production CPU utilization
Solution
Constraints
Oversubscription in Mesos
Resource Estimator
QoS Controller
Idle resources oversubscription
Idleness detection
Estimated savings
Oversubscribable resources
QoS correction strategy
Recover: idleness preservation
Reconstruction - features
Workaround
Implementation details
Configuration
Isolation issues
Scale issues
Current state
Taught by
Linux Foundation