Building Service Ownership Using Documentation, Telemetry, and a Chance to Make Things Better

USENIX via YouTube

Overview

This course aims to teach learners how to build scalable and distributed people systems for operating software at scale. The learning outcomes include understanding the importance of documentation, telemetry, and clear objectives in building service ownership. The course covers topics such as distributed tracing, centralized documentation, improving on-call processes, setting SLOs, and empowering teams to make improvements. The teaching method involves a lecture format with real-world examples and practical strategies. This course is intended for software engineers, DevOps professionals, SREs, and anyone involved in building and operating distributed software systems.

Syllabus

Intro
Service ownership, defined
Obstacles to successful service ownership
Distributed tracing, defined
Relationships matter
Traces = raw material, not finished product
Centralized documentation
Why is documentation important?
Iterating toward ownership
More context -- mitigating factor
Dynamic alert delivery
Handling alerts
Improving postmortems
Postmortems are documentation
Why is improving oncall important?
Determining SLOs
Derive internal SLOs using tracing
Why are SLOs important?
3-piece puzzle review
Making changes
Ownership = Accountability + Agency

Taught by

USENIX
