Distributed System Design for Data Engineering

Overview

This course provides an understanding of distributed system design principles and their applications in data engineering. By the end of the course, learners will be able to design robust, scalable data systems by applying concepts such as data partitioning, replication, fault tolerance, and data consistency. The course teaches through real-world examples and discussions of best practices and trade-offs in system design. It is intended for anyone interested in modern data engineering challenges who wants to strengthen their skills in building large-scale data systems.

Syllabus

– Introduction
– What is a Distributed System
– Key Concepts in Distributed Systems
– Fault Tolerance
– Replication
– Synchronous vs. Asynchronous Replication
– Replication Models
– Quorums