Class Central is learner-supported. When you buy through links on our site, we may earn an affiliate commission.

EIT Digital

I/O-efficient algorithms

EIT Digital via Coursera

Overview

Prepare for a new career with $100 off Coursera Plus
Gear up for jobs in high-demand fields: data analytics, digital marketing, and more.
I/O-efficient algorithms, also known as external memory algorithms or cache-oblivious algorithms, are a class of algorithms designed to efficiently process data that is too large to fit entirely in the main memory (RAM) of a computer. These algorithms are particularly useful when dealing with massive datasets, such as those found in large-scale data processing, database management, and file systems.

Operations on data become more expensive when the data item is located higher in the memory hierarchy. An operation on data in CPU registers is roughly a million times faster than an operation on a data item that is located in external memory that needs to be fetched first. These data fetches are also called I/O operations and need to be taken into account during the design of an algorithm. The goal of this course is to become familiar with important algorithmic concepts and techniques needed to effectively deal with such problems. We will work with a simplified memory hierarchy, but the notions extend naturally to more realistic models.

Prerequisites:
In order to successfully take this course, you should already have a basic knowledge of algorithms and mathematics. Here's a short list of what you are supposed to know:
- O-notation, Ω-notation, Θ-notation; how to analyze algorithms
- Basic calculus: manipulating summations, solving recurrences, working with logarithms, etc.
- Basic probability theory: events, probability distributions, random variables, expected values etc.
- Basic data structures: linked lists, stacks, queues, heaps
- (Balanced) binary search trees
- Basic sorting algorithms, for example MergeSort, InsertionSort, QuickSort
- Graph terminology, representations of graphs (adjacency lists and adjacency matrix), basic graph algorithms (BFS, DFS, topological sort, shortest paths)

The material for this course is based on the course notes that can be found under the resources tab. We will not cover everything from the course notes. The course notes are there both for students who did not fully understand the lectures as well as for students who would like to dive deeper into the topics.

The video lectures contain a few very minor mistakes. A list of these mistakes can be found under resources. If you think you found an error, report a problem by clicking the square flag at the bottom of the lecture or quiz where you found the error.

Syllabus

  • Introduction
    • In this module we give an introduction to the course I/O-efficient algorithms. We discuss the so-called I/O-model, which consists of an internal memory of limited size, an external memory of unlimited size and where data transfer between these two happens in blocks of a given size. We give a simple example showing that the actual running time of an algorithm working on data in external memory is greatly influenced by its I/O-behavior. Finally, we discuss the basics of analyzing algorithms in the I/O-model.
  • Designing cache-aware and cache-oblivious algorithms
    • In this module we discuss two techniques to design I/O-efficient algorithms, using the matrix-transposition problem as a running example. The first technique is a "tile-based" approach and leads to a cache-aware algorithm. The second technique uses a recursive approach and leads to a cache-oblivious algorithm.
  • Replacement Policies
    • When we want to read something from external memory while the internal memory is full we need to make room by evicting a block from internal memory. The block which should be evicted is decided by the replacement policy. In this module we introduce LRU and some other some well-known replacement policies, and investigate the I/O-efficiency of LRU compared to an optimal replacement policy.
  • I/O-efficient sorting
    • In this module we analyze the I/O-efficiency of MergeSort and discuss how to adapt it to make it more I/O-efficient.
  • I/O-efficient data structures
    • In this module we introduce some I/O-efficient data structures: B-trees and buffer trees, and an I/O-efficient priority queue based on buffer trees.
  • Time-Forward Processing
    • In this module we discuss time-forward processing, a technique that can be used to evaluate so-called local functions on a directed acyclic graph.

Taught by

Mark de Berg

Reviews

4.6 rating at Coursera based on 59 ratings

Start your review of I/O-efficient algorithms

Never Stop Learning.

Get personalized course recommendations, track subjects and courses with reminders, and more.

Someone learning on their laptop while sitting on the floor.