A good algorithm usually comes together with a set of good data structures that allow the algorithm to manipulate the data efficiently. In this course, we consider the common data structures that are used in various computational problems. You will learn how these data structures are implemented in different programming languages and will practice implementing them in our programming assignments. This will help you to understand what is going on inside a particular built-in implementation of a data structure and what to expect from it. You will also learn typical use cases for these data structures.
A few examples of questions that we are going to cover in this class are the following:
1. What is a good strategy for resizing a dynamic array?
2. How are priority queues implemented in C++, Java, and Python?
3. How can a hash table be implemented so that the amortized running time of all operations is O(1) on average?
4. What are good strategies to keep a binary tree balanced?
You will also learn how services like Dropbox manage to upload some large files instantly and to save a lot of storage space!
Basic Data Structures
In this module, you will learn about the basic data structures used throughout the rest of this course. We start this module by looking in detail at the fundamental building blocks: arrays and linked lists. From there, we build up two important data structures: stacks and queues. Next, we look at trees: examples of how they’re used in Computer Science, how they’re implemented, and the various ways they can be traversed. Once you’ve completed this module, you will be able to implement any of these data structures, and you will have a solid understanding of the costs of their operations and the tradeoffs involved in using each one.
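As a small illustration of how these building blocks fit together, the sketch below traverses a binary tree in two ways: depth-first (in-order) with an explicit stack, and breadth-first (level by level) with a queue. The names Node, in_order, and level_order are hypothetical and chosen only for this example; they are not taken from the course materials.

```python
from collections import deque


class Node:
    """A binary tree node (hypothetical helper for this sketch)."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def in_order(root):
    """Depth-first in-order traversal driven by an explicit stack."""
    result, stack, node = [], [], root
    while stack or node:
        while node:              # walk left as far as possible
            stack.append(node)
            node = node.left
        node = stack.pop()       # most recently pushed unvisited node
        result.append(node.key)
        node = node.right        # then explore its right subtree
    return result


def level_order(root):
    """Breadth-first traversal driven by a queue."""
    result = []
    queue = deque([root] if root else [])
    while queue:
        node = queue.popleft()
        result.append(node.key)
        for child in (node.left, node.right):
            if child:
                queue.append(child)
    return result


tree = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
print(in_order(tree))     # [1, 2, 3, 4, 5, 6, 7]
print(level_order(tree))  # [4, 2, 6, 1, 3, 5, 7]
```

The stack gives last-in, first-out order, which matches the way recursion would unwind; the queue gives first-in, first-out order, which visits nodes level by level.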
Dynamic Arrays and Amortized Analysis
In this module, we discuss dynamic arrays: a way of using arrays when it is unknown ahead of time how many elements will be needed. We also discuss amortized analysis: a method of determining the cost of an operation averaged over a sequence of operations. Amortized analysis is often used when a straightforward worst-case analysis of each individual operation gives an unsatisfactory bound, even though the algorithm is actually efficient over the whole sequence. It is used here to analyze dynamic arrays and will be used again at the end of this course to analyze splay trees.
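A minimal sketch of the idea, assuming the common strategy of doubling the capacity whenever the array is full. The class name DynamicArray and the copies counter are hypothetical and exist only to make the amortized cost visible; Python lists already resize themselves internally, so this is purely illustrative.

```python
class DynamicArray:
    """Array that doubles its capacity when full (illustrative sketch)."""

    def __init__(self):
        self.size = 0
        self.capacity = 1
        self.data = [None] * self.capacity
        self.copies = 0  # total elements copied during all resizes

    def push_back(self, value):
        if self.size == self.capacity:
            self._resize(2 * self.capacity)
        self.data[self.size] = value
        self.size += 1

    def _resize(self, new_capacity):
        new_data = [None] * new_capacity
        for i in range(self.size):   # copy every existing element
            new_data[i] = self.data[i]
            self.copies += 1
        self.data = new_data
        self.capacity = new_capacity

    def get(self, i):
        if not 0 <= i < self.size:
            raise IndexError("index out of range")
        return self.data[i]


arr = DynamicArray()
n = 1000
for i in range(n):
    arr.push_back(i)
print(arr.capacity, arr.copies)  # 1024 1023
```

A single push_back can cost O(n) when it triggers a resize, but the total copying work over n pushes is 1 + 2 + 4 + ... < 2n, so the amortized cost per push is O(1).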
Priority Queues and Disjoint Sets
We start this module by considering priority queues. They are used to efficiently schedule jobs, either in the context of a computer operating system or in real life; to sort huge files, which is the most important building block for any Big Data processing algorithm; and to efficiently compute shortest paths in graphs, which is a topic we will cover in our next course. For this reason, priority queues have built-in implementations in many programming languages, including C++, Java, and Python. We will see that these implementations are based on the beautiful idea of storing a complete binary tree in an array, which allows all priority queue methods to be implemented in just a few lines of code. We will then switch to the disjoint sets data structure, which is used, for example, in dynamic graph connectivity and image processing. We will see again how simple and natural ideas lead to an implementation that is both easy to code and very efficient. By completing this module, you will be able to implement both of these data structures efficiently from scratch.
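The array-based idea can be sketched as follows: the children of the element at index i live at indices 2i + 1 and 2i + 2, and its parent lives at index (i - 1) // 2, so no pointers are needed. The class MinHeap and its method names are hypothetical and used only for illustration; Python's built-in heapq module offers the same operations on a plain list.

```python
class MinHeap:
    """Binary min-heap stored in a list (illustrative sketch)."""

    def __init__(self):
        self.items = []

    def insert(self, priority):
        self.items.append(priority)            # add at the last leaf
        self._sift_up(len(self.items) - 1)

    def extract_min(self):
        if not self.items:
            raise IndexError("extract_min from an empty heap")
        items = self.items
        items[0], items[-1] = items[-1], items[0]  # move root to the end
        smallest = items.pop()
        if items:
            self._sift_down(0)
        return smallest

    def _sift_up(self, i):
        items = self.items
        while i > 0:
            parent = (i - 1) // 2
            if items[parent] <= items[i]:
                break
            items[parent], items[i] = items[i], items[parent]
            i = parent

    def _sift_down(self, i):
        items = self.items
        n = len(items)
        while True:
            smallest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and items[child] < items[smallest]:
                    smallest = child
            if smallest == i:
                break
            items[i], items[smallest] = items[smallest], items[i]
            i = smallest


heap = MinHeap()
for p in [5, 3, 8, 1, 9, 2]:
    heap.insert(p)
drained = [heap.extract_min() for _ in range(6)]
print(drained)  # [1, 2, 3, 5, 8, 9]
```

Both insert and extract_min move an element along a single root-to-leaf path, so each takes O(log n) time on a heap of n elements.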
Hash Tables
In this module you will learn about a very powerful and widely used technique called hashing. Its applications include the implementation of programming languages, file systems, pattern search, distributed key-value storage, and many more. You will learn how to implement data structures that store and modify sets of objects and mappings from one type of object to another. You will see that naive implementations either consume a huge amount of memory or are slow, and then you will learn to implement hash tables that use linear memory and work in O(1) on average! At the end, you will learn how hash functions are used in modern distributed systems and how they are used to optimize storage in services like Dropbox, Google Drive and Yandex Disk!
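One common way to get linear memory and O(1) average-time operations is chaining: each bucket holds a short list of key-value pairs, and the table doubles its number of buckets when it gets too full. The sketch below assumes that approach and relies on Python's built-in hash function; the class ChainedHashMap and its method names are hypothetical, and the course may use different hash functions and resizing thresholds.

```python
class ChainedHashMap:
    """Hash table with chaining and doubling (illustrative sketch)."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]
        self.count = 0

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:                 # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))
        self.count += 1
        if self.count > len(self.buckets):   # load factor above 1
            self._rehash(2 * len(self.buckets))

    def get(self, key, default=None):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return default

    def remove(self, key):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                del bucket[i]
                self.count -= 1
                return True
        return False

    def _rehash(self, new_size):
        old_items = [pair for bucket in self.buckets for pair in bucket]
        self.buckets = [[] for _ in range(new_size)]
        for k, v in old_items:
            self._bucket(k).append((k, v))


table = ChainedHashMap()
for i in range(100):
    table.put(f"key{i}", i)
table.put("key7", 700)       # overwrites the earlier value
print(table.get("key7"))     # 700
print(table.get("missing"))  # None
```

Keeping the load factor at most 1 means the average chain length stays constant, and the occasional rehash is paid for in the same amortized way as the resizing of a dynamic array.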
Binary Search Trees
In this module we study binary search trees, which are a data structure for doing searches on dynamically changing ordered sets. You will learn about many of the difficulties in accomplishing this task and the ways in which we can overcome them. In order to do this you will need to learn the basic structure of binary search trees, how to insert and delete without destroying this structure, and how to ensure that the tree remains balanced.
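The basic structure can be sketched as follows: every key in a node's left subtree is smaller than the node's key, and every key in its right subtree is larger. The sketch below covers search, insertion, and deletion while preserving that property. It deliberately omits balancing, so a sorted insertion order would still produce a tree of linear height. The names BSTNode, insert, contains, delete, and keys_in_order are hypothetical and used only for this example.

```python
class BSTNode:
    """Node of an unbalanced binary search tree (illustrative sketch)."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(node, key):
    """Insert key into the subtree rooted at node; return the new root."""
    if node is None:
        return BSTNode(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    return node                      # duplicates are ignored


def contains(node, key):
    while node is not None:
        if key == node.key:
            return True
        node = node.left if key < node.key else node.right
    return False


def delete(node, key):
    """Delete key from the subtree rooted at node; return the new root."""
    if node is None:
        return None
    if key < node.key:
        node.left = delete(node.left, key)
    elif key > node.key:
        node.right = delete(node.right, key)
    else:
        if node.left is None:        # zero or one child: splice out
            return node.right
        if node.right is None:
            return node.left
        successor = node.right       # two children: use the next key
        while successor.left is not None:
            successor = successor.left
        node.key = successor.key
        node.right = delete(node.right, successor.key)
    return node


def keys_in_order(node):
    if node is None:
        return []
    return keys_in_order(node.left) + [node.key] + keys_in_order(node.right)


root = None
for k in [50, 30, 70, 20, 40, 60, 80]:
    root = insert(root, k)
root = delete(root, 30)
print(keys_in_order(root))  # [20, 40, 50, 60, 70, 80]
```

Each operation follows a single path from the root, so its running time is proportional to the height of the tree, which is why keeping the tree balanced matters.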
Binary Search Trees 2
In this module we continue studying binary search trees. We study a few non-trivial applications. We then study a new kind of balanced search tree: splay trees. They adapt to the queries dynamically and are optimal in many ways.
Pavel Pevzner, Alexander S. Kulikov and Michael Levin
This course really helped me understand the various data structures available when writing code and the way to select the best data structure for any particular task. Prior to taking this course, I only knew of arrays, lists and objects. Now, I have learnt about trees, heaps, stacks, queues, etc. The assignments too were really challenging, prompting me to do a lot of personal research to pass, which just made the whole structure perfect. I recommend this course for anyone trying to understand what data structures are all about and the right applications of various data structures. Thank you so much for this amazing body of work.
Anonymous completed this course.
The quality of the material and the level of challenges proposed in this series organized by the University of California San Diego are really surprising. In this module, each week details different data structures and the computational complexity of their operations. Then the concepts are verified in several algorithmic challenges, nothing trivial, automatically evaluated on the course's own platform through a battery of black box tests.
Just a few striking points:
* Very interesting to understand amortized analysis of dynamic arrays and hash tables and how, even with restructuring, the overhead stays negligible because the additional cost is spread across consecutive operations.
* Hash families and the guarantee of low collision, use of hashing in textual search (Rabin-Karp) and distributed hash table applications in cloud storage (Dropbox) and Big Data.
* Manipulation of large strings using Rope structures based on Splay Tree and how this is applicable in text editors.
* Hitting stack overflows even in delete/destroy functions, due to the large amount of data, and realizing that any recursion can be restructured in terms of a stack.
Once again I give five stars and strongly recommend.
Ivan Vyshnevskyi completed this course, spending 7 hours a week on it and found the course difficulty to be medium.
Really enjoyed the course! Starts from the basic structures: arrays, linked lists, trees, etc., and then goes to more advanced ones: priority queues, hash tables and balanced binary search trees (in particular AVL and splay „flavours“). About „desperately hard“ last week assignment: I don't agree with this comment. The last two weeks are all about balanced BSTs and there's only one assignment (with three problems) covering them which has stated expected time for completion 25 hours. But I believe it's this high only because of the advanced problem that really is much harder than the other two, but it's optional, so you can skip it. Anyway, I don't think one hard assignment in the otherwise great course warrants the 1-star review.
This is a wonderful course for learning about fundamental data structures and their super interesting applications. It was a delightful experience for me, personally. The only frustrating thing was Week 6's programming assignment, where the advanced problems were hard, and so I think there should've been more elaborate coverage of splay trees so that students could understand them better. Everything else was awesome!
Anonymous completed this course.
Very nice course, a lot of new and important information for me. I like programming assignments because they help to deeply understand the material.
Basically, the programming assignments are the best thing in this course and specialization!
Excellent course, I learnt a lot.
The programming assignments are really challenging, so expect the course to take way longer than the estimated time.
If you still decide to go through with it, the lectures are great, and the subjects are really interesting.
I'm amazed at the solutions past programmers have found to such complex problems as the ones presented throughout the course.