I recently finished the Coursera course Design and Analysis of Algorithms I, given by Professor Tim Roughgarden of Stanford. This was my second online course from Coursera after taking Introduction to Databases last fall (which I wrote about here), and I thought it would be interesting to compare the two.
When I went to university (M.Sc. in Computer Science and Engineering), I took both an algorithm and data structures course, so a lot of the material wasn’t foreign to me. However, that was over 20 years ago, so I thought this would be a good refresher course.
There were several topics that were new, in particular some of the graph algorithms, such as Karger’s min-cut algorithm.
The course lasted 5 weeks and consisted of video lectures, multiple-choice quizzes and programming problems. The topics covered were:
• Merge Sort
• Big-Oh notation
• Algorithm for counting inversions
• Strassen’s subcubic matrix multiplication
• The Master Method
• Randomized selection
• Graphs and minimum cuts
• Graph search – breadth first and depth first
• Topological sort
• Graphs – strongly connected components
• Dijkstra’s shortest path algorithm
• Data structures: heaps
• Data structures: hash tables
The video lectures took about 2 hours per week, and were very easy to follow. Professor Roughgarden is quite good at explaining the different concepts and algorithms. My only wish is that I had the option of reading the material (as presented in a text book) instead of watching it. The slides used in the videos are available for download, but they don’t have enough information to be read on their own.
For me, it is faster to read, than to listen to someone explain something. It is easier to skim things I already know, and go back and forth in the text when something isn’t clear. I like the visual structure of the material on the pages, which is something I don’t get from a video. The main reason is that it would be faster. Taking the course is quite a large time commitment, especially when you work full time and have a family. If things can be sped up it’s a big plus.
This is of course is a matter of personal taste, and I suspect many people prefer having the material presented in lecture format. It is possible to speed up the videos (1.25 or 1.5 times the normal speed), but for me it still does not beat reading from a book.
The topics I already knew quite well were merge sort, big-oh notation, quicksort, and in particular hash tables, as I use them almost daily when coding. I thought all of these topics were well presented.
I had never heard about the master method before, but found both the formula for finding the big-oh performance of recursive algorithms and the derivation of it quite interesting. I also liked the graph algorithms, in particular, the algorithm for finding strongly connected components in a directed graph. The algorithm uses the neat trick of reversing all the edges as one of its steps. The lecture on heaps was interesting, as I studied heaps in my class at university, but had completely forgotten about them.
I had never heard about the master method before, but found both the formula for finding the big-oh performance of recursive algorithms and the derivation of it quite interesting.
There were five multiple choice quizzes, with 26 questions in total. You were only allowed two tries per quiz with your best attempt counting towards your total score. This is in contrast to the quizzes in the Introduction to Databases course, where the limit was set to 100, and the answer options were randomly shuffled for each new attempt. The questions were quite challenging, and even though I got them all correct in the end, I used my two attempts on all of them. When comparing the two approaches, I actually prefer the unlimited attempts of the Introduction to Databases course. You are motivated to keep trying, even if you don’t get them all right on the first two tries.
There were five programming assignments, which were all great for really understanding the different algorithms. I used Java, but you could use any language you want.
The first assignment dealt with inversions in an integer array. An inversion is a pair of two array indices, i and j with i<j, such that the value in the array at position i is greater than the value in the array at position j. If the array is sorted there are no inversions, and for all other cases there is at least one inversion. The problem consisted of an array of size 100,000, with all the numbers from 1 to 100,000 appearing exactly once, and the program had to calculate how many inversions the given array contained. This problem could be solved with an algorithm closely related to merge sort.
For the second programming problem, you had to implement quicksort, and use it to sort an integer array of size 10,000. You had to answer how many comparisons were used when sorting the array. The number of comparisons depends on the pivot elements used, and there were three cases to investigate: 1) using the first element of the sub-array as the pivot 2) using the last element of the sub-array as the pivot 3) using the median of the first, last and middle element of the sub-array as the pivot. Not surprisingly, using the median element resulted in the fewest number of comparisons.
In the third problem, you implemented the randomized contraction algorithm to find the minimum cut of a 40-node undirected graph. This was an interesting problem where you had to implement “folding” of two nodes into one, and handle the resulting change in edge structure. Unfortunately, unlike the previous two problems, the range of possible answers was very low, making it possible to guess the answer since eight attempts were allowed.
The fourth was by far the most difficult problem, judging from the discussions in the forum. You had to find the sizes of the five largest strongly connected components of a directed graph consisting of almost one million nodes. A strongly connected component is a group of nodes where there is a path from each node in the component to all the other nodes in that component. It has a nice recursive solution, but a lot of people had all sorts of issues due to the size of the graph, including speed problems and stack overflows.
The last programming assignment made use of a hash table (which you didn’t have to implement yourself) to see if nine given target sums could be found by adding two numbers from a list of 100,000 integers. Quite a neat use of a hash table, and also the easiest of the five programming problems.
As with the Introduction to Databases course, Design and Analysis of Algorithms I consisted of three components: 1) video lectures, 2) multiple-choice quizzes and 3) programming assignments. It’s a good model that makes sure you learn the material well. Another aspect of these online courses is that they follow a schedule. New videos are added weekly, with quizzes and assignments due each week. That system works really well to keep you going. The main difference with Introduction to Databases, was the length. Design and Analysis was considerably shorter. For me, that was actually better, because you need to put in quite a few hours of work each week, and a shorter course is more manageable.
Even though the course required quite a bit of work, I didn’t find it particularly difficult and ended up getting full marks on it. This is mostly because the teaching and exercises are first rate. I learned a lot, both from this course and from Introduction to Databases course, and highly recommend both of them.