Felienne Hermans is an associate professor at the Leiden Institute of Advanced Computer Science, where she leads a research group focused on programming education: how to teach and learn programming effectively. She distills the topic in her book, The Programmer’s Brain.
In my experience as a computer science grad, programming education emphasizes writing code rather than reading it. And at first, that seems to make sense. After all, if we read code, it’s typically in order to write code afterward: reading code isn’t the end goal.
That said, when you view code reading as a skill in and of itself, and when you consider it in light of certain concepts from cognitive theory, you can discern ways to get better at it: there are techniques to read code more effectively. And that’s precisely the topic of Felienne’s talk.
The Challenges of Reading Code
Felienne explains that in the programming community, there’s a tendency to encourage novices to skip ahead and jump straight to coding, learning as they go (for instance, by building an app), which turns learning to program into a byproduct of another goal.
This approach has its benefits. For example, it can be motivating to work on your app idea early on. But it can also have the opposite effect: if you keep hitting roadblocks because you’re trying to run before you can walk, you’ll soon run out of steam. You need some fundamentals first.
Similarly, reading someone else’s code can quickly become overwhelming — most notably, for novice programmers. However, here too, some fundamental concepts and techniques can help considerably. Felienne outlines three issues novice programmers face and explains how to overcome them.
“Just Google it…” If you’ve been coding for any length of time, this is something you’ve likely heard before. And while going back to Google and Stack Overflow to refresh your memory is something all programmers do, over-reliance on these tools can become a crutch.
Some aspects of programming are worth committing to long-term memory, just like the meaning of basic words is worth memorizing when learning a foreign language: otherwise, you’ll keep having to go back to the dictionary, considerably slowing you down.
When trying to decipher a line of code, by the time you reach the end, you might have forgotten how the line began. This doesn’t just happen when reading code; it can happen when reading a novel, for instance. But code can be very compact and abstract: a lot of complex logic can be “said” in a few lines. So reading code is an activity that can push our short-term memory to its limits. And this is where chunking comes into play.
When you start learning to read, you typically go bottom up. First, you’re taught the alphabet. Then, you’re taught how to combine letters into words. Initially, you may think of words as collections of letters: going over the letters one by one, you can turn them into syllables, and then, turn the syllables into the words. And when you’re starting to read, deciphering words can be a slow process. But you rapidly get faster by seeing words as single “chunks” rather than as collections of individual letters.
This process is called chunking, and it doesn’t apply just to letters and words. You can chunk many concepts, including coding concepts, such as constructs particular to a certain programming language. And once you’ve chunked the fundamental constructs of a language, you’re much faster at reading it.
Chunks have another benefit: if you face an unfamiliar construct, you may be able to rewrite it in a way that is more familiar and understandable to you by leveraging your prior knowledge — that is, your library of chunks. This is what Felienne calls cognitive refactoring.
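To make this concrete, here is a small sketch in Python. It is my own illustration, not an example from the talk, and the function names are hypothetical. It assumes a reader who has not yet chunked list comprehensions but is comfortable with plain loops and conditionals: the second function is the first one rewritten using only those more familiar constructs.

```python
# Hypothetical example: both functions are invented for illustration
# and are not taken from the talk.

def squares_of_evens_compact(numbers):
    # Compact form: a list comprehension with a filter condition.
    return [n * n for n in numbers if n % 2 == 0]


def squares_of_evens_expanded(numbers):
    # Same behavior, rewritten with a plain loop and an if statement,
    # constructs a beginner is more likely to have chunked already.
    result = []
    for n in numbers:
        if n % 2 == 0:
            result.append(n * n)
    return result


# Both versions return the same list for the same input.
print(squares_of_evens_compact([1, 2, 3, 4]))
print(squares_of_evens_expanded([1, 2, 3, 4]))
```

The expanded version is longer, but each line maps onto a single familiar idea, which is the point of the rewrite. Once the comprehension itself becomes a chunk, the compact form reads just as easily.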
Finally, even when you understand the syntax of a programming language, another common issue is getting overwhelmed by the sheer amount of information. This is especially true when you’re trying to trace code execution in your head: keeping track of variables and their values as code executes. Loops, in particular, can be tricky to handle, since they can quickly overload our working memory.
In those cases, a good approach is to use pen and paper. Write down the values of variables, and write down how these values are updated as the code executes. In cognitive theory, this is called distributed cognition: in this example, we’re using the paper to extend our working memory and support our reasoning.
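The pen-and-paper table can be mimicked in code, which may help show what to write down. The sketch below is my own illustration rather than something from the talk, and the function name is hypothetical. It sums the digits of a number and records the value of each variable after every loop iteration, which is the same table you would otherwise fill in by hand.

```python
# Hypothetical example: a loop that records its own "paper trace".

def sum_of_digits_with_trace(n):
    """Sum the decimal digits of a non-negative integer.

    Returns the sum together with a list of (n, total) pairs:
    one pair for the starting state and one after each iteration.
    """
    total = 0
    trace = [(n, total)]          # state before the loop starts
    while n > 0:
        total += n % 10           # add the last digit
        n //= 10                  # drop the last digit
        trace.append((n, total))  # write down the new state
    return total, trace


total, trace = sum_of_digits_with_trace(472)
for step, (n, running_total) in enumerate(trace):
    print(f"step {step}: n={n}, total={running_total}")
```

For the input 472, the trace has four rows: the starting state and one row per digit. On paper, you would draw one column per variable and add a row each time the loop body finishes.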
As you apply these techniques, the language’s idiosyncrasies will become increasingly familiar to you, and you’ll discern recurring patterns that you can lean on to further your understanding. Gradually, reading code will become easier.
To learn more about the subject, I encourage you to watch Felienne’s excellent talk below.