This course will discuss the major ideas used today in the implementation of programming language compilers, including lexical analysis, parsing, syntax-directed translation, abstract syntax trees, types and type checking, intermediate languages, dataflow analysis, program optimization, code generation, and runtime systems. As a result, you will learn how a program written in a high-level language designed for humans is systematically translated into a program written in low-level assembly more suited to machines. Along the way we will also touch on how programming languages are designed, programming language semantics, and why there are so many different kinds of programming languages.
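To give a rough feel for the translation described above, here is a minimal sketch that takes an arithmetic expression through lexical analysis, parsing into an abstract syntax tree, and code generation. It is not course material: the function names (`tokenize`, `parse`, `generate`, `compile_expr`) and the `push`/`add`/`mul` stack-machine instructions are invented for illustration and stand in for a real assembly language.

```python
import re


def tokenize(src):
    """Lexical analysis: split the source into number and operator tokens."""
    tokens = []
    for match in re.finditer(r"\d+|\S", src):
        text = match.group()
        if text.isdigit():
            tokens.append(("num", int(text)))
        elif text in "+*()":
            tokens.append((text, text))
        else:
            raise SyntaxError(f"unexpected character {text!r}")
    return tokens


def parse(tokens):
    """Parsing: build an abstract syntax tree by recursive descent.

    Grammar (hypothetical, for illustration only):
        expr   -> term ('+' term)*
        term   -> factor ('*' factor)*
        factor -> num | '(' expr ')'
    """
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def expr():
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    def term():
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def factor():
        if peek() == "num":
            return ("num", advance()[1])
        if peek() == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        raise SyntaxError("expected a number or '('")

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing input")
    return tree


def generate(node):
    """Code generation: emit instructions for an invented stack machine."""
    if node[0] == "num":
        return [f"push {node[1]}"]
    op, left, right = node
    return generate(left) + generate(right) + ["add" if op == "+" else "mul"]


def compile_expr(src):
    """Run the three phases in sequence."""
    return generate(parse(tokenize(src)))


if __name__ == "__main__":
    print("\n".join(compile_expr("1 + 2 * 3")))
```

The grammar gives `*` higher precedence than `+`, so the multiplication is emitted before the addition. A real compiler adds the phases this sketch skips, such as type checking, optimization, and register management.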
The course lectures will be presented in short videos. To help you master the material, there will be in-lecture questions to answer, quizzes, and two exams: a midterm and a final. There will also be homework in the form of exercises that ask you to show the sequence of logical steps needed to derive a specific result, such as the steps a type checker would perform to type check a piece of code, or the steps a parser would perform to parse an input string. The technology that checks these step-by-step answers is the result of ongoing research at Stanford into developing innovative tools for education, and we're excited to be the first course ever to make it available to students.
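As an illustration of the kind of step sequence a type-checking exercise might ask for, the sketch below checks a tiny expression language and records one line per rule applied. The tuple-based expression format, the rule set, and the names `show` and `type_check` are all assumptions made for this example; they are much simpler than COOL's actual type rules and involve no subtyping.

```python
# Hypothetical binary operators: name -> (symbol, operand type, result type)
BINARY = {
    "add": ("+", "Int", "Int"),
    "less": ("<", "Int", "Bool"),
}


def show(expr):
    """Render an expression tuple as readable source text."""
    kind = expr[0]
    if kind in ("int", "bool"):
        return str(expr[1]).lower()
    if kind in BINARY:
        return f"({show(expr[1])} {BINARY[kind][0]} {show(expr[2])})"
    if kind == "if":
        return f"if {show(expr[1])} then {show(expr[2])} else {show(expr[3])}"
    raise ValueError(f"unknown expression kind {kind!r}")


def type_check(expr, steps):
    """Return the type of expr, appending one derivation line per node to steps."""
    kind = expr[0]
    if kind == "int":
        result = "Int"
    elif kind == "bool":
        result = "Bool"
    elif kind in BINARY:
        symbol, operand, result = BINARY[kind]
        left = type_check(expr[1], steps)
        right = type_check(expr[2], steps)
        if (left, right) != (operand, operand):
            raise TypeError(f"'{symbol}' expects {operand} operands, got {left} and {right}")
    elif kind == "if":
        if type_check(expr[1], steps) != "Bool":
            raise TypeError("condition of 'if' must be Bool")
        result = type_check(expr[2], steps)
        # Simplification: both branches must have exactly the same type.
        if type_check(expr[3], steps) != result:
            raise TypeError("branches of 'if' must have the same type")
    else:
        raise ValueError(f"unknown expression kind {kind!r}")
    steps.append(f"{show(expr)} : {result}")
    return result


if __name__ == "__main__":
    example = ("if", ("less", ("int", 1), ("int", 2)),
               ("add", ("int", 3), ("int", 4)),
               ("int", 5))
    derivation = []
    type_check(example, derivation)
    print("\n".join(derivation))
```

Each subexpression is typed before the expression that contains it, so the recorded lines read bottom-up, in the same order a hand-written derivation would be built.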
An optional course project is to write a complete compiler for COOL, the Classroom Object Oriented Language. COOL has the essential features of a realistic programming language, but is small and simple enough that it can be implemented in a few thousand lines of code. Students who choose to do the project can implement it in either C++ or Java. I hope you enjoy the course!
Why Study Compilers?
Everything that computers do is the result of some program, and all of the millions of programs in the world are written in one of the many thousands of programming languages that have been developed over the last 60 years. Designing and implementing a programming language turns out to be difficult; some of the best minds in computer science have thought about the problems involved and contributed beautiful and deep results. Learning something about compilers will show you the interplay of theory and practice in computer science, especially how powerful general ideas combined with engineering insight can lead to practical solutions to very hard problems. Knowing how a compiler works will also make you a better programmer and increase your ability to learn new programming languages quickly.
Steven Frank completed this course, spending 15 hours a week on it, and found it very hard.
Stanford's "CS1 Compilers" is less a course than a way of life -- thank goodness it's self-paced! The experience is like being asked to translate a novel into a sequence of foreign languages you're learning for the first time. The amount of work is prodigious. But this course deserves its reputation as the best there is on the subject.
The instructor, Alex Aiken, is fantastic. His articulate, conversational style is easy to follow and he assumes a minimal CS background, just C and exposure to assembly language. The course material is broadly organized around four key components of compiler design: syntax, parsing, semantics and type checking, and code generation. So let's talk about the languages you'll learn.
The language for which you'll be writing a compiler is COOL, the Classroom Object-Oriented Language written by Prof. Aiken. There's a tutorial, a reference manual, and some example programs, but ultimately this is a new C++/Java-like object-oriented programming language you'll have to learn. Then we're on to the computer science.
Lexical analysis partitions a high-level program into its basic lexical components (keywords, numbers, etc.) using the languages of regular expressions and formal grammars. Regular expressions, in turn, are implemented computationally by finite automata, which have their own peculiar language. The first programming assignment, which has you writing a lexical analyzer, uses a framework called Flex or JLex, depending on whether you're coding the assignments in C++ or Java (the only allowed choices) -- so that's another language.
The next step, parsing, uses the language of context-free grammars, and here's where things get pretty theoretical -- think parse trees and Noam Chomsky. After you master the languages of top-down and bottom-up parsing, you can put them to work using a tool called Bison (for C++) or CUP (for Java) -- so that's yet another language.
Now we're into semantic analysis and type checking, which use the rigorous language of formal logic (think mathematical proofs) and abstract syntax trees. For the programming assignment, you'll write a semantic analyzer involving 1500-3000 lines of code.
Code generation, the final stage of the compiler you'll write, requires a fairly deep dive into assembly language (the MIPS language, in particular), and the lectures lead you through the complexities of register and stack management. In the programming assignment, you write a code generator (another 1500-3000 lines of C++ or Java) that creates MIPS assembly code for the abstract syntax tree generated by your semantic analyzer.
Now, all of these programming assignments are actually optional. The mandatory assignments are six quizzes, which are difficult but allow unlimited attempts, and a midterm and final, which are killers and allow only one attempt at each question with no partial credit. But you'd be crazy not to at least attempt the programming assignments if you want to learn about compilers. This course is supremely well-taught and well-organized, and -- again -- mercifully self-paced. Just know what you're getting into and buckle up.
Kartik Kukreja completed this course.
The course discusses the major ideas used today in the implementation of programming language compilers, including lexical analysis, parsing, syntax-directed translation, abstract syntax trees, types and type checking, intermediate languages, dataflow analysis, program optimization, code generation, and runtime systems. It teaches how a program written in a high-level language designed for humans is systematically translated into a program written in low-level assembly more suited to machines. It also discusses how programming languages are designed, programming language semantics, and why there are so many different kinds of programming languages.
Anonymous completed this course.
Excellent course, not for beginners though. Assignments are a lot of work but in the end you get a pretty good understanding of the whole compilation process, assembly, and executables. Prof. Aiken's teaching is outstanding. Thank you!