Compilers and nuclear reactors

This summer I’m doing an internship at a small, research-y software shop called GrammaTech. They do a lot of research-based projects and have a static analysis tool called CodeSonar that is really quite nifty. I’ve been poking around the CodeSonar codebase as part of my work and I’m pretty impressed by the kind of things they do. Most of it is in C and I can easily say I’ve never seen C written this way before. They leverage C’s close-to-the-metal nature and the preprocessor to create some pretty powerful abstractions. It makes me painfully aware of how much I still have to learn. While marveling at what a deep knowledge of C can do I came across “Moron why C is not Assembly” by James Iry – an explanation of why C is not just syntactic sugar for assembly language.

As I learn more about programming in general and programming languages in particular, I’m starting to think that there is something of a Law of Conservation of Power, vaguely akin to the conservation of matter and energy (but not quite as absolute). For example, Iry talks about how C enforces the stack abstraction and hides any parallelization in the hardware (or in the generated code). By moving from the assembly world to the C world you’re trading one form of power for another – you obey the constraints of the stack and function calls get much simpler. But you lose the ability to fine tune using dedicated hardware instructions.

This is true as you continue exploring more languages – give up the looseness of C’s weak type system for something stricter and stronger (ML with it’s algebraic datatypes for example) and you can have the machine enforce invariants and guarantee certain properties of your code. You can perform easier and more elegant matches and actions on the type of your data. But you give up the flexibility that comes of raw data and fundamentally unstructured bits. If you choose strict immutability and pure functions (like Haskell or Clojure) you get to interact with your datatypes in a more mathematically precise form, you get to reap the benefits of concurrency without worrying about data corruption (to some extent). But you lose the ability to quickly stash some metadata into the corner of some variable data structure and pull it out at will.

If we start viewing different languages as tradeoffs in the power available to the programmer then a compiler becomes a very special tool – a power transformer, akin to a nuclear reactor (or a Hawking’s Knot if you don’t mind some futurism). A compiler, at its core, takes a stream of symbols (according to predefined syntactical rules) and transforms them into another stream of symbols (according to predefined semantic rules). Along the way, the compiler is responsible for enforcing the power transforms – ensuring that your C code doesn’t get to the underlying stack, that your Haskell code obeys the type constraints. If our languages are tools with which we build our universes, our compilers enforce the laws of physics (whatever laws they may be). The reason we’re computer scientists and not physicists is that we can create and change these laws at whim, instead of only studying and exploring them.

Just as we don’t want to be putting nuclear reactors in SUV’s there isn’t a “best” power tradeoff. Do you need to be really close to the metal on a resource constrained embedded system? Use C. Do you need a guarantee that a successful compilation will rule out a certain class of runtime errors? Use ML. If you need a fuel-efficient vehicle to take you to work everyday, get a Prius. If you need a self sufficient, water-borne weapons platform that only needs to be refueled every few decades and can rain down vengeance on your enemies should the time come, then invest in an aircraft carrier or a nuclear submarine. Don’t bring a knife to a gunfight.

There are two big elephants in the room: Turing-completeness and Lisp. All Turing complete languages are strictly equivalent in their computational power, but that misses the point of this discussion. Most programmers are not writing code for machines at all: they are writing programs for programmers (including themselves) to read, understand and use (and only incidentally for machines to execute; thank you, SICP). When you change the rules of the game to be not strict computational power but expressiveness and understandability to another human, this Conservation of Power thing becomes much more important. Choosing the correct set of tradeoffs and balances (and hence the correct language and related toolsets) becomes one that has far reaching impact on your team and project. Make the wrong choice and the Turing tarpit will swallow you alive and surrender your fossilized remains to a future advanced insectoid race.

Lisp is another matter entirely. The so-called “programmable programming language” has been used to build everything from operating systems to type systems which are Turing complete in themselves. As Manuel Simoni puts it, in the Lisp world there is no such thing as too much power. Lisp laughs flippantly at the Law Conservation of Power by placing in the programmer’s hands the enforcer of the Law – the compiler itself. By virtue of S-expressions and macros Lisp allows and encourages you to play God. The natural way to program in Lisp is to “write code that writes code” – creating your own minilanguages tailored to the task at hand. With great power comes great responsibility of course so the Lisp programmer must be particularly careful.

I haven’t explored Lisp as much as I’d like to and I’m only just starting to look into ML and Haskell. But as a career programmer (or some approximation thereof) I think it’s a good idea to routinely move between various points on the power spectrum. That’s not to imply that it’s a perfectly linear scale, but that’s a matter for another post. As my experience at GrammaTech is showing there are delights, wonders and challenges no matter where you decide to plant your feet.

Published by

Shrutarshi Basu

Programmer, writer and engineer, currently working out of Cornell University in Ithaca, New York.

One thought on “Compilers and nuclear reactors”

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s