Inception and abstraction

I watched Inception with friends last Saturday. I really enjoyed it and thought it was really well made, though it’s certainly a complex movie which you need to pay attention to. Considering that most of the readers of The ByteBaker are computer savvy (and probably programmers) you’re going to like it (or hate it) very much because it touches on some of our core concepts: recursion, closures and abstraction. In this way, it’s not all that much different from the Matrix — different premise and plotline but a very similar feel.

Without dropping any spoilers here’s what you need to know about the movie for the rest of the article: it’s about people who go into other people’s dreams in order to steal their secrets. Pretty simple, right? The kicker is that it’s possible to dream inside another dream leading to all sorts of interesting situations and plot twists. Ok, so that’s not quite recursion since that would mean that the subject would be dreaming the same dream inside the dream (confused yet?). Come to think of it the dreams of Inception are more like closures.

So what are closures? Wikipedia tells us that:

In computer science, a closure is a first-class function with free variables that are bound in the lexical environment.

Perfectly understandable, right? No? Ok let’s translate. First and foremost, a closure is a function. But it’s not just any old run-of-the-mill function. A closure generally contains variables that are neither local variables nor arguments to that function. So what do those variables refer to? Their values come from outside the function, specifically the code block that surrounds the function. The wikipedia page on closures gives examples in a number of languages. Though closures aren’t necessarily a part of a standard curriculum they are extremely powerful constructs that can be used to implement a host of other programming language features (including control flow structures and object systems). Coming back to Inception, once you are inside a dream you can recall the world outside (though the real world seems like a dream so everything’s a bit fuzzy).

Closures in computer science (and dreams in Inception) are important because they are a prime example of abstraction. Functions are a powerful concept because they essentially let you create little worlds in which you can do stuff. You put something in a function and get something out. You don’t need to know or care about what’s going on inside the function (unless something goes wrong, but that’s a different matter entirely). Functions let you abstract away processes. Closures improve upon functions and let you abstract state. If you’re using a function that’s a closure, you don’t need to know about what it’s variables are bound to (except the ones you pass in) and you can’t see what data the closure can manipulate either.

By tucking away state, closures give us less to hold in our minds and make it easier to write code that’s clean and follows the Single Responsibility Principle (essentially, do one thing and do it well). Suppose you have a bunch of closures inside one larger function. Now magically you have sections of executable code that all operates on the same data and yet can do completely different things. They can also do all this without having to passing in a host of arguments every time (which reduces the chance for making mistakes). Whenever the closures need something, they just refer to their outer environment. Sound familiar? It should because I just described objects and methods. And that is the hallmark of a good abstraction — it lets you build up other abstractions on top.

Abstractions are in general a good thing. But unless you think through your abstractions, they can be bad. A leaky abstraction is one that doesn’t quite get it right. The underlying layers somehow “leak through” what should be the abstraction’s water tight boundaries. Joel Spolsky has a very good article on leaky abstraction that’s a must read if you want to learn more. And while we’re on the topic of abstraction — too much can be a bad thing. I wrote a Python program two summers ago to experiment with L-systems. Last summer I tried rewriting it such that everything in the system was the instance of some class. Everything was supposed to go through methods and abstraction boundaries. I never finished. This summer I toned it down a little and got a working version in about a week. Yes, this is classic second system effect, but it also shows that sometimes abstractions will just get in the way and force you to jump through hoops.

In conclusion: abstractions are good if used wisely. Closures are one such powerful abstraction. Dream safe.

Switch-case statement in Python

This post is part of the Powerful Python series where I talk about features of the Python language that make the programmer’s job easier. The Powerful Python page contains links to more articles as well as a list of future articles.

In the months since this was written, I have received a number of comments and learned more about programming languages in general. I put some of the new things I learned into a revised post on the topic which you might be interested in.

A switch-case statement is a useful programming language that lets you control the flow of the program based on the value of a variable or expression. In particular, if the variable or expression that you’re testing has a number of different of possible values, you could execute a block of code for each separate value. Here’s a simple example in C (courtesy of Wikipedia):

switch(n) {
  case 0:
    printf("You typed zero.\n");
    break;
  case 1:
  case 9:
    printf("n is a perfect square\n");
    break;
  case 2:
    printf("n is an even number\n");
  case 3:
  case 5:
  case 7:
    printf("n is a prime number\n");
    break;
  case 4:
    printf("n is a perfect square\n");
  case 6:
  case 8:
    printf("n is an even number\n");
    break;
  default:
    printf("Only single-digit numbers are allowed\n");
  break;
}

Based on the value of the variable n, a different message will show up on the standard output. The default case is executed when the value of the variable doesn’t match anything for which a case is defined. Notice that some cases have a break statement while others don’t. If a particular case block is executed and no break is found, then all the following case blocks will be executed until a break is found.

The switch-case statement comes in handy when you’re writing a language parser like I am now. For simple languages, like most programming languages, each token in a sample of the language can be followed only by a very limited number of possible tokens. My putting the options of possible following tokens in a switch case statement, you have an easy mechanism for performing different tasks based on what token is actually found. For example, if it’s a variable name, you check if its been declared before, if it’s an operator, perform the corresponding operation etc.

Unfortunately, my language of choice for the time being is Python, which doesn’t come with a typical switch-case statement. One simple substitute is using a string of if-else blocks, with each if condition being what would have been a matching case. Here is part of the above code in Python using if-else blocks.

if n == 0:
    print "You typed zero.\n"
elif n== 1 or n == 9 or n == 4:
    print "n is a perfect square\n"
elif n == 2:
    print "n is an even number\n"
elif  n== 3 or n == 5 or n == 7:
    print "n is a prime number\n"

It certainly works and should be pretty easy to work, but it’s not a very elegant solution. Especially if you have more than a handful of cases, you’re going to have a very long string if-else cases. Since each of the if conditions must actually be checked, you might run into performance issues if this is a vital part of your code.

The Pythonic solution is to make use of Python’s powerful dictionaries. Also known as associative arrays in some languages, Python’s dictionaries allow a simple one-to-one matching of a key and a value. When used as part of a switch case statement, the keys are what would normally trigger the case blocks. The interesting part is that the values in the dictionaries refer to functions that contain the code that would normally be inside the case blocks. Here’s the above code rewritten as a dictionary and functions:

options = {0 : zero,
                1 : sqr,
                4 : sqr,
                9 : sqr,
                2 : even,
                3 : prime,
                5 : prime,
                7 : prime,
}

def zero():
    print "You typed zero.\n"

def sqr():
    print "n is a perfect square\n"

def even():
    print "n is an even number\n"

def prime():
    print "n is a prime number\n"

Now that you have the switch case setup, you can actually use it by simply doing a dictionary lookup:

options[num]()

Thanks to the fact that Python functions are first class values, you can use the functions as the values of the dictionary and then call them via dictionary lookup.

The advantage of this method is that it is generally cleaner than a long line of if-elses and someone reading your code can simply ignore the functions they are not interested in. Performance-wise, the Python dictionary lookup will almost certainly be more efficient than any solution you can rig yourself (with perhaps the exception of custom C code, but if you can do that, why are you reading this?). However, except a penalty associated with calling and coming back from a function. In fact, it’s the function call that is both the strength and the weakness of this method. While it lets you cleanly segment your code, you also have a bunch of functions lying around. This isn’t a problem if you were going to use functions anyways, but they can be a mess if you have a lot of tiny ones lying around. If your functions sufficiently small, consider inlining them in the dictionary as lambda expressions.

As a final note, the above example doesn’t provide a default case in case nothing matches. You can make up for this by having the actual lookup be inside an if-else block, with the condition checking for the presence of the key in the dictionary (using the ‘<keyname> in <dictionary>’ idiom). But a more Pythonic way is to wrap it in a try/except block. If the option isn’t presence, a KeyError Exception will be raised which can then be caught and the default code executed. Be sure to check the Wikipedia entry and this blog post for more information.