Python as a glue language

I’ve spent the better part of the past few weeks redoing much of the architecture for my main research project. As part of a programming languages research group I also have frequent discussions on the relative merits of various languages. Personally I’ve always liked Python and used it extensively for both professional and personal projects. I’ve used Python for both standalone programs and for tying other programs together. My roommate likes Bash for his scripting needs but I think Python is a better glue language.

My scripting days started with Perl about 6 years ago but I quickly gave up Perl in favor of Python. I don’t entirely remember why, but I do remember getting the distinct impression that Python was much cleaner (and all-round nicer) than Perl. Python has a lot of things going for it as a scripting language – a large “batteries included” standard library, lots of handy string functions for data mangling and great interfaces to the underlying OS.

Python also has decent features for being a general purpose language. It has a good implementation of object-orientation and classes (though the bolts are showing), first class functions, an acceptable module system and a clean, if somewhat straitjacketed syntax. There’s a thriving ecosystem with a centralized repository and a wide variety of libraries. I wish there were optional static types and declarative data types, but I guess you can’t have everything in life.

Where Python shines is at the intersection of quick scripting and full-fledged applications. I’ve found that it’s delightfully easy to go from a bunch of scripts lashed together to a more cohesive application or framework.

As an example I moved my research infrastructure from a bunch of Python and shell scripts to a framework for interacting with virtual machines and virtual networks. We’re using a network virtualizer called Mininet which is built in Python. Mininet is a well engineered piece of research infrastructure with a clean and Pythonic interface as well as understandable internals. Previously I would start by writing a Python script to instantiate a Mininet virtual network. Then I would run a bunch of scripts by hand to start up virtual machines connected to said network. These scripts would use the standard Linux tools to configure virtual network devices and start Qemu virtual machines. There were three different scripts each of which took a bunch of different parameters. Getting a full setup going involved a good amount of jumping around terminals and typing commands in the right order. Not very tedious, but enough to get annoying after a while. And besides I wouldn’t be much of a programmer if I wasn’t automating as much as possible.

So I went about unifying all this scripting into a single Python framework. I subclassed some of the Mininet classes so that I could get rid of boilerplate involved in setting up a network. I wrapped the shell scripts in a thin layer of Python so I could run them programmatically. I could have replaced the shell scripts with Python equivalents directly but there was no pressing need to do that. Finally I used Python’s dictionaries to configure the VMs declaratively. While I would have liked algebraic data types and a strong type system, I hand-rolled a verifier without much difficulty. OCaml and Haskell have definitely spoiled me.
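To make the dictionary-plus-verifier idea concrete, here is a minimal sketch of what such a setup might look like. It is not the code from my project: the field names, the schema, the `verify_vm` and `script_command` helpers and the script’s argument order are all hypothetical, chosen only to show the shape of the approach.

```python
import subprocess

# Hypothetical schema: field name -> required Python type.
VM_SCHEMA = {
    "name": str,
    "image": str,
    "memory_mb": int,
    "switch": str,
}

# Hypothetical VM descriptions; every name and value here is illustrative.
VMS = [
    {"name": "vm1", "image": "images/vm1.img", "memory_mb": 512, "switch": "s1"},
    {"name": "vm2", "image": "images/vm2.img", "memory_mb": 1024, "switch": "s2"},
]


def verify_vm(config):
    """Return a list of problems found in one VM dictionary (empty if valid)."""
    problems = []
    for field, expected in VM_SCHEMA.items():
        if field not in config:
            problems.append(f"missing field: {field}")
        elif not isinstance(config[field], expected):
            problems.append(f"{field} should be {expected.__name__}")
    for field in config:
        if field not in VM_SCHEMA:
            problems.append(f"unknown field: {field}")
    return problems


def script_command(script, config):
    """Build the argument list for an existing shell script (assumed argument order)."""
    return [
        script,
        config["name"],
        config["image"],
        str(config["memory_mb"]),
        config["switch"],
    ]


def start_vm(script, config):
    """Verify the config first, then hand it to the wrapped shell script."""
    problems = verify_vm(config)
    if problems:
        raise ValueError("; ".join(problems))
    return subprocess.run(script_command(script, config), check=True)
```

The verifier is a poor man’s type checker: it only catches missing, unknown or wrongly typed fields, and only at run time, but that was enough to catch most of my configuration mistakes before a VM ever started.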

How is this beneficial? Besides the pure automation, we now have a programmable, object-oriented interface to our deployment setup. The whole shebang – networks, VMs and test programs – can be set up and run from a single Python script. Since I extended Mininet and followed its conventions, anyone familiar with Mininet can get started using our setup quickly. Instead of having configuration floating around in different Python files and shell scripts, it all lives in one place, which makes it easy to change and keep consistent. By layering over and isolating the command-line interface to Qemu we can potentially move to other virtualizers (like VirtualBox) without massive changes to setup scripts. There’s less boilerplate, fewer little details and fewer opaque incantations that need to be uttered in just the right order. All in all, it’s a much better engineered system.
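The point about isolating the Qemu command line can be sketched roughly as follows. This is an illustration under assumptions, not our actual code: the `Virtualizer`, `Qemu` and `VirtualBox` classes and the `launch_plan` helper are hypothetical names, and the commands use only a few basic flags (`-name`, `-m` and `-hda` for Qemu, `startvm --type headless` for VBoxManage). A real setup would need more options, networking in particular.

```python
class Virtualizer:
    """What the setup scripts depend on, instead of a specific command line."""

    def start_command(self, name, image, memory_mb):
        raise NotImplementedError


class Qemu(Virtualizer):
    def start_command(self, name, image, memory_mb):
        # Boot the given disk image with the requested amount of RAM (in MB).
        return ["qemu-system-x86_64", "-name", name, "-m", str(memory_mb), "-hda", image]


class VirtualBox(Virtualizer):
    def start_command(self, name, image, memory_mb):
        # Assumes a VM called `name` is already registered with VirtualBox,
        # so the image and memory settings are not passed on the command line.
        return ["VBoxManage", "startvm", name, "--type", "headless"]


def launch_plan(virtualizer, vms):
    """Return the commands needed to start every VM, without running them."""
    return [virtualizer.start_command(**vm) for vm in vms]


# Hypothetical VM description used only for this example.
EXAMPLE_VMS = [{"name": "vm1", "image": "images/vm1.img", "memory_mb": 512}]
```

Swapping `Qemu()` for `VirtualBox()` in a call to `launch_plan` changes the commands that get built, while the scripts that describe the VMs stay exactly the same.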

Though these were all pretty significant changes it took me less than a week to get everything done. This includes walking a teammate through both the old and new versions and troubleshooting. Using Python made the transition easier because a lot of the script code and boilerplate could be tucked into the new classes and methods with a few modifications. Most of the time was spent in figuring out what the interfaces should look like and how they should be integrated.

In conclusion, Python is a great glue language. It’s easy to get up and running with quick scripts that tie together existing programs and do some data mangling. But when your needs grow beyond scripts you can build a well-structured program or library without rewriting from scratch. In particular you can reuse large parts of the script code and spend your time on the design and organization of your new application. On a related note, this is also one of the reasons why Python is a great beginner’s language. It’s easy to start off with small scripts that do routine tasks or work with multimedia and then move on to writing full-fledged programs and learning proper computer science concepts.

As a disclaimer, I haven’t worked with Ruby or Perl enough to make a similar claim. If Rubyists or Perl hackers would like to share similar experiences I’d love to hear them.


Portable Ubuntu and dual monitors

I love dual monitors. Roughly half of the labs I spend my time in have dual monitors. The others don’t and hence I try not to spend much time in those. Unfortunately one of those single monitor labs is the only computer science Linux lab that we have, so by necessity I actually do need to spend a considerable amount of time there. And whenever I’m there I miss having a second monitor.

If you haven’t used dual monitors for any length of time, it can be somewhat hard to understand how much easier two monitors make your life. Two monitors provide a very natural division of the information that you need on your screen. One monitor contains reference information: stuff that you’re always looking at, but that you’re not actively interacting with. The other monitor contains whatever you are actively interacting with. For me as a programmer, one monitor generally contains API references in a browser (Chrome on Windows, Firefox on everything else). The other monitor contains my editor/IDE. Unfortunately I do most of my programming either in the Linux lab, where all the machines have a single monitor, or on my laptop, which I rarely hook up to an external monitor.

There are a lot of Windows dual-monitor machines available in other labs, but the only thing I like about Windows anymore is Google Chrome. Our Windows machines aren’t locked down, so students are allowed to install software as long as it isn’t something dangerous. I was considering installing some sort of X server on some of the machines. But I generally move between machines quite a bit and so I don’t want to be installing X servers on every machine I’m on.

My next thought was carrying around a bootable Linux USB drive and running off that. I was seriously considering doing that when I came across an interesting SourceForge project via Reddit which uses virtual machine technology to let you run Ubuntu like an application right in Windows. And yes, that was the answer to my problems. Last evening I downloaded the Portable Ubuntu image to a lab machine and gave it a test run before moving it onto my 4GB USB drive.

My experience has been mostly positive so far. The Ubuntu installation is somewhat out of date (it’s version 8.04), but that really isn’t a problem for me. In fact, as it turns out, I haven’t really been using it as a full-fledged Linux distribution. For the most part I use it as an interface to my college’s powerful Linux clusters. I have pulled my personal Git repository to it, but I think I will mostly be working directly off my college’s machines. The greatest benefit is that I can run normal Windows apps right alongside it. This means that I can have a bunch of terminals and Emacs open while at the same time having Google Chrome and some other Windows-specific software I need. The system really comes into its own with multiple monitors. It’s useful to think of one monitor as a Linux screen and the other as a Windows screen. I’ve only been using it for a day, but I’ve already found it a natural way to work.

As a final note, I would like to put out a little disclaimer: I’ve only used this on powerful machines. The lab computers are 3GHz Core 2 Duo machines with 3.5GB of RAM. Performance is quite acceptable and whatever is happening on the Linux side doesn’t seem to affect the Windows side at all. However, on a machine that is much slower or has significantly less RAM, things might be a good deal slower. If you’re stuck using a Windows machine but would rather use Linux, this is a great way to go if you have a fast enough machine.