The Code is Not the Point

There’s this meme in the programming community of thinking of our code as art. This is not a new phenomenon – it dates back at least to The Art of Computer Programming in 1968. More recently we have Paul Graham to thank for drawing the comparison between Hackers and Painters. With the rise of languages like Ruby and Processing and personalities like _why the lucky stiff, programming has been gaining a reputation as an art form and a source of creative joy. And it should be. I love programming; it makes me feel good, and I feel much better on days I’ve written code and produced something than on days I haven’t. However, I think the “code is art” or “programmers are artists” meme can be misleading because the code is not the point.

Miyamoto Musashi was a medieval Japanese swordsman and Ronin, widely regarded as the best swordsman of all time. He was also the author of a book titled “The Book of Five Rings” – a book on swordsmanship, strategy, tactics and philosophy that is still relevant today. There is one passage in the book that is relevant to our discussion:

The primary thing when you take a sword in your hands is your intention to cut the enemy, whatever the means. Whenever you parry, hit, spring, strike or touch the enemy’s cutting sword, you must cut the enemy in the same movement. It is essential to attain this. If you think only of hitting, springing, striking or touching the enemy, you will not be able actually to cut him.

The primary thing when you take an editor in your hands is your intention to solve the problem, whatever the means. Whether you write a script, a unit test, a function, or a library, you must move towards the solution in the same movement. It is essential to attain this. If you think only of scripts, tests, functions or libraries, you will not be able to actually solve the problem.

Because the code and its surrounding artifacts, including your sense of beauty, are not the point. It does not matter how much code you’ve written, what your test coverage is, or how much you’ve learned in the process if you haven’t solved the problem (or solved it inadequately, or solved the wrong problem). Anything that does not bring you closer to your solution is suspect.

Does this mean that clean code, good comments, test coverage are not necessary or important? Of course not. If your code is not well-written and clear, are you sure you’ve solved the problem? If you have no tests, are you sure you’ve solved all aspects of the problem? If you have no documentation, how will others use and improve on your solution?

Does this mean that your code is not art? Does this mean you should not carry an artist’s sense of elegance and aesthetics? Of course not. By all means take pride in your work. Make it a point of honor that others can understand your code without sending a dozen emails. Please aim for the solution that is not just correct, but also elegant, concise and efficient.

But keep in mind that the code is not the point. Beauty, elegance and pride are no substitutes for correctness. We are scientists and engineers first, artists second. If the theory does not fit the facts, if the code does not solve the problem, it must be discarded no matter how beautiful it is.

I don’t think of myself as an artist any more. My code is not art. I take pride in my work but it is the pride of an engineer. I want my code to have more in common with a Rolls Royce engine than it does with Sunflowers. I try to do the cleanest, most elegant job I can. But whenever I write code, the intention is to cut the enemy, whatever the means.

Python as a glue language

I’ve spent the better part of the past few weeks redoing much of the architecture for my main research project. As part of a programming languages research group I also have frequent discussions on the relative merits of various languages. Personally I’ve always liked Python and used it extensively for both professional and personal projects. I’ve used Python for both standalone programs and for tying other programs together. My roommate likes Bash for his scripting needs but I think Python is a better glue language.

My scripting days started with Perl about 6 years ago but I quickly gave up Perl in favor of Python. I don’t entirely remember why, but I do remember getting the distinct impression that Python was much cleaner (and all-round nicer) than Perl. Python has a lot of things going for it as a scripting language – a large “batteries included” standard library, lots of handy string functions for data mangling and great interfaces to the underlying OS.

Python also has decent features for being a general purpose language. It has a good implementation of object-orientation and classes (though the bolts are showing), first class functions, an acceptable module system and a clean, if somewhat straitjacketed, syntax. There’s a thriving ecosystem with a centralized repository and a wide variety of libraries. I wish there were optional static types and declarative data types, but I guess you can’t have everything in life.

Where Python shines is at the intersection of quick scripting and full-fledged applications. I’ve found that it’s delightfully easy to go from a bunch of scripts lashed together to a more cohesive application or framework.

As an example I moved my research infrastructure from a bunch of Python and shell scripts to a framework for interacting with virtual machines and virtual networks. We’re using a network virtualizer called Mininet which is built in Python. Mininet is a well engineered piece of research infrastructure with a clean and Pythonic interface as well as understandable internals. Previously I would start by writing a Python script to instantiate a Mininet virtual network. Then I would run a bunch of scripts by hand to start up virtual machines connected to said network. These scripts would use the standard Linux tools to configure virtual network devices and start Qemu virtual machines. There were three different scripts each of which took a bunch of different parameters. Getting a full setup going involved a good amount of jumping around terminals and typing commands in the right order. Not very tedious, but enough to get annoying after a while. And besides I wouldn’t be much of a programmer if I wasn’t automating as much as possible.

So I went about unifying all this scripting into a single Python framework. I subclassed some of the Mininet classes so that I could get rid of boilerplate involved in setting up a network. I wrapped the shell scripts in a thin layer of Python so I could run them programmatically. I could have replaced the shell scripts with Python equivalents directly but there was no pressing need to do that. Finally I used Python’s dictionaries to configure the VMs declaratively. While I would have liked algebraic data types and a strong type system, I hand-rolled a verifier without much difficulty. OCaml and Haskell have definitely spoiled me.
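To give a flavor of the last step, here’s a minimal sketch of a dictionary-based VM configuration with a hand-rolled verifier. It is not the code from my project: the field names (`name`, `memory_mb`, `image`, `network`) and the `verify_vm_config` helper are assumptions made up for this illustration.

```python
# Hypothetical schema: each required field maps to the type its value must have.
VM_SCHEMA = {
    "name": str,
    "memory_mb": int,
    "image": str,
    "network": str,
}


def verify_vm_config(config):
    """Return a list of problems found in a VM config dict (empty if valid)."""
    problems = []
    # Every field in the schema must be present and have the expected type.
    for field, expected_type in VM_SCHEMA.items():
        if field not in config:
            problems.append("missing field: " + field)
        elif not isinstance(config[field], expected_type):
            problems.append(
                "field %s should be %s, got %s"
                % (field, expected_type.__name__, type(config[field]).__name__)
            )
    # Fields the schema does not know about are most likely typos.
    for field in config:
        if field not in VM_SCHEMA:
            problems.append("unknown field: " + field)
    return problems


good = {"name": "vm1", "memory_mb": 512, "image": "vm1.img", "network": "s1"}
bad = {"name": "vm2", "memory_mb": "512", "colour": "blue"}

print(verify_vm_config(good))  # no problems, so an empty list
for problem in verify_vm_config(bad):
    print(problem)
```

A real type system would catch these mistakes before the program runs. A check like this only catches them at startup, but that is still much better than finding out halfway through booting a network of VMs.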

How is this beneficial? Besides just the pure automation we now have a programmable, object-oriented interface to our deployment setup. The whole shebang – networks, VMs and test programs – can be set up and run from a single Python script. Since I extended Mininet and followed its conventions anyone familiar with Mininet can get started using our setup quickly. Instead of having configurations floating around in different Python files and shell scripts, they’re in one place, which makes them easy to change and keep consistent. By layering over and isolating the command-line interface to Qemu we can potentially move to other virtualizers (like VirtualBox) without massive changes to setup scripts. There’s less boilerplate, fewer little details and fewer opaque incantations that need to be uttered in just the right order. All in all, it’s a much better engineered system.
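The isolation idea can be sketched roughly like this. The `ShellVMBackend` class, the `./start_vm.sh` script name and its positional arguments are all hypothetical stand-ins, not my actual scripts or any real Qemu interface. The point is only that the rest of the framework talks to `start()` and never sees the command line.

```python
import subprocess


class ShellVMBackend:
    """Wraps a (hypothetical) shell script that starts one VM."""

    def __init__(self, script="./start_vm.sh"):
        self.script = script

    def build_command(self, config):
        # Translate the declarative config into the script's positional arguments.
        return [self.script, config["name"], str(config["memory_mb"]), config["image"]]

    def start(self, config, run=subprocess.run):
        # check=True makes subprocess.run raise CalledProcessError
        # if the script exits with a non-zero status.
        return run(self.build_command(config), check=True)


backend = ShellVMBackend()
cmd = backend.build_command({"name": "vm1", "memory_mb": 512, "image": "vm1.img"})
print(cmd)
```

Moving to a different virtualizer would then mean writing another class with the same `start()` method and a different `build_command()`, while the scripts that set up experiments stay untouched. Passing `run` as a parameter is a small convenience that lets you exercise the wrapper without launching anything.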

Though these were all pretty significant changes it took me less than a week to get everything done. This includes walking a teammate through both the old and new versions and troubleshooting. Using Python made the transition easier because a lot of the script code and boilerplate could be tucked into the new classes and methods with a few modifications. Most of the time was spent in figuring out what the interfaces should look like and how they should be integrated.

In conclusion, Python is a great glue language. It’s easy to get up and running with quick scripts that tie together existing programs and do some data mangling. But when your needs grow beyond scripts you can build a well-structured program or library without rewriting from scratch. In particular you can reuse large parts of the script code and spend time on the design and organization of your new application. On a related note, this is also one of the reasons why Python is a great beginner’s language. It’s easy to start off with small scripts that do routine tasks or work with multimedia and then move on to writing full-fledged programs and learning proper computer science concepts.

As a disclaimer, I haven’t worked with Ruby or Perl enough to make a similar claim. If Rubyists or Perl hackers would like to share similar experiences I’d love to hear them.

Sunday Selection 2012-06-17

Around the Web

The Essential Psychopathology of Creativity

Andrea Kuszewski is one of my favorite writers on creativity, in large part because she references actual scientific research as opposed to blog posts and random opinions on the Internet. In this article she looks at the link between creativity and apparently detrimental psychological traits.

The care and feeding of software engineers

It’s a bit disappointing that, despite how much modern society and business depend on software engineers, managers still need articles like this to tell them how to manage engineers. The only company I worked for did not have non-technical managers so I haven’t seen this firsthand but I’m guessing I might be the exception.

How Yahoo killed Flickr and lost the Internet

The computer industry has seen more than its fair share of companies rise and fall. While Yahoo is certainly not dead yet, it’s definitely not in its heyday. I’ve never been a big Yahoo or Flickr user, but there are lessons worth learning in this story.

Books

No videos this week because I’ve been trying to avoid videos and TV and focus on reading more books.

Accelerando

I first read Accelerando a few months ago and I’ve been re-reading it over the last few days. It paints an interesting view of the evolution of the human race in an era of intelligent machines and accelerating technological change. While the future it shows can be terrifyingly alien, I personally also find it very interesting and hopeful.

Still Alive

Rumors of my demise are greatly exaggerated. To be honest, I always feel a tinge of guilt and regret whenever I write a post along the lines of “I’m alive” or “I’m back”. Generally it means that I’ve spent the preceding few weeks in some form of pseudo-busyness, done some work, probably watched far too much TV and generally spent more time consuming than creating. Oh well, what’s done is done.

As I said, I am very much still alive (not quite kicking, but I’m working on that). It’s summer which means it’s warm outside and so I’m increasingly inclined to spend time inside. I don’t like the heat very much. While I’ve been gone I’ve been thinking about a number of things of varying levels of importance to me (and to you, my dear readers). So here’s a quick brain dump, in no particular order.

It’s been a while since I learned a new programming language. I’ve been using C and Python extensively and while I like good systems and scripting languages as much as the next hacker, it’s a bit like eating only whole wheat bread with Nutella. Not that there’s anything wrong with whole wheat bread or Nutella, but I am craving a bit more variety. I’ve been looking at more exotic sources of nutrition, mainly Haskell and Clojure (nothing like a good Lisp to spice things up). Real World Haskell has been sitting idle on my desk for far too long – I stalled at chapter 4 and it’s about time I picked it up again. I’m considering grabbing a copy of Joy of Clojure too.

While writing code is certainly fun, I do miss writing words. And no, I don’t mean words meant for publication in a scientific journal (see whole wheat bread and Nutella above). In particular, I’ve been wanting to tell some good stories. I’ve always loved books and television and movies but it took some binge-watching of Doctor Who to help me realize that what I really like is a good story. And I like telling them as much as I like hearing (or seeing, or reading) them. It’s been far too long since I wrote any sizeable amount of fiction and I can feel my storytelling muscles atrophying. If I put it off for much longer all I’m going to be capable of is Michael Bay-esque explosive blockbusters (though that might nicely supplement my starving grad student’s salary). I’ve been thinking about doing NaNoWriMo this year. NaNoWriMo sounds like a writing marathon and, as with all sporting events, training helps. I’ve always considered my blog to be training for non-fiction writing (and it’s helped) so maybe a similar fiction-writing training routine might help. Speaking of blogging, I miss it too. Writing really is a great way to sort out my ideas and think aloud (so to speak). So more of that too.

In other news, this blog runs off WordPress.com and I pay for mapping the Bytebaker.com domain to their servers. Apparently this will expire in about two weeks. I’ve been barking on and off about the need for next-generation content management systems that allow for more than a blog/static-site dichotomy. Given that it’s summer and I have a solid deadline this might be a good time to bite the bullet and roll my own. I did a half-hearted attempt a few years ago and round two has been long overdue. I’m still working out just what I want from this new system, but I have some cool ideas (mostly culled from other places). More on that later.

On a mostly unrelated note I tried out a new coworking space a few blocks from my place. It’s called the Popshop and seems to be run by a bunch of Cornell seniors. It’s pretty decent: they have lots of markers, whiteboards, air conditioning, a couch and a 3D printer. But the chairs suck so I may not be back too often. I’m also more in favor of quiet and isolated working spaces. I do spend most days working in an open plan setup, but we each have a lot of space to ourselves and there isn’t a whole lot of social interaction. I suppose coworking spaces are great if you’re looking for people to bounce ideas off but I prefer isolation when it comes to actually pumping out the code (or the words).

To recap: I’m alive. I’m looking forward to lots more hacking and writing in the near future. Coworking might be cool, but I’ll stick to my office for now. See you all tomorrow.

Grit for Programmers

It turns out that the best indicator of success isn’t IQ or natural talent or how well off you were at birth. Rather it’s something called grit – the perseverance and passion for long-term goals. Grit requires a clear goal, self-confidence and a careful balance between stubbornness and flexibility. For the last few months I’ve been living one of the most productive (and most challenging) times of my life. I’ve been building a system that has more parts, does more things and is much larger than just about anything I’ve built before. It’s been challenging and rewarding work and I couldn’t have done it without lots of support from great mentors. As I’ve stumbled, fallen down, hit brick walls, picked myself up and kept going I’ve been wondering – does grit apply equally to programmers and success in building good software?

Programming culture is generally synonymous with hard work and long hours – death marches, all-nighters, 80 hour work weeks, we do them all. But we’re talking about grit here, not masochism. Grit isn’t strictly equal to working obscenely hard, long hours. Part of the problem with thinking about grit in relation to programming is defining what success means for a programmer. Is your definition of success simply finding a working solution? Does it mean finding the most efficient solution? Are you successful if you cover every single edge case or is it enough to just take care of the most common ones? Is your program really better if it handles everything you could throw at it or should you handle core use cases well and fail gracefully on the others? Part of the problem of coming up with a good solution is asking the right question. This is especially true of building software. However merely coming up with the right question requires a certain amount of grit. We need the patience to look beyond the obvious problems and solutions and ask the hard questions.

So now we’ve found the right question and defined bounds on the possible solutions. What next? How does grit help with the actual act of writing code and building stuff? Programming is not easy. It can be fun and exciting and uplifting, but sometimes it is downright hard and depressing. Sometimes we spend hours sifting through possible solutions before hitting upon the appropriate one. Sometimes we spend several intimate hours with a debugger tracking down pointer bugs before finding that one variable we forgot to initialize. Being tenacious and persistent in the face of seemingly unrelenting roadblocks is not an added benefit for a programmer – it is a bare necessity. When it comes down to the act of sitting down, writing and debugging code, grit is not optional. Without it not only can we not be good programmers, we can’t even be average ones.

But if our goal is to be a good (maybe even great) programmer, then grit will continue to help. One of the qualities of good programmers is that they get a lot of stuff done. In particular they do a lot that isn’t strictly their job. This includes fixing and extending their tools and improving core infrastructure. They do this even if they aren’t in charge of infrastructure because they realize that their code depends on what’s underneath. Grit is the difference between waiting for someone else to fix the annoying bug in the library that we depend on and diving in and fixing it ourselves. When Steve Yegge talks about the difference between “superhumanly godlike” and “smart”, grit is a part of what he’s talking about. Not that there’s anything wrong with being smart, but it might not be enough. Of course to cultivate that level of grit we need to cultivate a good deal of courage. Diving into someone else’s code and fixing it can be a daunting task but it’s one that has to be mastered.

While I’ve always liked programming it’s taken me a long time to understand the importance of grit. When you do something because you like it (mostly) it’s tempting to stay away from the parts that are painful and hard. For a long time I avoided writing large programs because I was afraid of all the complexity that was involved. I was afraid of becoming familiar with complex algorithms because I was afraid of the possibility that I’d get it wrong. I understand now that I can’t become a good programmer if I don’t push myself to do the things that I consider hard and dislike. I need to have the grit to handle large complex problems and spend the time to understand and apply advanced algorithms. The good news is that just like perseverance and discipline, grit can be trained and improved. I’m no longer as afraid to dive into unknown codebases as I was a few months ago. I now find it much easier to hold complex code paths in my head. I’m certainly far, far away from being superhuman, but I try to suck a little less every day.

Predicting Human Intent

Readability is one of my favorite web services. Readability is well designed, carefully made, unobtrusive and they’re not trying to wring me of personal information to datamine and sell at every turn (at least I don’t think they are). Their web service gives you browser plugins that strip out everything but the content of a page and present it in a clean, crisp format. If you sign up for an account you can make use of their “Reading List” to store pages for later reading. Recently they released beautiful iOS and Android apps that sync with your reading list allowing you to stock up on things to read at your computer and read them on the move.

Though Readability is a great service and they have great supporting apps they have one flaw. However, it’s not entirely their fault: I think it’s a side-effect of the difficulty of predicting human intent. The problem has to do with Readability’s reading list. When you install the browser plugins you get three buttons: a “Read Now” button, a “Read Later” button and a “Send to Kindle” button. The “Send to Kindle” button formats and sends the web page you’re on to your Kindle (assuming you’ve set up and connected your Kindle properly). The “Read Later” button saves the page you’re on to your Reading List which can be synced to your iOS or Android devices. The “Read Now” button will do the Readability formatting on your current page and show you the cleaned up version right then for you to continue reading.

That’s all great. However, the “Read Now” button also drops the page you just converted into your Reading List. This is great if you start reading a long article and then have to leave your machine. You can come back to it later, on another browser or another device entirely. But what happens if you finish reading the article right there and then? The article still ends up in your reading list. That would be fine if the list was simply a history of things you’ve read. However the Reading List is also a list of things you’re going to read. So the Reading List now contains things I’m going to read, things I’ve read and things I might want to read again. I think this problem stems from the fact that Readability started as a formatter and added read-later functionality, unlike services like Instapaper which are designed for saving articles for later.

How can we differentiate between all these types of articles? Readability provides the ability to “Archive” and “Favorite” articles. Once I’m done with reading an article I Favorite it if I’m going to read it again and Archive it otherwise. But could Readability do this for me? Could Readability somehow figure out what I want to happen to the article? The simplest solution would be to archive whatever I Read Now and only add to the Reading List ones that I mark to Read Later. However that means that if I start reading something and then have to leave it ends up in the Archive where I might never look at it again. Could Readability be a little smarter? One heuristic would be to check where I am in the article. By default an article is always added to the Reading List as it is now. But when I scroll to the bottom Readability takes that to mean that I’m done reading and moves it into the Archive. If I liked it and wanted to come back to it I manually mark it as a Favorite. (I don’t expect Readability to be that clever. Yet.)
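The scroll-position heuristic is simple enough to sketch in a few lines. To be clear, this is a purely hypothetical model: the `Article` class, the `on_scroll` callback and the three list names are assumptions of mine, and I have no idea what Readability’s internals actually look like.

```python
READING_LIST, ARCHIVE, FAVORITES = "reading_list", "archive", "favorites"


class Article:
    def __init__(self, title):
        self.title = title
        self.location = READING_LIST  # default: every article starts here
        self.max_scroll = 0.0         # furthest point reached, from 0.0 to 1.0

    def on_scroll(self, fraction):
        """Record scroll progress; archive once the reader reaches the end."""
        self.max_scroll = max(self.max_scroll, fraction)
        if self.max_scroll >= 1.0 and self.location == READING_LIST:
            self.location = ARCHIVE

    def favorite(self):
        """Manual override: the reader wants to come back to this one."""
        self.location = FAVORITES


article = Article("Predicting Human Intent")
article.on_scroll(0.4)
print(article.location)  # still in the reading list
article.on_scroll(1.0)
print(article.location)  # reached the bottom, so it is archived
article.favorite()
print(article.location)  # explicitly marked as a favorite
```

The sketch makes the weakness of the heuristic easy to see: reaching the bottom of the page is the only signal it has, so anything that scrolls that far counts as “done”.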

Without having actually tested the solutions, I can’t say how well they would work. There are certainly edge cases: what if I scroll down to read a footnote and then scroll back up to read the rest of the article? What if I get to the end and then go back to re-read a particular section? What if I quickly skim through an article to get to the end and want to come back later to read it in more depth? I think there’s no clear answer because fundamentally we’re trying to have Readability “guess” what we’re trying to do without giving an unambiguous signal. Sure, all of them could be solved with a few manual interactions. But the whole point of having advanced software is so that I don’t have to tell my computer what to do in excruciating detail.

Like I said at the beginning I don’t think that Readability is necessarily at fault for how their service works. Any attempt to automatically manage the Reading List would require making some assumptions as to what it is the user wants to do. Even if those assumptions are right most of the time, there will almost certainly be times when they’re wrong. We are, after all, dealing with people here, and people aren’t perfectly predictable agents. If they were, human computer interaction and economics would both be very different fields.

Predicting human intent is a hard problem. Ultimately, some amount of direct intervention might be inevitable. While Readability is meant to be a product it would be interesting to see researchers using it (or similar services) for doing research with real users about how our software can make choices for us in a way that closely reflects what we would have done ourselves. Unlike some people I don’t want my software to do less and have fewer features. I want it to do more so that I can concentrate on more important things. Like saving the world.

Etudes for programming

From Wikipedia:

An étude (a French word meaning study, French pronunciation: [eˈtyd], English pronunciation: / ˈeɪtjuːd /) is an instrumental musical composition, most commonly of considerable difficulty, usually designed to provide practice material for perfecting a particular technical skill.

I noticed today that Michael Fogus (one of the authors of Joy of Clojure) has a number of GitHub repos with names such as etude-ocaml and etude-syntax. I also realized this week that I’m a pretty slow programmer. I’ve been getting better over the years but I’m still slow, especially if there’s a good amount of API design involved. While I think that writing lots of code will make me faster over time, I do wish there was a more structured, focused approach.

In general, I wish there was more by way of études for programming – problems and exercises of considerable difficulty designed to provide practice material for a particular (set of) skills. There are of course great textbooks for programming and computer science and those books have good exercises (I particularly like SICP and the K&R C book); however, in most of those cases the point is to teach first and practice second. What I’d like to see is the reverse – assume that the reader already knows about functional programming or the C language but needs to “level up”, so to speak. The exercises would be harder and more numerous but would also cover a broad area in terms of application of the concepts involved.

This is related to what I’ve written earlier in terms of deliberate practice for programmers. That post talks about “level up” lists – a list of programs to make that help explore the different areas of computer science and help you gain experience and hence “level up” as a developer. On the other hand études would focus on depth rather than breadth – each one would focus on a small technique or technology and fully explore that area. Together a continuous habit of working on études and doing level-up projects would give programmers a steady stream of deliberate practice exercises to work on.

The question is, where are we to find these études? I’m not sure if there are programming books out there that fit that description. If there are, I’d love to hear about them. But in the meantime I’ve found an acceptable alternative – homework and assignments for college level courses. This semester I’m the TA for a course on functional programming and throughout the semester we have a set of 6 assignments for students to do. Each of them has about 3 to 4 problems (each with multiple parts) that tackle a small area of functional programming. I think exercises like this are great material for études. I’m currently working through the exercises at the same time as the students (other TAs are making them). Even though I’m already familiar with most of the material it’s been a good learning experience and great practice for me. I can’t really measure if I’m improving (apart from running my solutions through the test harness) but it’s more direct practice in functional programming than I’ve ever had.

I’ll be done with this particular étude in a few months. I don’t think I’ll be releasing the code since the problems often get reused. However I do think there will be a lot more where those came from. There are lots of college courses with websites out there and there’s lots to learn. I’ll probably try compilers next. All that being said, it would be great to see some curation and collection. With Amazon’s Kindle Shorts and the growing interest in short, self-published books, putting together a regular series of études might be a pretty lucrative endeavor.