Sunday Selection 2010-07-25

Reading

Emacs vs Vi is rooted in the love of Lisp: an older article I came across a few days ago, which shows how the universal programming powers that define Lisp are at the root of the Emacs/Vi divide

Have generics killed Java: an article in which the author argues that generics have harmed Java and that static type checking is a dead end

Media

Public Static Void — an excellent talk by Rob Pike that discusses language and evolution and why our languages mostly suck

Software

Firefox Alpha with TabCandy is an early test release that contains a very interesting new interaction interface for tabs which I think is a step forward. Now if only I could run it all from my keyboard. Be warned that no extensions will work.

Bonus: Here’s the TabCandy video

Joining the Creative Commons

The ByteBaker by Shrutarshi Basu is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
Permissions beyond the scope of this license may be available at http://bytebaker.com/contact.

This is something I should have done a long, long time ago. But as they say, better late than never. I received an email yesterday from Robert Sebescen at Kodknackaren.se, who offered to translate and host the posts from the Powerful Python series. I was going to say yes and then realized that I didn’t have any sort of licensing information on The ByteBaker.

So today I’m changing that. All the content that I’ve written on the site is under the Creative Commons license as stated above. It basically means that you can redistribute my work and create derivative works from it (such as Sebescen’s translations) as long as you give me due credit and share the result under the same license. However, you are not allowed to use the work for any commercial purposes (i.e., you can’t sell it or charge people to see it). I’m not against people making money; I just don’t want them doing it off of my work without giving me a cut. I have contact info on this site, so if anyone’s interested, drop me a line and we’ll talk.

I do reserve the right to change the license in the future; however, I can’t see myself doing that anytime soon. For readers, this isn’t really a change at all. But the great thing about computers and the Internet is that they promote an incredibly diverse and productive remix culture. And I want to do my part to give back to the remix culture that has given me so much cool stuff.

Revamping the ByteBaker series

Not too long ago I started writing series of posts on The ByteBaker. I started two of them: Powerful Python and Sunday Selections.

Powerful Python was a series of posts about the Python programming language and how its features make it easier for programmers to write code. As it stands now there are four posts in this series.

Python is the language that I’m most familiar with and have written the most code in. Over the last month or so I’ve been writing Python day in, day out and really exercising my Python chops (as well as getting acquainted with features like generators and decorators). Over the next few weeks I’m going to be writing more posts exploring Python and adding them to the Powerful Python series. If you regularly write code in Python or just have a passing interest, this is something you’re going to like.

The second series that I had was Sunday Selections. I try to post two to three times a week, but I didn’t want to leave the weekends completely bare. I also wanted to spend my weekends doing other things (preferably away from the computer). So I started a series where every Sunday I would post links (with brief intros) to interesting things that I had found the week before. I’ll admit that I haven’t been very consistent with the posting schedule, partly because I kept forgetting or losing what I had found and really didn’t want to go hunting around the intertubes for whatever it was that I had liked.

Over the past few months I’ve become much better at holding onto things I find online. Using Diigo for bookmarks and Tumblr for “scrapbooking” the web, I’ve been managing to keep a good record of all the wonderful stuff I’ve found (and there is a lot of it). So I’m bringing back Sunday Selections as well (starting this Sunday), so stay tuned for a steady flow of Internet-y goodness.

I’m really looking forward to writing series posts again. I feel like my writing can sometimes get either monotonous or spread all over the place without any focus. I’m hoping that the series (especially the Powerful Python series) will provide a good path for me to write articles that are coherent and progress along a definite line. Stay tuned.

Inception and abstraction

I watched Inception with friends last Saturday. I really enjoyed it and thought it was very well made, though it’s certainly a complex movie that you need to pay attention to. Since most readers of The ByteBaker are computer savvy (and probably programmers), you’re going to like it (or hate it) very much because it touches on some of our core concepts: recursion, closures and abstraction. In this way, it’s not all that different from The Matrix: different premise and plotline, but a very similar feel.

Without dropping any spoilers, here’s what you need to know about the movie for the rest of the article: it’s about people who go into other people’s dreams in order to steal their secrets. Pretty simple, right? The kicker is that it’s possible to dream inside another dream, leading to all sorts of interesting situations and plot twists. OK, so that’s not quite recursion, since that would mean that the subject would be dreaming the same dream inside the dream (confused yet?). Come to think of it, the dreams of Inception are more like closures.

So what are closures? Wikipedia tells us that:

In computer science, a closure is a first-class function with free variables that are bound in the lexical environment.

Perfectly understandable, right? No? OK, let’s translate. First and foremost, a closure is a function. But it’s not just any old run-of-the-mill function. A closure generally contains variables that are neither local variables nor arguments to that function. So what do those variables refer to? Their values come from outside the function, specifically from the code block that surrounds the function. The Wikipedia page on closures gives examples in a number of languages. Though closures aren’t necessarily a part of a standard curriculum, they are extremely powerful constructs that can be used to implement a host of other programming language features (including control flow structures and object systems). Coming back to Inception, once you are inside a dream you can recall the world outside (though the real world seems like a dream, so everything’s a bit fuzzy).
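To make that concrete, here is a minimal sketch in Python. The names make_greeter and greet are made up purely for illustration; the point is only that greet uses a variable it neither declares nor receives as an argument.

```python
def make_greeter(greeting):
    # 'greeting' belongs to make_greeter's environment, the "outer world".
    def greet(name):
        # Inside greet, 'greeting' is a free variable: it is neither a local
        # variable nor an argument, yet the function can still recall it.
        return f"{greeting}, {name}!"
    return greet


hello = make_greeter("Hello")
# make_greeter has already returned, but the closure still remembers 'greeting'.
print(hello("Cobb"))
```

Each call to make_greeter produces a separate closure with its own remembered greeting, much like separate dreams that each recall a different outer world.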

Closures in computer science (and dreams in Inception) are important because they are a prime example of abstraction. Functions are a powerful concept because they essentially let you create little worlds in which you can do stuff. You put something in a function and get something out. You don’t need to know or care about what’s going on inside the function (unless something goes wrong, but that’s a different matter entirely). Functions let you abstract away processes. Closures improve upon functions and let you abstract state. If you’re using a function that’s a closure, you don’t need to know what its variables are bound to (except the ones you pass in), and you can’t see what data the closure can manipulate either.

By tucking away state, closures give us less to hold in our minds and make it easier to write code that’s clean and follows the Single Responsibility Principle (essentially, do one thing and do it well). Suppose you have a bunch of closures inside one larger function. Now magically you have sections of executable code that all operate on the same data and yet can do completely different things. They can also do all this without having to pass in a host of arguments every time (which reduces the chance of making mistakes). Whenever the closures need something, they just refer to their outer environment. Sound familiar? It should, because I just described objects and methods. And that is the hallmark of a good abstraction: it lets you build up other abstractions on top.
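Here is a rough sketch of that idea in Python: two closures defined inside one larger function, both working on the same hidden piece of state. The bank account and the names make_account, deposit and withdraw are hypothetical, chosen only to show the pattern.

```python
def make_account(balance):
    # 'balance' lives in make_account's environment and is shared by both
    # closures below. Nothing outside can read or change it directly.
    def deposit(amount):
        nonlocal balance  # rebind the shared variable rather than a new local
        balance += amount
        return balance

    def withdraw(amount):
        nonlocal balance
        if amount > balance:
            raise ValueError("insufficient funds")
        balance -= amount
        return balance

    return deposit, withdraw


deposit, withdraw = make_account(100)
deposit(50)   # the shared balance is now 150
withdraw(30)  # the shared balance is now 120
```

Neither closure takes the balance as an argument; each one just refers to its outer environment, which is essentially what a method does with the fields of its object.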

Abstractions are in general a good thing. But unless you think through your abstractions, they can be bad. A leaky abstraction is one that doesn’t quite get it right. The underlying layers somehow “leak through” what should be the abstraction’s watertight boundaries. Joel Spolsky has a very good article on leaky abstractions that’s a must-read if you want to learn more. And while we’re on the topic of abstraction: too much can be a bad thing. I wrote a Python program two summers ago to experiment with L-systems. Last summer I tried rewriting it such that everything in the system was an instance of some class. Everything was supposed to go through methods and abstraction boundaries. I never finished. This summer I toned it down a little and got a working version in about a week. Yes, this is the classic second-system effect, but it also shows that sometimes abstractions will just get in the way and force you to jump through hoops.

In conclusion: abstractions are good if used wisely. Closures are one such powerful abstraction. Dream safe.

Release schedules and version numbers

I just finished a major rewrite and overhaul of my long-term research project and pushed it out to other students who are working with me on it. In the process of the redo I rewrote large parts of it to be simpler code and added a few important features. I also cleaned up the code organization (everything is neatly divided into directories instead of being spread throughout the top level), added comments and rewrote the documentation to actually describe what the program does and how to use it. But it wasn’t just a pure rewrite and refactoring. I added at least one important new feature, added a completely new user interaction mode and changed the code architecture to explicitly support multiple interfaces. But the thing is that even though I’ve “shipped” it, it’s still not quite done.

There are significant parts missing. The unit testing is very, very scant. There is almost no error handling. The previous version had a GUI which I need to port to the new API/architecture. I also want to write one more interaction mode as a proof of concept that it can support multiple, different modes. The documentation needs to be converted to HTML and there are some utility functions that would be helpful to have. In short, there’s a lot that needs to be done. So my question is, what version of my code is this?

I started a rewrite of this last summer as well but never finished: a classic casualty of the second-system effect. For a while I considered calling this version 3.0, counting the unfinished copy as 2.0. But I decided that was rather silly and so I’ve actually called it 2.0. Though it’s certainly a major, major change from the last version, in some ways it’s still broken and unfinished. Is it a beta? Or a release candidate? I suppose that’s a better description. Except the additions that I want to make are more than what it takes to move it from a beta to a full release. The GUI would definitely be a point release.

In many ways the debate is purely academic and kinda pointless. As I’ve written before, software is always beta. However, releasing major and minor “versions” of software is a popular activity. In some ways it’s helpful to the user. You can tell when something’s changed significantly and when you need to upgrade. In an age where you had to physically sell software, that was a good thing to know. However, the rise of web-based software has changed that to a large extent. If you’ve been using Gmail for a while, you’ll know that it has a history of small, regular, atomic improvements over time. And it’s not just Gmail; it’s most of Google’s online services. Sometimes there are major facelifts (like Google Reader a few years ago), but by and large this gradual improvement works well. Google Chrome also uses this model. Chrome is officially past version 5 now, but thanks to its built-in auto-update mechanism you don’t need to care (and I suspect most people don’t). Rolling releases are clearly acceptable and may just be the way software updates are going to go in the future. Of course, if you’re charging for your code you’re going to have some sort of paywall, so no, manual software updates probably won’t go away entirely.

Coming back to my original question, what version did I just release? 2.0? 2.0 beta 1? 1.9.5? Honestly I don’t really care. Part of my lack of interest stems from the fact that Git makes branching and merging so easy. It’s hard to care about version numbers and releases when your code is in the hands of a system that makes it so easy to spin off feature branches and then merge them back in when they’re ready. If I worked in a fully Git-based team I’d just have everyone running daily merges so that everyone automatically got the new features. In that case I wouldn’t have waited to release. The big new feature would have been pushed a week ago, the reorganization and cleanup after that and then the documentation yesterday. I’d also be sending out the later updates and additions one at a time once they were done. Everyone else uses SVN, but there might still be a way to do it.
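For the curious, here is a small sketch of how those three candidate labels would sort if you did care. It is only an illustration: the stage names and their ranking in STAGE_RANK are my own assumption, not any official versioning scheme, and version_key is a made-up helper.

```python
import re

# Assumed ordering of pre-release stages; no stage at all means a final release.
STAGE_RANK = {"alpha": 0, "beta": 1, "rc": 2, None: 3}

# Matches labels such as "1.9.5", "2.0" or "2.0 beta 1".
LABEL_PATTERN = re.compile(r"(\d+(?:\.\d+)*)(?:\s+(alpha|beta|rc)\s*(\d+)?)?")


def version_key(label):
    """Turn a version label into a tuple that sorts in release order."""
    match = LABEL_PATTERN.fullmatch(label.strip())
    if match is None:
        raise ValueError(f"unrecognized version label: {label!r}")
    numbers, stage, stage_number = match.groups()
    parts = tuple(int(part) for part in numbers.split("."))
    # Compare the numeric parts first, then the stage, then the stage counter.
    return (parts, STAGE_RANK[stage], int(stage_number or 0))


labels = ["2.0", "1.9.5", "2.0 beta 1"]
ordered = sorted(labels, key=version_key)
print(ordered)
```

Under these assumptions 1.9.5 comes first, then 2.0 beta 1, then 2.0, which matches the intuition that a beta sits just before the release it is named after.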

In conclusion: rolling releases are awesome. Users don’t have to worry about manually updating and automagically get new features when they’re available. Developers using a good version control system can be up-to-date with everyone else’s code. This is especially important if you’re writing developer tools (which I am): the faster and easier you can get your updates to the people making the end product the faster the end product gets developed.

PS. If you’re wondering what exactly it is I’m making, more on that later. There’s a possibility of a source release after I talk to my professor about it.