Please pay for your software

I’ve been a committed user of free and open source software (FOSS) for the better part of the last three years. In that time I haven’t paid for a single piece of software. In fact, I honestly can’t remember the last time that I actually paid for software. I am sometimes amazed by the amount of fully functioning software that I get for the great price of free. However, at this time of year, when festive cheer is at its height, I think it’s about time we broke out our wallets and actually paid for the ‘free’ software that we do use.

Being the starving college student that I am, I want to make it clear that I do not consider spending money on software lightly. I wouldn’t ever fork over the hundreds of dollars needed for Windows or Office or any of the Adobe products. At the same time I am a software developer and though I love creating software, I would like to be able to make a better than decent living from it someday. Being a computer science student, I also understand full well that writing good software is hard: it’s mentally draining and there’s only so much caffeine-powered coding you can do in a day before your productivity goes south. Like spending money on software, it’s a commitment not to be taken lightly. And you can’t conceivably invest so much effort into something so demanding unless you truly love what you’re doing. All the software I’m using now, the Linux kernel and the GNU utilities, the Emacs editor and the Firefox browser, collectively represent thousands, maybe even millions of man hours of programming time. All the programmers that have invested their time, effort (and probably money) into creating these great tools for everyone to use surely deserve compensation. At the very least, we have a commitment to help sustain the infrastructure that supports the FOSS movement. Server space and bandwidth aren’t exactly free and with the number of downloads continually growing, there will be increasing costs to pay to keep the movement going.

If you’re immediately put off by the thought of paying for software, don’t be; it’s not as painful as it sounds. The good news is that with free software you get to set the price. Even for a student like me, $50 or $100 a year is a very reasonable price to pay. If you’re a professional developer using free tools to write code for pay, consider a monthly contribution of $10 – $20 alternating between your commonly used FOSS projects. The bargain you’re getting is amazing. If a corporation actually charged for all the tools out there, we’d easily be having to fork over hundreds of dollars a year.

After reading two posts on Coding Horror, I’ve come to the conclusion that it is only logical to donate a reasonable amount to help support the free software we use. Jeff Atwood’s posts focus on paying for cheap, high quality software as opposed to pirating it. His reasoning is that if you don’t pay the bare minimum there soon won’t be anyone interested in making good software at low prices. The exact same logic holds for FOSS. Sure, there will always be people like Linus Torvalds releasing their private projects for others to use, and there will always be people like Richard Stallman driven by pure ideology. But in the end, unless people like you and me decide to make a contribution, hobby projects will stay hobby projects and not reach the levels of quality and reliability that we’ve come to expect. So the next time you’re out buying presents for your loved ones, take a minute to think about all the people who have made your life as a programmer easier (and maybe helped you earn the money you’re now spending). And right now, when you finish reading this, pull out your credit card and donate some money to the open source project whose software you value the most.

This year I’ll be donating $50 to Arch Linux. It’s the distro I’ve been using for the last two years and I appreciate both the technical quality and the lively community.


Books for intermediate computer science students

So you’ve survived the first few programming courses, you know why bubble sort’s a bad idea and you can write tree traversal algorithms in your sleep. The question that’s eating you as the new year looms is: what’s next? There is a proliferation of books out there meant for beginning programmers and quite a few for the experts, but in my experience there are fewer resources for people who are half-way up the ladder and want to know how to climb the next few rungs. Luckily there are a number of books out there that a sufficiently motivated student can use to learn more advanced techniques.

Before I jump in, it’s worth taking some time to clarify what exactly intermediate is. For starters, I’m assuming that you know at least 3-4 different programming languages including at least one object-oriented, one functional and one utility language (e.g. Perl/Python/Ruby). I’m also assuming that you’ve worked in teams and have successfully completed at least one medium scale programming project which involved both program design and implementation. And since computer science is more than just programming, you should have at least a cursory knowledge of computing theory and algorithms, meaning that you know, at the least, what a Turing Machine is and what Big-O notation means. That being said, let’s get on with it:

Structure and Interpretation of Computer Programs

Yes, I know MIT uses this book for its intro-level programming course, but it quite clearly states in one of the prefaces that most students taking that course already have some programming experience. It’s certainly a fine book, but I do think that novice programmers won’t quite understand the full power of some of the concepts presented. However, having spent some time in the trenches, you’re likely to have a better understanding of what sort of things can make your life easier. This book will teach you a lot about abstractions and algorithms and once you’ve experienced the austere simplicity and inherent power of Scheme, you’ll never think the same way again.

On Lisp

Even though the book is supposed to be about “Advanced Techniques for Common Lisp”, the first few chapters are devoted to a rather basic review of functional programming concepts. I’m currently working my way through this book and I think it’s fair game for anyone who has had exposure to functional programming in general and some dialect of Lisp in particular. Be warned, however, that this book is quite topic specific. It makes no secret of the fact that many of the ideas that are explored will be useless, or at least very hard to implement, without the powerful framework that Lisp gives you. Unless you plan on actually using Lisp for a substantial amount of your work in the future, this book might not be worth the time investment.

Design Patterns

At the other end of the generality spectrum is this classic. Its authors have gained somewhat legendary status in the software engineering community as the Gang of Four and the book very well deserves its reputation as a must-read for serious software engineers. If you’re building production software and use any form of object orientation, you’ll soon be encountering some of the patterns described here. The book’s purpose is to describe general patterns that continually occur in software. It reads more like a catalog or cookbook than a standard textbook, but that adds to its appeal. Each pattern can be studied more or less separately and references to supporting patterns are made clear. There is also a considerable amount of sample code in Smalltalk and C++. This isn’t exactly a book that’s made to be read cover to cover and you’re likely to get bored unless you have a problem in mind that you’re looking to solve. The best way to get the most out of this book is to treat it as a reference and another tool in your toolkit.

Code Complete 2

Another general software engineering book that will make you a better programmer for the rest of your days. Code Complete is hard to describe in a few words and the best way to get a feel for what it is is to read the first few chapters. Code Complete focuses on the actual writing of code. It deals with issues that arise as you and your team sit down to actually implement the design. If you have any experience at all working with a team on a large project, you’ll understand a lot of the issues that are dealt with here. This book isn’t about fancy optimizations or agile team management techniques; it’s about actually sitting down and writing your code. A must-read for anyone who wants to build better software (which, I hope, is everyone who has ever written software).

Compilers: Principles, Techniques and Tools

You may only have a passing interest in compilers and may not be looking to become an expert in programming languages, but learning about compiler technology will help you in numerous other ways as well. A word of warning: this book is a classic but it also edges into the ‘advanced’ domain. I’ve been reading this book in earnest since I became interested in programming languages and I still have a long way to go. This book will force you to develop along multiple fronts: algorithms, data structures, even interface design (by way of designing usable languages). And as Steve Yegge pointed out, starting to write a compiler is not for the faint of heart, because there is no end to it (though that’s something for another post). Read this book only if you are seriously committed to someday becoming an expert programmer.

The Mythical Man Month

It’s a bit dated and a lot of the examples might seem archaic, but the underlying message is clear even after all these years. This book is a must-read if you’re ever in a position to lead a team (and chances are, at some point you will be). If there is any book that I would say is a must for a software engineering curriculum, this is it. It can be tempting to skip over the more technical parts, but please don’t do so. Read this book cover to cover and then read it again a few weeks later.


The books I’ve listed above won’t make you an expert programmer, but they will start you on the way and take you a fair distance along. The truth is that it’s much harder to go from intermediate to expert than to go from beginner to intermediate (though I suppose that’s true of all disciplines). I’m looking for a good book on algorithms to add to the above list, but I haven’t come across one that is at the proper level yet. I consider Knuth’s masterpiece to be at a somewhat higher level than what I’m prepared for at the moment. Any suggestions would be welcome.

Going on an Internet diet

Living on a college campus for the past year has meant that I’ve become used to having an always-on high speed internet connection. It’s certainly very convenient to have the internet available at a moment’s notice. Furthermore, campus-wide wireless means that I’m not limited to working at my desk. On the flip side though, it’s a challenge to make sure that I stay on the internet only as long as I need (or want) to. The internet, for all its uses (and perhaps because of them), can be very addictive.

I’m leaving for India today and I’ll be away from an internet connection for at least a day, perhaps two. Though I do have a decent internet connection at home (256Kbps DSL), it doesn’t really compare to what I’m used to. And there’s certainly no wireless. Last time I was home, this situation was sometimes frustrating. However this time, I’ve decided to do a little experiment. I’ll be home for just over a month and I’m going to try to keep my internet usage to under an hour a day. ‘Internet usage’ in this case means email, browsing, reading my blogs and feeds and checking Facebook. I can’t feasibly cram writing posts along with other activities into an hour-long time slot. But since I plan on writing regularly, I’ve decided that the best thing to do would be to write my posts offline and then copy/paste them into WordPress. Any time I spend looking up links will count towards the one hour quota though.

There are a number of things that I’m looking to accomplish with this diet. Firstly, the Internet is a great information source, but the signal to noise ratio can be dangerously low at times. I’m hoping that under time pressure I’ll learn to be more discerning about what I choose to pay attention to and what I don’t look at. I hope that I’ll be able to use the extra time to do something productive. I haven’t read as much over the last semester as I should have and I think that this experiment will be a good way to catch up and maybe make a new habit. At the same time, I’m going to be looking closely at whether or not having the internet affects other computer tasks that don’t need connectivity. In particular I’m looking to make substantial headway on my research project which will involve a lot of coding. I’m familiar enough with Python that I don’t need to constantly look up the reference docs, so it’ll be interesting to see if not being online all the time lets me write code faster (or slower).

There are some details that I need to work out. For example, if I get into a conversation with a friend do I need to disconnect at the one hour mark or can I keep talking if I don’t have anything better to do? I’m also certain that I will occasionally need to look up Python documentation and I’m not sure whether or not that should count towards my internet usage. I’ll be starting the diet from the start of next week after I’ve reached home and had a chance to recover from the trip. I should have the above questions resolved in a week and after that it’ll be up to me to actually enforce my diet. Whatever happens, I should have some interesting conclusions at the end of it.

Creating extensible programs is hard

As part of my research work I’m building a program that needs to have a pluggable visualizer component. I would like users to be able to create and use their own visualization components so that the main output from the program can be viewed in a variety of ways: simple images, 3D graphics, maybe even sounds. My program is in Python, but I would rather not limit my users to having to write Python visualization code. Ideally, they should be able to use any language or toolkit that they prefer. The easiest way to do this (as far as I can see) is to have the visualizers be completely separate standalone programs. The original program would save its output as a simple text file and the visualizer would then be responsible for reading the file and performing its actions. However, at the same time I would like to make it easy to write Python visualizers that wouldn’t have to supply their own file reading operations. These goals mean that I need to design and implement a stable framework that can handle all this extensibility.
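To make the second goal concrete, here is a minimal sketch of how the framework could do the file reading on behalf of Python visualizers. Everything in it is an assumption made for illustration: the names `read_output`, `BaseVisualizer`, `show` and `show_file` are hypothetical, and the output format is assumed to be one record per line.

```python
from pathlib import Path


def read_output(path):
    """Return the non-empty lines of the program's text output (assumed format)."""
    text = Path(path).read_text(encoding="utf-8")
    return [line for line in text.splitlines() if line.strip()]


class BaseVisualizer:
    """Hypothetical base class: subclasses only implement show()."""

    def show(self, lines):
        raise NotImplementedError("subclasses must implement show()")

    def show_file(self, path):
        # The framework handles the file reading, not the visualizer author.
        self.show(read_output(path))


class LineCounter(BaseVisualizer):
    """A trivial example visualizer that just reports how many records it got."""

    def show(self, lines):
        print(f"{len(lines)} records")
```

A standalone visualizer written in another language would read the same text file directly, so both kinds of visualizer see identical data.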

I only really started working on this last night, but in that time I’ve realized that this is harder than it sounds. Here are some of the things that I’ve figured out that I have to do:

  1. Determine whether the visualizer is a Python module or a standalone program
  2. If it’s a standalone program, save the output as a file and then call the program with the output file as a parameter
  3. If it’s a Python module, there should be a class (or a number of classes) that corresponds to the visualizers.
  4. The visualizer objects should have a clean API that the main program should be able to use.
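
The steps above could be sketched roughly as follows. This is only an illustration under assumptions of my own: a Python visualizer is taken to be an importable module exposing a class named `Visualizer` with a `show(lines)` method, and a standalone visualizer is taken to be any executable on the PATH that accepts the output file as its only argument. The function names are hypothetical.

```python
import importlib
import os
import shutil
import subprocess
import tempfile


def load_python_visualizer(name):
    """Return a visualizer object if `name` is an importable module, else None."""
    try:
        module = importlib.import_module(name)
    except (ImportError, TypeError, ValueError):
        # Not a module name we can import; the caller will try a program.
        return None
    # Assumed convention: the module exposes a class called `Visualizer`.
    cls = getattr(module, "Visualizer", None)
    return cls() if cls is not None else None


def run_visualizer(name, output_lines):
    """Dispatch output to a Python module or to a standalone program."""
    visualizer = load_python_visualizer(name)
    if visualizer is not None:
        visualizer.show(output_lines)
        return "module"

    if shutil.which(name) is None:
        raise ValueError(f"{name!r} is neither a Python module nor a program")

    # Save the output as a simple text file, one record per line.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as handle:
        handle.write("\n".join(output_lines))
        path = handle.name
    try:
        # Call the program with the output file as its parameter.
        subprocess.run([name, path], check=True)
    finally:
        os.unlink(path)
    return "program"
```

Trying the import first and falling back to a program keeps the user-facing option down to a single name, at the cost of some ambiguity if a module and a program share that name.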

While none of these tasks are impossible or require advanced computer science knowledge, they do require a considerable amount of care and planning. Firstly, the mechanism to detect whether the visualizer is a Python module should be robust and accept user input, which means that there has to be error checking and recovery. There also needs to be a stable interface between the main program and the modules that are loaded. There should be clear communication between the parts, but the modules should also not be able to interfere with the main program.

Allowing third party (or even second party) code to run inside your framework is not something to be considered lightly. Malicious, or even sloppily written, code can have very dangerous effects on your own code. In my case, I’ll be directly translating my program’s output to API calls on the Python visualizer objects. Calling non-existent methods would throw exceptions and at the very least I need to make sure these exceptions are caught. I also need to make decisions regarding just how much information the visualizers should get. Luckily for me, I won’t be allowing the modules to send back any information, so that makes my job easier.
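
As a rough sketch of what catching those exceptions might look like, the helper below looks up a method by name, refuses to call anything that is missing or not callable, and logs (rather than propagates) any error raised by the visualizer. The name `safe_call` and the logger name are hypothetical. Note that this only guards against mistakes; it does nothing to contain deliberately malicious code.

```python
import logging

logger = logging.getLogger("visualizer")


def safe_call(visualizer, method_name, *args):
    """Call visualizer.method_name(*args); return True on success, False otherwise."""
    method = getattr(visualizer, method_name, None)
    if not callable(method):
        logger.warning("visualizer has no method %r; skipping", method_name)
        return False
    try:
        # The return value is discarded: visualizers receive information
        # but do not send any back to the main program.
        method(*args)
    except Exception:
        logger.exception("visualizer method %r failed", method_name)
        return False
    return True
```

Returning a simple success flag lets the main program decide whether to keep going or to disable a misbehaving visualizer.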

Writing an extensible program like this one is an interesting experience. I’ve been interested in software engineering for quite a while now and though I’ve written large programs before, this is the first time that I’ve made one specifically geared towards extensibility. Extensibility brings to the forefront a number of issues that other types of development can sweep under the carpet. Modularity, security, a clean API, interface design: all of these are necessary for making a properly extensible system. Furthermore, having an extensible system means that you are never quite sure what is going to happen or how your software will be used. This being the first time that I’m making such a system, I’m going to be very careful. I’m putting more time into the design phase because I don’t want to do a rewrite partway through the project. Let’s hope that this experience proves to be a good one.

Software is Forever Beta

Word on the web is that Google just pulled the Beta label off its Chrome browser. As Google Operating System has noted, it’s not that Chrome is fully ready for mass consumption but rather that it’s just good enough to enter the fray with Firefox and Internet Explorer and that Google is in the process of sealing bundling deals with some OEMs. There is still work to be done, there are still bugs and there are important features in the works (including an extension system). But the event raises a question that I don’t think has ever been convincingly answered: when is the right time to take the Beta label off a piece of software?

Wikipedia says that a beta version of a software product is one that has been released to a limited number of users for testing and feedback, mostly to discover potential bugs. Though this definition is mostly accurate, it’s certainly not the complete definition. Take for example Gmail which is open to anyone and everyone, isn’t really out there for testing, but is still labeled beta years after it was first released. You could say that in some ways Gmail changed the software culture by being ‘forever beta’. On the other hand Apple and Microsoft regularly send beta versions of their new products to developers expressly for the purpose of testing.

Corporate branding aside, everyone probably agrees that a piece of software is ready for public release if it generally does what it claims to do and doesn’t have any show-stopping bugs. Unfortunately this definition isn’t as clear cut as it seems. It’s hard to cover all the use cases of a software product until it is out in the real world being actively used. After all, rigorous testing can only prove the presence of bugs, not their absence. It’s hard to tell what a showstopping bug is until the show is well under way. Also, going by the definition, should the early versions of Windows have been labeled beta versions because they crashed so much? With exam week I’ve seen college library iMacs choke and grind to a halt (spinning beachball of doom) as student after student piles on their resource intensive multimedia. Is it fair to call these systems beta because they crack under intense pressure?

Perhaps the truth is that any non-trivial piece of software is destined to always be in the beta stage. The state of software engineering today means that it is practically impossible to guarantee that software is bug-free or will not crash fatally if pushed hard enough. As any software developer on a decent sized project knows, there’s always that one obscure bug that needs to be fixed, but fixing it potentially introduces a few more. However, that being said, the reality of life is that you still have to draw the line somewhere and actually release your software at some point. There’s no hard and fast rule that says when your software is ready for public use. It generally depends on a number of factors: what does your software do? Who is your target audience? How often will you release updates and how long will you support a major release? Obviously the cut-off point for a game for grade-schoolers is very different from that for air traffic control software. Often it’s a complicated mix of market and customer demands, the state of your project and the abilities of your developers.

But Gmail did more than bring about the concept of ‘forever beta’. It introduced the much more powerful idea that if you don’t actually ‘ship’ your software, but just run it off your own servers, the release schedule can be much less demanding and much more conducive to innovation. Contrast Windows Vista, with its delayed releases, cut features, hardware issues and general negative reaction after release, with Gmail and its slow but continuous rollout of new features. Looking at this situation shows that Gmail can afford to be forever beta whereas Windows (or OS X for that matter) simply cannot. The centralized nature of online services means that Google doesn’t need to have a rigid schedule with all or nothing release dates. It’s perfectly alright to release a bare-bones product and then gradually add new features. Since Google automatically does all updates, early adopters don’t have to worry about upgrading on their own later. People can jump on the bandwagon at any time and if it’s free, more people will do so earlier, in turn generating valuable feedback. It also means that features or services that are disliked can be cut off (Google Answers and Browser Sync). That in turn means that you don’t have to waste valuable developer time and effort in places where they won’t pay off.

In many ways the Internet has allowed developers to embrace the ‘forever beta’ nature of software instead of fighting it. However even if you’re not a web developer, you can still take measures to prevent being burned by the endless cycle of test-release-bugfix-test. It’s important to understand that your software will change in the future and might grow in unexpected directions. All developers can be saved a lot of hardship by taking this fact into account. Software should be made as modular as possible so that new features can be added or old ones taken out without need for drastic rewrites. Extensive testing before release can catch and stop a large number of possible bugs. Use dependency injection to make your software easier to test (more on that in a later post). Most importantly however, listen to your users. Let your customers guide the development of your products and don’t be afraid to cut back on features if that is what will make your software better. After all, it isn’t important what you label your software; what matters is what your users think of it. People will use Gmail even if it stays beta forever because it has already proved itself as a reliable, efficient and innovative product. Make sure the same can be said of your software.
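
For readers who haven’t met dependency injection before, here is a very small sketch of the idea in Python. The class and parameter names (`FeedbackCollector`, `store`) are made up for illustration. The point is simply that the object is handed its collaborator instead of creating it, so a test can pass in a stand-in.

```python
class FeedbackCollector:
    """Stores user feedback through whatever `store` callable it is given."""

    def __init__(self, store):
        # The storage mechanism is injected rather than hard-coded, so
        # production code can pass a database writer and tests can pass a fake.
        self._store = store

    def submit(self, user, message):
        """Store a cleaned-up message; return False if there is nothing to store."""
        message = message.strip()
        if not message:
            return False
        self._store({"user": user, "message": message})
        return True


# In a test, a plain list's append method stands in for the real storage.
saved = []
collector = FeedbackCollector(store=saved.append)
collector.submit("alice", "  Please keep the labels feature  ")
```

Because the collector never knows what `store` really is, the same class can be exercised in a test without a server or a database behind it.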