Archive for March, 2008

The Role of Testing in Programming

It’s often said (and only half-jokingly) that if we built bridges and buildings the way we build software, we’d be living in the stone age. Sure, a lot of software may be buggy, ugly or even downright useless, but that doesn’t mean that something can’t be done to fix it. One of the ways in which a developer can produce more reliable code is by rigorous testing. For many developers, however, testing goes only as far as making sure that the program actually runs and gives reasonable output with a small number of typical inputs. All that proves is that your program isn’t falling apart at the seams. It doesn’t prove that your program will work for the majority of use cases, it doesn’t prove that the program will work in unusual situations and it certainly does not prove a complete absence of bugs.

So how do you prove that your program will work correctly at least in the majority of cases? It needs to be tested hard, rigorously, brutally, unfairly. This doesn’t mean that you should test it beyond the limits of what is feasible or reasonable. Your text-editing program doesn’t have to be able to open your email or edit half the HTML files on the web at the same time (unless you’re writing an Emacs super-clone), but it does have to handle multiple large files open at the same time, color syntax properly and not choke on large copy/paste operations. Deciding what is necessary to test and what can be deemed outside your program’s operating parameters can be a bit tricky at times, but it leads into what may be the more important role of testing in programming: testing forces you to think about your program design.

Fans of unit-testing often stress that unit-tests should be written first, before any of the actual program is written. This is not an easy concept for most programmers to grasp, and even fewer actually put it into practice. How are you supposed to test something when it doesn’t even exist yet? The key idea is that you’re not so much testing your code as testing what your code should do. To some extent, it involves taking on an outsider’s perspective: a user doesn’t care how the program is written, only whether it works. But if you’re writing tests for your own code, you also start thinking about how your program is structured. The idea of unit-testing is that you should test the smallest reasonable components before stepping up to larger portions of code. So you start thinking about which are the smallest parts of your code that need independent testing. Which parts can be safely folded into others and which ones have to be broken down? It also gives an idea of the interactions between the parts: what sort of data needs to be pushed around, how is that data affected and is there a chance that the data will be corrupted in all the moving about?

Of course, there is a down-side to this formal, test-first methodology: if you have a rapidly evolving project which has frequent design changes and restructuring, you’re going to end up with a bunch of failed tests just as frequently, some of which test things that don’t exist any more. If this happens too often, you might find yourself spending as much time updating your tests as you are updating your code. That’s never a good thing (of course, completely restructuring your project every week is probably not good either).

Even if a test-first methodology isn’t suitable for you (I don’t do it myself), you should employ formal unit-tests as much as possible. These tests should be automated, i.e. they should automatically generate inputs for your programs and check the outputs without human intervention. This might seem painfully obvious, but I’ve seen more than a few students write tests which are essentially manual verifications wrapped in code. This is doubly wasteful: not only do you not test anything which you couldn’t easily test yourself, you’ve also wasted time writing code for it.
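As a rough illustration of what "automated" means here, the sketch below generates its own inputs and checks the outputs with no human in the loop. The word_count function and the test names are made up for this example; only the unittest and random modules from the standard library are assumed.

```python
import random
import unittest


def word_count(text):
    """Count whitespace-separated words in a string (hypothetical function under test)."""
    return len(text.split())


class WordCountTest(unittest.TestCase):
    def test_empty_string_has_no_words(self):
        self.assertEqual(word_count(""), 0)

    def test_generated_inputs(self):
        # A fixed seed keeps any failure reproducible.
        rng = random.Random(2008)
        for _ in range(100):
            expected = rng.randint(0, 50)
            words = ["w%d" % i for i in range(expected)]
            # Separate the words with a random mix of whitespace.
            text = "".join(w + rng.choice([" ", "\t", "\n", "  "]) for w in words)
            self.assertEqual(word_count(text), expected)


def run_tests():
    """Run the suite programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(WordCountTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

The point is that the expected answer is computed alongside each generated input, so the test checks a hundred cases that nobody had to type in or inspect by eye.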

That brings us to the question of manual testing. Unit testing is certainly very handy and will probably make your code much more reliable than no testing at all, but it won’t find all the bugs and it’s almost impossible to come up with all possible test cases. The real trial of your software comes not from endless automated testing, but from actual users using it out in the field. Believe me, there is no substitute for using your programs in the real world. And that doesn’t mean just using the program yourself. As a developer, it’s very likely that you have your code’s limitations tucked away in your subconscious, blinding you to problems which would be obvious to people who actually use it without knowing how it’s built. Getting real people to use your program in the real world is the best possible test even though it may be more time-consuming and less clear-cut than automated unit-testing. And there are some things (like UI effectiveness) which simply can’t be auto-tested.

Ultimately the role of testing boils down to the following major points. They are not universal and they won’t cover all possible cases, but they might go a long way to making your code more robust.

  1. Test often and rigorously.
  2. Use a combination of automated testing and user testing.
  3. Designing and testing go together: design for effective testing, test to find flaws in the design.
  4. Don’t just fix the bugs your tests reveal: think about how they can affect the rest of the program and how your program will be affected by the actual fixes themselves.
  5. Remember that testing will probably not eliminate all possible bugs, but that doesn’t mean that you should not test, or that you should test forever.

Pair Programming: Pros and Cons

My computer science course at college has had us pair programming for labs in the first half of the semester. Pair programming is a technique in which two programmers work on the same computer at the same time; however, only one of them does the actual coding while the other checks each line of code as it is written. Every once in a while the two switch places and keep on coding. Over the past two months I’ve come to notice first hand some pros and cons, both from my own little team and the others in the class:

Pros:

  1. Two points of view: When you’re stuck in a particularly nasty patch of code, trying hard to work your way through, having a fresh point of view can come in very handy. However, since both programmers are side-by-side, some care has to be taken so that both aren’t always thinking the same thing.
  2. Silly mistakes are quickly caught: Simple mistakes such as syntax errors or repeated variable names can be easily caught and fixed right then and there. This might not seem like much, but it can cut down debug time later and prevent small irritating bugs.
  3. Better concentration: This isn’t particularly scientific, but it seems to me that two people working together have less of a tendency to procrastinate or get distracted. This results in shorter development times too.
  4. Combining knowledge: Computer science is a vast field and my course is rather fast-paced. It’s hard for any single student to have a comprehensive knowledge of what’s been going on in class, so two of us working together means that we can pool our knowledge. This has the added benefit of learning ‘on the job’ and not having to dive into our textbook or go to the instructor every time we have a doubt.
  5. It’s a good training ground for large software projects: Very little software is written by one person nowadays. Teams ranging from small to big are not just the standard, they are a necessity. Pair programming teaches you a lot of the soft skills you’ll need: tolerance, respect, understanding and responsibility.

Cons:

  1. Skill disparity: This is the number one potential problem. If the partners are of completely different skill levels, you might have one programmer doing all the work or constantly tutoring the other. This is ok if you want to set up a teacher-student relationship or are introducing a new programmer to your system, but if not, it can defeat the entire purpose of pair-programming.
  2. Not actually getting the work done: Though I’ve personally experienced increased concentration when pair programming, for some people pair programming sessions can easily degenerate into socializing sessions. There are also some people who don’t work well when there is someone next to them examining their work; these people will not benefit much either.
  3. Developer egos: This is something that is not likely to happen in a classroom setting, but in more experienced teams, each programmer might try to push their own ideas of how things should be done (both of which may be perfectly valid). This sort of conflict can be downright disastrous.

So is it worth it? Some research suggests that it is, but there are enough variables to make the answer far from clear cut. If you can get a good partner who you’re comfortable with and who shares similar work patterns, it might be very productive. On the other hand, avoid bad partners at all costs. That said, teamwork is an essential skill to learn, though you must be a competent programmer (or willing to learn) yourself before jumping onto a team. I think that you should try out pair programming every now and then if you can find appropriate partners. Using it in a classroom lab setting is also a good idea, just as long as there are enough individual projects to make sure that one person isn’t doing all the work. But no matter what, it’s important to remember that pair programming is just another tool and will not compensate for deficiencies such as incompetent programmers or poor working conditions and tools.

A time for planning and a time for coding

Over Spring break I’ve been reading the older archives of one of my favorite blogs: Coding Horror. Though a lot of the posts are worth reading for anyone involved in computer programming, one that I feel is especially worth mentioning is Development is Inherently Wicked. The article talks about how it is almost impossible to fully plan out a complete software project and then have that plan hold throughout the process of creating the software. Though I’ve personally never been involved in a large scale software project, I’ve come to realize how true this is even with small projects.

For the past week I’ve been busily writing a Python package to allow interaction with a Hemisson robot. My initial planning was very simple: I would have only a very simple function-based interface.  The actual hardware interaction would be separate from the part the user used and there would be a simple configuration file which the user could edit. Was this enough planning with which to start writing code? I was afraid it might not be, but in the end it turned out to be a good thing. I managed to write the hardware interface in just a few hours and in just about 80 lines of code. It would have taken far less time, but there was some ambiguity in the Hemisson documentation that made me fiddle around directly with the robot to get things right.

It was when I started to build the configuration system that I had to deal with the other side of the coin. It was very tempting to jump in and quickly come up with some sort of config file syntax and then write up a parsing routine for it. It would have worked and it probably would not have been much work. But I decided to fight the temptation to hack away and instead read up on Python tools for doing this sort of thing. After reading about it online and posting to the Python-tutor mailing list I came upon a simpler, more elegant solution: the config file would be a Python file itself and I would make use of Python’s intrinsic ability to import modules stored in external files to access and read it. A little bit of planning thus eliminated the need for a rather large amount of complexity and code to handle that complexity. I’ve just finished another round of coding: a part of the actual module that the user will be using. I’ve managed to create simple functions for moving the robot and for accessing its sensors. However, since the robot does not include any hardware to keep track of its position, writing more complex functions such as turning through a particular angle will be significantly more challenging. Once again, I need some planning and research.
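To make the config-file-as-Python-module idea concrete, here is a minimal sketch. The setting names (PORT, BAUD_RATE), the file name robot_config.py and the load_config helper are assumptions for illustration, not the contents of the actual package. It uses importlib from current Python versions rather than whatever the original code did.

```python
import importlib.util
import os
import tempfile


def load_config(path):
    """Import the Python file at `path` and return it as a module object."""
    spec = importlib.util.spec_from_file_location("robot_config", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def demo():
    # Hypothetical settings; the real config file is not shown in the post.
    source = 'PORT = "/dev/ttyUSB0"\nBAUD_RATE = 115200\n'
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "robot_config.py")
        with open(path, "w") as handle:
            handle.write(source)
        config = load_config(path)
    # Settings are plain module attributes, so no parser is needed.
    return config.PORT, config.BAUD_RATE


if __name__ == "__main__":
    print(demo())
```

The trade-off is that importing a config file executes whatever code it contains, so this approach only makes sense when the file is written by someone you trust (here, the robot’s own user).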

The lesson I’ve learned is that doing a good job of making software involves a fairly fine balance between making careful plans and just sitting down and pumping out some code. So how can you know which one is more important? The answer is two-fold. Firstly, it’s a mistake to think that planning and coding are separate actions. They are part of a continuous loop and a healthy software project must treat them as such. Secondly, there has to be a decent level of feedback from the code to the plans, and the plans have to make sure that bad code isn’t written just because it is easy and the first thing that comes to mind. Even while I was writing my hardware control routines I had to think about how I would deal with different hardware configurations and how to let the program parse and adapt to the configuration information. Like a lot of computer science, this is a good example of choosing the middle path: realizing that either extreme can lead to a badly implemented or even incomplete project and that staying close to the middle is the best way to go.

If you’re interested in learning more about this approach to programming projects, read through the entire article on Coding Horror and the book that it references.

Enter the ByteBaker

I’ve been blogging on and off since 2005, more off than on. As I’ve mentioned before, my spotty record is mostly due to the fact that I’ve never really had anything very compelling to put out on the Internet. But as I continue exploring computer technology and continue my formal studies in the field, I realize that I’m getting exposed to lots of different and interesting ideas and coming up with many of my own. Most of them are probably not very original, though perhaps people will still find them interesting. But irrespective of that, I know that a lot of these thoughts and experiences are things that I would like to keep on record and would like to share with other people out there.

With that in mind, I’m going to make a commitment to recording my thoughts and ideas over the next few years as well as keeping a record of my projects and experiments. I’ve registered a domain name and this blog now routes to it. I’m still running off WordPress.com and will be for some time, but I’m going to be posting everything under my new website: The ByteBaker.

Why call it The ByteBaker? Many people have differing opinions on what computer science really is. According to MIT professor Gerald Jay Sussman, computer science is not a science and is really not about computers. Over the last few years I’ve come to agree with him. Computer science in general and programming in particular seem to me to be quite similar to cooking: if you start mixing ingredients at random, you might come up with something that is edible, but it probably won’t taste very good. To make a good dish you need an understanding of your ingredients and utensils as well as a fair amount of improvisation and inspiration. Computer science is similar in that you need to understand the concepts and tools that are a part of your trade, and to actually produce something novel and interesting you need a healthy dose of imagination and daring. While cooks and bakers use various raw ingredients such as fruits, vegetables, spices, meats, etc., for computer scientists all our creations can be described in terms of bits and bytes: hence enter the ByteBaker (also alliteration is kinda cool).

So what does all this mean for you? Probably not a lot at the moment. You might want to update your bookmarks to point to The ByteBaker instead of Xtreme Computers, because in a few months I will be moving to a different hosting solution so that I can have more control over the WordPress installation. If you’re a feed subscriber, you don’t have to do anything at all: I’ll still be publishing at the old feed URL. So just sit back and enjoy the ride!

OpenEmbedded bug found and squashed

I’ll be using Python extensively for my robotics project with Gumstix. The goal is to come up with a set of modules that let the user create high-level behavior programs for the robots without worrying about how to control the hardware. Of course none of this is going to happen if I can’t actually get the Python interpreter to work on the Gumstix.

The basic Linux installation on the Gumstix doesn’t come with Python, so I had to put it on there manually. The Python packages for Gumstix are very modular, which allows installation of only the needed packages without anything extra. Instead of just installing the required packages on the Gumstix I decided to create a fresh root filesystem image, since I would be installing it on multiple Gumstix. What I really needed was the Pyserial package, which provides a nice Python interface to serial ports. That was the only package that I added to the build script (more on that later), hoping that it would pull in all the required dependencies. After I finished the build and reflashed the Gumstix using the wiki instructions, I booted up and started the Python interpreter. But to my horror, it wouldn’t load the Pyserial module. Apparently it needed an older implementation of various string functions which hadn’t been pulled in. So I repeated the process, adding the older module to the build script. Even then the Pyserial module wouldn’t load because it couldn’t find the struct module. By this time I was starting to get frustrated, because the struct module should have been pulled in with just a basic Python implementation. The struct module handles conversions between Python values and C structs represented as Python strings. This module not being found meant that somehow the entire OpenEmbedded Python system was fundamentally broken.
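For readers who haven’t met the struct module, the short sketch below shows the kind of conversion it performs. The three-field record layout is invented for illustration. Note that in current Python versions the packed data is a bytes object rather than a string.

```python
import struct

# "<" means little-endian with no padding; B = unsigned byte,
# h = signed 16-bit integer, I = unsigned 32-bit integer.
# This particular layout is a made-up example.
RECORD = struct.Struct("<BhI")

# Python values -> raw bytes laid out like a packed C struct.
packed = RECORD.pack(7, -300, 115200)

# Raw bytes -> tuple of Python values.
unpacked = RECORD.unpack(packed)

if __name__ == "__main__":
    print(RECORD.size, packed, unpacked)
```

A serial library needs exactly this sort of byte-level packing to talk to the operating system and to hardware, which is presumably why Pyserial would not load without it.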

After a bit of research with the help of Google, it turned out that there was an error in the python-2.5-manifest.inc file, which contains a listing of all the files that make up the base Python system. This file had an incorrect reference to the files that actually implement the struct module. This problem had occurred in the OpenEmbedded system around November last year and had been fixed, but somehow the fix had not found its way into the Gumstix version of OpenEmbedded. So I’ve dutifully filed a bug report and submitted a patch, and I’m hanging on to my corrected Python manifest while the patch is implemented (which will hopefully be soon).

Tracking down a bug in a system I hadn’t created and finding a fix for it was quite an interesting experience. On one hand it was quite frustrating because the problem wasn’t in something I had an intimate knowledge of. On the other hand it forced me to learn a lot about the OpenEmbedded build system and I also learned how to go about looking for documentation, all of which will come in handy as I keep working with Gumstix and the OpenEmbedded system. And last but not least, it led to my first ever patch submission for a large software project: not much of a contribution, but it certainly makes me feel good!

Setting up Kermit to work with the Gumstix

One of the most direct ways to talk to a Gumstix is via a serial port. Though a serial port is rather slow and is not the best for day to day operation, it comes in handy for initial setup and repair. It’s also useful for flashing the Gumstix and installing a custom kernel and filesystem. Since I’m running a Linux host machine, I have two options for connecting via serial: minicom or kermit. I started off using minicom because it was in the Arch Linux software repositories and because the configuration was very easy. However, the problem is that the Gumstix cannot receive minicom file transfers, which makes minicom useless for reflashing the Gumstix.

After compiling Kermit from source and starting it on the proper device, there are a number of options to be set before one can actually connect to the Gumstix. Doing this every time is extremely tedious and error-prone, so it’s easiest to create a simple script with the following:

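#!/usr/bin/kermit +
; Open the serial device the Gumstix is attached to
kermit -l /dev/ttyUSB0
set speed 115200
; Treat the link as reliable and use the fast file-transfer presets
set reliable
fast
; Ignore the carrier-detect signal and disable flow control
set carrier-watch off
set flow-control none
; Prefix all control characters and transfer files in binary mode
set prefixing all
set file type bin
; Packet lengths and window size for file transfers
set rec pack 4096
set send pack 4096
set window 5
; Enter terminal mode
connect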

I saved this script under the name konnect and placed it directly in my /bin directory. After marking it as executable with sudo chmod +x konnect, I could execute the script and have it open up Kermit with all the required configurations in place. If you want to use this script, you should change /dev/ttyUSB0 to whatever serial device your Gumstix is actually connected to. My laptop doesn’t have serial ports so I connect via a USB-serial adapter. You should also make sure that the device can be accessed by the user you run the script as, or else it will keep throwing up permission denied messages. If you don’t want to mess with file permissions, you can run it as root via su or sudo.
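If you would rather check the permissions up front than wait for Kermit to complain, a few lines of Python will do it. This is only a sketch: the can_use_device helper is a made-up name, and /dev/ttyUSB0 is just the device used above.

```python
import os
import sys


def can_use_device(path):
    """Return True if `path` exists and the current user may read and write it."""
    return os.path.exists(path) and os.access(path, os.R_OK | os.W_OK)


if __name__ == "__main__":
    # Device path as used in the script above; pass another one as an argument.
    device = sys.argv[1] if len(sys.argv) > 1 else "/dev/ttyUSB0"
    if can_use_device(device):
        print("%s is accessible" % device)
    else:
        print("%s is missing or not readable/writable by this user" % device)
```

Keep in mind that root passes this check for any existing file, so it is only informative when run as the user who will actually start Kermit.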

The Gumstix Game Plan

The last few weeks have been rather busy, but with at least two light weeks in front of me, it’s time to get started with something serious. My first project with the Gumstix will be to use them to drive the Hemisson robots. I had thought about creating some sort of a simple “Robot Control Language” and then having the Gumstix translate that into serial commands to drive the Hemisson. However, inventing a computer language, no matter how simple, isn’t an easy task and I would like to have my first project be something simpler.

The point of using a Gumstix to talk to the Hemisson is so that a user can write higher-level algorithms without worrying about details such as making the wheels turn. I also want to use a programming language that has a relatively small amount of syntactical baggage and where it is easy to use function libraries without endless recompiling and linking. Luckily I know just such a language: Python. Python’s modules make it easy to create, use and distribute useful code. Because it is interpreted, there is no compilation needed, which means that a user can easily write and debug programs on a host and then run them on the Gumstix without any change. Also important is the existence of the pySerial module, which allows simple access to a machine’s serial ports: something which will be invaluable to me.
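To give a feel for what talking to a robot over pySerial might look like, here is a rough sketch. The serial.Serial constructor and its write and readline methods are real pySerial calls, but everything else is an assumption for illustration: the "PING" command, the "\r\n" line terminator, the default device and baud rate, and the helper names are not taken from the Hemisson documentation.

```python
import io


def send_command(port, command):
    """Write one line-terminated command to `port` and return the reply line.

    `port` can be any object with write() and readline(), such as a
    pySerial Serial instance. The "\r\n" terminator is an assumption.
    """
    port.write((command + "\r\n").encode("ascii"))
    return port.readline().decode("ascii").strip()


def open_robot_port(device="/dev/ttyUSB0", baudrate=115200):
    """Open a real serial port with pySerial (hypothetical default settings)."""
    import serial  # third-party pySerial package, imported only when needed
    return serial.Serial(device, baudrate, timeout=1)


class FakePort:
    """In-memory stand-in for a serial port, for trying code without a robot."""

    def __init__(self, replies):
        self.written = []
        self._replies = io.BytesIO(replies)

    def write(self, data):
        self.written.append(data)
        return len(data)

    def readline(self):
        return self._replies.readline()


if __name__ == "__main__":
    port = FakePort(b"ok\r\n")
    print(send_command(port, "PING"))  # "PING" is a made-up command
```

Keeping send_command independent of the real port means the same code can be exercised on a laptop with FakePort and then pointed at open_robot_port() once the hardware is attached.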

Eventually performance might become an issue: the Gumstix is, after all, somewhat limited in terms of its hardware and users might want to implement complex algorithms to determine the behavior of the robots. It may become necessary to re-implement the control library in C for the added performance edge. However, since Python can easily interoperate with C functions, the Python programs won’t have to be rewritten (though some alterations may be needed). You could say that in some ways the Python module will be a prototype for a more powerful and efficient C implementation sometime in the future, but I can’t guarantee that at the moment.

My first step is to brush up on Python and get used to the pySerial module and see what functionality it offers. I’m not going to be using the Gumstix for a while, but rather communicating directly from my laptop to the Hemisson until I have at least a rough working implementation. But before I start working on a final implementation, I will be testing the serial connection between the Gumstix and the Hemisson and making sure that Pyserial can work with it as well. With all the preliminary steps completed I can start work on a robust, full-featured module. I’m not quite certain how much work the module should handle, i.e. should it just translate movement instructions from the Python code to serial or should it have algorithms to implement things like turning through angles or rotating? These are not decisions to be taken lightly because they will have a direct effect on how users will be writing their programs, and it’s something that I’ll be discussing with my professor. The best idea might be to implement some higher level functions but leave the underlying simpler functions open as well.
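One way to picture the "higher level functions on top of simpler ones" option is the sketch below. It is not the actual module: the Robot class, the set_speed callback, and the wheel speed and wheel base numbers are all hypothetical. Without position sensing, turning through an angle has to be done by timing. For a two-wheeled robot spinning in place with wheel speeds +v and -v and wheel base b, the angular speed is 2v / b, so the time for an angle θ (in radians) is θ · b / (2v).

```python
import math
import time


def turn_duration(angle_degrees, wheel_speed, wheel_base):
    """Seconds needed to spin in place through `angle_degrees`.

    wheel_speed is the speed of each wheel (metres per second) and
    wheel_base the distance between the wheels (metres).
    """
    if wheel_speed <= 0 or wheel_base <= 0:
        raise ValueError("wheel_speed and wheel_base must be positive")
    angular_speed = 2.0 * wheel_speed / wheel_base  # radians per second
    return math.radians(abs(angle_degrees)) / angular_speed


class Robot:
    """High-level helpers layered on one low-level set_speed(left, right) call."""

    def __init__(self, set_speed, sleep=time.sleep,
                 wheel_speed=0.05, wheel_base=0.10):  # made-up dimensions
        self.set_speed = set_speed  # low-level function stays available to users
        self.sleep = sleep
        self.wheel_speed = wheel_speed
        self.wheel_base = wheel_base

    def turn(self, angle_degrees):
        """Spin in place; positive angles turn counter-clockwise."""
        duration = turn_duration(angle_degrees, self.wheel_speed, self.wheel_base)
        direction = 1 if angle_degrees >= 0 else -1
        self.set_speed(-direction * self.wheel_speed, direction * self.wheel_speed)
        self.sleep(duration)
        self.set_speed(0.0, 0.0)
        return duration
```

A timed turn like this is only as accurate as the assumed wheel speed, so wheel slip and battery level will introduce errors. That is part of why the decision about what belongs in the module is not a trivial one.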

I’m not going to venture a timeline for the project, because I have other priorities which take precedence (classes for example). But I would like to have a working version by the end of the semester (beginning of May), progressing to something which is full-featured and reasonably bug-free by the end of summer (late August). With Spring Break starting in a week, I should be able to familiarize myself with both Pyserial and the Hemisson serial interface in two weeks’ time. After that it’s just a matter of proper planning and getting things working.

Meet the Hemissons

The Hemissons are small semi-autonomous robots made by a Swiss company called K-Team. They began as extra-curricular projects by students at the Swiss Institute of Technology and are now widely used by teachers and hobbyists. The Hemisson has a number of features which make it ideal for small scale robotics experiments.

Each Hemisson is a small mobile robot with two wheels, each driven by its own motor. It can sense obstacles by means of a total of 8 infra-red sensors, 6 on the sides and two on the bottom (which allow it to follow marks on the ground). The brains of the Hemisson is an 8-bit Microchip PIC 16F877 microcontroller running at 20 MHz which can be programmed with a simplified 35-instruction set. It is also packaged with enough RAM and flash memory to host a simple operating system that can effectively control the robot and execute complex behaviors in response to sensor inputs. The Hemissons also come with 6 switches which can be used to turn them on, switch from a running to a programming mode and execute a number of predefined behaviors (like moving in a circle or following a line).

What makes the Hemisson attractive to robot hobbyists is that you don’t have to stick to the simple set of predefined behaviors. Each Hemisson has a serial port which can be used to directly control the robot from a host computer by sending a set of commands which the HemiOS operating system then executes. There is also an extension connector which allows hook-up of I²C compliant peripherals which can also be accessed via the serial port. K-Team sells a number of sensors which come with their own microprocessors and software. In case you are feeling particularly adventurous it is possible to directly access the flash memory, thereby replacing the HemiOS operating system with your own concoction (though it seems that you need a special connector to be able to do this). Of course you can always alter the flash memory over a serial connection.

The Hemisson robots can be simulated with the Webots robot simulation package, which allows you to create a virtual world with a virtual robot, program it in C, C++ or Java and then set it loose to see what happens. Though you can cross-compile a program to make it run directly on the Hemisson, actually getting it to the Hemisson’s flash memory is slightly complicated (and not covered by warranty). But K-Team provides a graphical programming environment it calls Bot-Studio which lets you define a behavior for the Hemisson based on the concept of finite-state machines, and these FSMs can be downloaded to the robot over the serial connection.
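For anyone unfamiliar with finite-state machines, here is a tiny sketch of the idea in Python. It has nothing to do with how Bot-Studio stores its FSMs: the two states, the single obstacle input and the wheel speed numbers are all invented for illustration.

```python
# Transition table: (current state, obstacle seen?) -> next state.
TRANSITIONS = {
    ("forward", False): "forward",
    ("forward", True): "turn",
    ("turn", True): "turn",
    ("turn", False): "forward",
}

# What the robot does in each state, as made-up (left, right) wheel speeds.
ACTIONS = {
    "forward": (5, 5),
    "turn": (-5, 5),
}


def run(readings, state="forward"):
    """Feed a sequence of obstacle readings through the machine.

    Returns a list of (state, action) pairs, one per reading.
    """
    trace = []
    for obstacle_seen in readings:
        state = TRANSITIONS[(state, obstacle_seen)]
        trace.append((state, ACTIONS[state]))
    return trace


if __name__ == "__main__":
    for step in run([False, True, True, False]):
        print(step)
```

The whole behavior lives in two small tables, which is what makes this style a natural fit for a graphical editor: each state is a box and each table entry is an arrow.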

I will personally be making use of the serial connection to let a Gumstix essentially “drive” a Hemisson by issuing it commands over serial and listening to responses from the sensors. Because it is possible to listen to the I²C bus over the serial connection, we could attach a wide variety of sensors to the Hemisson and let the Gumstix do the thinking. While I go figure out how to do that, I’ll leave you with a nice picture (from the Hemisson website):
