Design for unit-testing

I’ve written before about the role of testing in programming, and as I’ve written more code (and unit tests) over the past few weeks, my conviction that unit-testing is useful for more than just determining program correctness has become even stronger. In my previous post I spent about a paragraph exploring the other benefits of testing; I think it’s time that I offered a more detailed view of the alternate advantages of unit-testing.

As I started working on my last project for the semester yesterday, one of the things at the top of my mind was that the professor would be running his own unit-test on our programs. Our other projects had clear descriptions of what methods we were to create and how they should behave. But this project, a simple word counter written in Java, was different in that we were only told how the program as a whole should respond. We could shape the internals as we wished. Though it was very tempting to create a monolithic system (especially since the problem was rather simple), I decided that I should design with automated unit-testing in mind. Rather than making design complicated, this choice actually made things clearer.

It was obvious that the UI would have to be completely separate from the actual work portion of the code. Not only that, but the values returned from the methods doing the work would have to be in a form that could easily be compared to expected results. This meant that while the UI would deal with output formatting, there should not be very much formatting required in the first place. This guided me in the choice of the data structures that would eventually be returned.
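To make that separation concrete, here is a minimal sketch of how such a word counter might be split up. The class and method names (WordCounter, WordCounterUI, countWords, format) are my own inventions for illustration, not the ones from the actual assignment, and the choice of a sorted Map as the return type is just one reasonable option.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical names throughout; this is an illustration, not the assignment's real code.
class WordCounter {

    // Work portion: no printing here, just a plain Map that a test can compare directly.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : text.toLowerCase().split("\\W+")) {
            if (word.isEmpty()) {
                continue; // split() can yield an empty first token, e.g. for empty input
            }
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }
}

class WordCounterUI {

    // UI portion: the only place that knows anything about output formatting.
    static String format(Map<String, Integer> counts) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            out.append(entry.getKey()).append(": ").append(entry.getValue()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(format(WordCounter.countWords("the cat and the hat")));
    }
}
```

A test can call countWords and compare the returned Map against an expected Map without ever parsing printed output, which is the property I was after.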

Another area in which designing with testing in mind is useful is in determining if methods should be private. JUnit 3.8 does not support direct testing of private methods. There is a certain amount of debate regarding whether or not this is a good thing, but this restriction does force a design methodology that can be beneficial. The only methods that should be private are the ones that need not be tested separately; that is, if the calling method passes testing, it automatically means that the private method is also working correctly. Though I didn’t need any private methods in this particular project, I have needed them before, and keeping the limitations of private testing in mind resulted in what I believe to be cleaner, more readable code.
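As a rough sketch of that rule of thumb (the Sentence class and both of its methods are hypothetical names made up for this example), a private helper is fine when every interesting case can be reached through the public method that calls it:

```java
// Hypothetical example: the helper is private because testing its caller covers it.
class Sentence {

    // Public method: this is what a unit test exercises.
    static int longestWordLength(String sentence) {
        int longest = 0;
        for (String word : sentence.trim().split("\\s+")) {
            longest = Math.max(longest, stripPunctuation(word).length());
        }
        return longest;
    }

    // Private helper: only reachable through longestWordLength, so a test that
    // feeds punctuated input to the public method checks this code as well.
    private static String stripPunctuation(String word) {
        return word.replaceAll("[^A-Za-z0-9]", "");
    }
}
```

If the helper ever grew complicated enough that I wanted to test it on its own, I would take that as a hint that it deserves to be a non-private method, possibly in its own class.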

Of course, it is easy to go overboard with unit testing. Your program shouldn’t be endlessly subdivided into lots of tiny functions just so that you can test every tiny chunk of code. Unit testing can help you design cleaner, simpler programs, but only if your design is pragmatic to start with. Like I’ve said before, no amount of coding tricks and development methodologies will fix a fundamentally broken design.

Jeff Atwood of Coding Horror said some time ago that unit tests should be first-class language constructs. At the time I thought that this was a bit overboard, but I’m coming to realize that this might be a good thing. Any language which has a decent error-handling mechanism would be able to bake in support for unit testing, and having a unit-test mechanism directly in the language would probably encourage students to use it (and more importantly teachers to teach it). In my CS course we’ve been using unit tests from early on, which I feel was a very good decision on the part of the professor (even though many of my fellow students don’t quite seem to grasp its full importance right now).

Designing for unit testing encapsulates a lot of the ideas that are a part of good software design: UI separation, abstraction, code reuse and readability. Unit-testing is also a perfect example of abstraction: just worry about the big picture and the details will take care of themselves. So the next time you find yourself dealing with a big project, design for unit-testing and chances are you’ll be making better code than if you weren’t. Keep in mind though that no amount of unit-testing will replace actual user-testing…so make sure you get around to that at some point as well (hopefully as soon as you have something a user can actually use).

IDEs for Beginning Programmers

One of the greatest problems facing people learning how to program is which set of tools to use. The problem lies in the fact that there are a plethora of different IDEs and toolkits that claim to simplify various aspects of the programming process. But the truth is that all of these IDEs are very complex in nature and require a significant time investment on the part of the programmer, i.e. the programmer must first learn the use of the IDE before he/she can become more efficient. While that may be acceptable for professional programmers, it is certainly not what a beginner wants. A beginner should be able to learn the intricacies of programming and the proper use of the language at hand, rather than having to spend hours learning to use the IDE before even properly understanding what it’s for.

I’m currently teaching myself Java and Python with a little bit of Scheme. Here are the three IDEs that I think a beginner will find very useful:

BlueJ for Java

BlueJ uses a graphical method for aiding learning. The classes you create and their relationships are shown in a clean, uncluttered graphical presentation, similar to a UML diagram. When you create a new class, it automatically creates a skeleton which you can fill in with your own code. It lets you develop classes separately and create objects for testing. Two big advantages are that it frees the student from having to write a ‘main’ function every time and that parameter and return values for functions can be entered and viewed directly, without requiring the student to write code solely for I/O. BlueJ also provides you with a simple interface to create JAR files from your programs so that they’re easily distributable. You can choose whether or not you want to include the source code with a JAR file. My only complaint is that its default editor is a bit too simplistic, even for my tastes, and there is no easy way to replace it.

IDLE for Python

This is certainly my favorite IDE, just as Python is my favorite language. IDLE comes with the default Python installation so you don’t have to go looking for it. On opening it, you get Python’s interactive interpreter, but it’s easy to create a file for a program using the New Window command under the File menu. The editor is simple, but offers both syntax highlighting and auto-indentation, which will make anyone’s life easier. To run a program, all you have to do is type it out and hit F5. You can’t get much simpler than that. All I/O is handled by the interactive interpreter, so no worries there. The only problem is that it is really ugly, but then again, you can’t have everything.

DrScheme for Scheme

This is actually quite similar to IDLE in that it packs an interactive interpreter as well as one-click run functionality. The interpreter is limited to simple statements, but that shouldn’t be a problem as there is an editor window open by default. If you’re using the How to Design Programs book, DrScheme is a must as it contains a number of subsets of Scheme that grow in complexity as your knowledge increases. The editor also contains a few more advanced features like definition hiding and a class browser that will be of help as you progress. There is also a way to make executables from your Scheme code, but I haven’t tested this myself. There are no obvious drawbacks, except that it is somewhat slow to start.

Of course any programmer should keep in mind that strictly speaking one doesn’t really need an IDE at all; you can get along fine with a text editor and a compiler/interpreter. However, choosing the proper IDE can save you a lot of worry, especially if your language is something like Java, which really isn’t designed for beginners. Once again, the choice is finally up to you. But whatever you do, have fun, because that’s the way programming is supposed to be.

Introductory Books for Beginning Programmers

I’ve recently started learning programming seriously and so I’ve been on the lookout for good books to learn from. So here’s a short list of books that I’ve found useful. They deal with a variety of languages and concepts and the best thing is that they’re all absolutely free. Please note that these books would probably be most useful for someone in the last two years of school, though older people shouldn’t have a problem. I’m personally using them as a sort of prep for studying Computer Science in college and so only time will tell if I’ve been successful.

How to Think Like a Computer Scientist – Python Version

Python by itself is a very good programming language for beginners (unless you’re less than ten years old, in which case I would suggest Logo). Combine that with a good book and you get a winning combination. The book’s style is clear and uncluttered and the chapters are fairly self-contained. It does a good job of introducing procedural programming first before moving on to object orientation (which can be quite a difficult concept for beginners). My only real gripe is that there seem to be too few exercises, which sort of leaves you on your own to find something to do.

How to Think Like a Computer Scientist – Java Version

This is the original Think Like a Computer Scientist book, and unfortunately its one major flaw isn’t really something that can be fixed: the choice of language. Java as a language may be very nice, but it’s certainly not too fascinating for beginners. If you’re learning programming on your own, like I am, I would recommend starting this book after you’ve come to grips with the object-oriented matter in the Python book. That issue aside, it is a very good book and I personally like its style slightly more than I do the Python one’s. One major scoring point is that there are a number of exercises at the end of each chapter which involve both writing and reading/fixing code. This book will also come in handy if you’re studying for the American AP exam, but I’m not quite sure if it covers all the bases. I would suggest combining this with the BlueJ IDE, which’ll let you sidestep many of the practical hurdles involved in using Java as a beginning language.

How to Design Programs

This book is designed from the ground up to make you learn programming that is data-centric, i.e. the program’s very purpose for existence is the data that it manipulates. Unlike other programming books that focus on specific concepts as a path around which to structure your learning, this book focuses on data: you start by using smaller, atomic types of data and then move on to using more complex data structures. This book uses the functional programming language Scheme, and it comes with its own dedicated environment: DrScheme. DrScheme helps beginners tremendously by hiding obscure syntax features until the time is right. It does this by providing not just standard Scheme, but a number of subsets containing only the features that you will need. As you progress through the book you move on to richer and richer subsets until finally you’re ready to use full-fledged Scheme. The book focusses on the ‘why’ of a program rather than the ‘how’.

Structure and Interpretation of Computer Programs

Think of this as the last one’s big brother. SICP has been the textbook for MIT’s introductory Computer Science course for the better part of two decades. As you can well imagine, this is not for the faint of heart. However, once you set your mind to it, you’ll find that the book deserves its reputation as a computer science classic. It’s written in a simple no-nonsense style and like HtDP, it teaches you programming, not a programming language. It uses Scheme, but it makes an effort not to let your attention be drawn to what language you’re using. The book drives home the fact that the computer is just a tool and your head is where you have to do the real work. That being said, there is probably no point in reading this book unless you intend to make computers your career. Also, having some amount of programming experience would help you on your way. This book is also rather intensely mathematical, so make sure your math skills are well polished before you embark on this journey.

Programming from the Ground Up

This book takes a different approach to programming: its basic premise is that you can only really learn how to make a program if you understand how the computer ticks inside and what it does when it runs your program. As a result of that philosophy you are required to get up close and personal with the computer, and that means assembly language. Yes, the book uses assembly language (x86 assembly to be specific), but all the examples are very thoroughly explained and if you have patience in abundance, you shouldn’t have any problems. What sets it apart from SICP is that while SICP approaches programming from a mostly theoretical aspect, this approach is decidedly practical. Again, probably not worth your time unless you plan on computer science as a career.

Personally, I think all the above are very good books. Of course there is the inevitable question of choice. I’m currently working my way through the How to Think Like A Computer Scientist books (yes, both of them). Once school is over I will try to push through How to Design Programs and finally go through SICP and Programming from the Ground Up. All in all, I think all of that should keep me busy for the better part of a year. So in a year’s time, come back and check on my progress. If you’d like an IDE to go with your new book, check out the next post.

The Ackermann Function in Java: Why Computers are Stupid

I’ve started teaching myself Java, right from the basics, and as a guide I’m using the Java version of the How to Think Like a Computer Scientist book. One of the exercises at the end of the fifth chapter (called Fruitful Functions) is to implement the Ackermann function as a recursive method. The Ackermann function is mathematically defined as:

A(m, n) = n + 1                      if m = 0
A(m, n) = A(m - 1, 1)                if m > 0 and n = 0
A(m, n) = A(m - 1, A(m, n - 1))      if m > 0 and n > 0

Now, the Ackermann function is quite well suited to computerization: it takes little real intelligence to evaluate for any two numbers, and is mostly repetitive calculation (which computers are good at). It took me less than a minute to implement the function as a Java method as follows:

public static int ackerman(int ack1, int ack2) {
    if (ack1 == 0)
        return ack2 + 1;
    else if (ack1 > 0 && ack2 == 0)
        return ackerman(ack1 - 1, 1);
    else if (ack1 > 0 && ack2 > 0)
        return ackerman(ack1 - 1, ackerman(ack1, ack2 - 1));
    // no return after the last else-if: this is the version the compiler rejects
}

I passed it to the compiler and the compiler replied with a cheery: “missing return statement”. Since I already had three return statements, that meant that there was a possible path through the method where the method would end without a value being returned. The Ackermann function doesn’t work with negative numbers, so I had already implemented a check for negatives before the function call. I tried putting in another check for negatives in the function itself, but that didn’t work. By this point I was getting rather frustrated because the above code would catch all non-negative numbers and produce appropriate returns. I ran the code in my mind with a few small numbers and everything seemed to check out as it should: the values of ack1 and ack2 would keep on reducing until ack1 hit zero and the method would end with a proper return.

Finally, on a hunch, I decided to remove the last else-if condition and make the last return statement free-standing. Semantically it was the same thing, because there was only one possible case if the first two conditions were not satisfied. And for some reason the compiler thought that this new version was perfectly passable. I haven’t entirely ruled out the possibility that there was really some path that would have resulted in no return. But I think that it is far more probable that the compiler for some reason couldn’t handle the multiple recursions and simply gave up. Of course, I’m not an expert in these things and if someone knows of a proper explanation please let me know. Until then I’ll stick to the knowledge that a computer is still quite some distance away from what would be common sense to a human (or at least the Java compiler is).
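For what it’s worth, here is one likely explanation, offered as a sketch rather than a definitive answer. As far as I can tell from the Java Language Specification, the compiler does not reason about whether conditions like ack1 > 0 && ack2 > 0 can actually be false; it only looks at the shape of the code. An if without an else is always treated as something that might be skipped, so a chain that ends in a bare else-if leaves a path where the method appears to fall off the end. Recursion has nothing to do with it. The version below uses a plain else, plus an explicit check for negatives; the class name Ackermann and the parameter names m and n are my own choices here, not the ones from the method above.

```java
class Ackermann {

    static int ackermann(int m, int n) {
        // Reject negatives up front instead of relying on the caller to check.
        if (m < 0 || n < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        if (m == 0) {
            return n + 1;
        } else if (n == 0) {
            return ackermann(m - 1, 1);
        } else {
            // Plain else: every branch of the chain now ends in a return,
            // so the compiler can see that the method never falls off the end.
            return ackermann(m - 1, ackermann(m, n - 1));
        }
    }
}
```

Keep the arguments small when trying this out: the result and the recursion depth grow extremely quickly, and even modest values of m will overflow an int or the call stack.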

Cross-platform Programming: The Details

In the last article, I took a general look at cross-platform programming. From a developer’s point of view, there are a number of ways of programming cross-platform. While maintaining separate source code trees for each platform would make your program “fit in” the most with each platform, it would be almost like maintaining a different program for each platform, a lot of work indeed. One solution is to keep the same source code and use a different compiler to create a different executable for each platform. While this reduces the amount of work you have to do, using this method means that you have to be careful not to use platform-specific methods. And the performance of your program will vary according to the quality of the compiler used.
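The warning about platform-specific methods applies in pretty much any language. As a small illustration in Java (the PortablePaths class, its method names and the .wordcounter directory are all made up for this example), the idea is to let the standard library supply details like path separators and line endings instead of hard-coding one platform’s conventions:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, for illustration only.
class PortablePaths {

    // Joins the pieces with whatever separator the current platform uses,
    // rather than hard-coding "/" or "\\" into the string.
    static Path settingsFile(String homeDir) {
        return Paths.get(homeDir, ".wordcounter", "settings.txt");
    }

    // Uses the platform's own line ending instead of assuming "\n" or "\r\n".
    static String joinLines(String... lines) {
        return String.join(System.lineSeparator(), lines);
    }

    public static void main(String[] args) {
        System.out.println(settingsFile(System.getProperty("user.home")));
    }
}
```

The same source then behaves sensibly on Windows, Mac and Linux without any per-platform branches.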

Probably the simplest solution for a cross-platform developer would be to write in an interpreted language. The upshot is that you can (mostly) forget about dealing with the platform, because the interpreter will be the same irrespective of the platform and you can be pretty confident that your code will run the same on whichever platform you make it work. On the other hand, this method relies heavily on the proper performance of the interpreter. However, if you use one of the more common interpreted languages like Python or Perl, that is unlikely to be a problem. These interpreters have been out there for years and have been thoroughly tested on all of the more common platforms out there.

And now for the Java Virtual Machine. The JVM is just what it sounds like: a sort of pseudo-computer which conveniently lets you ignore the real computer. But there are two important things that make the Java Virtual Machine different from interpreted languages. Firstly, the JVM is independent of the actual Java language. Though Java is the primary language, there are a number of other languages which can be translated into bytecode that can be run on the JVM. There are also implementations of Python and Ruby for the JVM, though these are considerably less complete than the actual implementations. The second reason will be more appealing to developers focussing on GUI applications: Java has its own GUI toolkit called Swing which can provide a consistent look-and-feel across all platforms. Your program not only functions the same across all platforms, it looks the same as well. Most other interpreted languages require an external GUI toolkit, which means that the end user has to make sure that the relevant libraries are installed (or you, as a developer, have to package the libraries with your program).

Talking about GUI toolkits, there are a number of toolkits out there for you to choose from. Qt and GTK are probably the more popular ones, but wxWidgets is gaining popularity. GTK and Qt are both non-native toolkits, in that they use their own drawing engines. Qt now uses the drawing API of the platform to give a more native look-and-feel. GTK on the other hand uses a number of different engines some of which emulate the look of native widget sets. wxWidgets however uses the native drawing system of the platform itself and only provides a thin abstraction layer on top of it. This means that your apps will look like native apps because they are effectively using the same set of graphics widgets. GTK is written in C while Qt and wxWidgets are in C++, but bindings for all of them exist in many popular cross-platform languages, so you can pick and choose at will.

Though I’ve presented cross-compiling and interpreted languages as two different solutions, you can take a middle path: writing part of your program in code that will be compiled separately for each platform and part in an interpreted language. The Firefox browser uses this technique. The Gecko layout engine is written in C++, but a large part of the application is made using languages that are interpreted at run time, like XUL, CSS and JavaScript, allowing it to be easily ported. Besides Windows, Mac and Linux, ports are available for BeOS, SkyOS, FreeBSD and OS/2.

Ultimately it comes down to you as a developer to decide which method and technologies will make your work easier and the end product better. My personal favourite is Java/Swing for heavy projects and Python/wxWidgets for smaller personal work.