IDEs, education and the UNIX philosophy

I’ve never really been a big fan of IDEs, though I can appreciate how they can help speed up the edit-compile-test cycle. For almost two years now I’ve been trying to use a text editor + command line strategy. At the moment, however, I’m in a position where I need to start using IDEs again: my software engineering class uses KDevelop, and over the summer I’ll be working on a research project developing an Eclipse plugin, so I need to get familiar with Eclipse before then.

I must say that I was initially put off by the idea of having to use KDevelop for C++ development in my software engineering class. I was really hoping to be able to use Emacs, GCC and make full time, adding to my current level of Emacs and BASH knowledge. But having used KDevelop for about two weeks now (somewhat grudgingly, I must admit), I’ve found that there are a number of things it makes rather smooth. Creating an SVN repository for your project and committing to it is rather well integrated with the interface. At the same time, you can’t directly check out a working copy and start working on it as a project: you need to check it out manually to a directory, import the directory as a project and then edit the project options to have it be version controlled. Not exactly a very smooth workflow. KDevelop’s integration with the Qmake program to automatically create Makefiles is also a time-saver. A Makefile contains the instructions that the make program uses to build a project out of many source files; writing these by hand for a large project is tedious, and KDevelop mostly takes that trouble away. Though I’m only just starting to explore KDevelop, I think I’ll enjoy using it.
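For the record, a hand-written Makefile for a toy two-file C++ project might look something like the sketch below. The file names and flags are invented for illustration, and note that recipe lines must be indented with a tab:

CXX = g++
CXXFLAGS = -Wall -g

# link the final executable from the object files
app: main.o widget.o
	$(CXX) $(CXXFLAGS) -o app main.o widget.o

# recompile a source file only when it or a header it uses changes
main.o: main.cpp widget.h
	$(CXX) $(CXXFLAGS) -c main.cpp

widget.o: widget.cpp widget.h
	$(CXX) $(CXXFLAGS) -c widget.cpp

clean:
	rm -f app *.o

Running make compares file timestamps and rebuilds only the targets whose dependencies have changed; maintaining those dependency lists by hand is exactly the bookkeeping that Qmake and KDevelop automate.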

On a larger scale, however, I’m still in two minds about the use of IDEs. I firmly believe that IDEs should not be used as primary teaching tools, certainly not in a course like software engineering, which is for people who are fairly sure they will be computer science students. I learned about Subversion before using KDevelop, so I can understand and appreciate how KDevelop speeds things up, and more importantly, when things go wrong or the interface doesn’t quite work the way I want it to, I can easily drop into a shell and fix things. However, I know much less about Makefiles, since I’ve never really used them. I know enough to understand that there is a significant amount going on under the hood which is being kept hidden from me. If something went wrong, I’d be helpless to fix it. That’s not a feeling I’m comfortable with. I really wish we had been taught about Makefiles and how they work, and only then shown how Qmake and its KDevelop integration speed things up. At this point I really hope we learn what goes on under the hood, and do so soon.

One of the reasons that I am uncomfortable with IDEs is that I’m a strong believer in the UNIX philosophy. This school of thought, as applied to programs, was famously summarized by Doug McIlroy:

Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.

I have some doubts about the text streams part, but I firmly believe in the first two segments. Coming back to education: it is important that students learn the UNIX philosophy and learn it well. They may not necessarily write a completely separate program for every little thing, but the need to preserve modularity in software is generally acknowledged. By learning how programming actually depends on lots of different tools, each with its own function but with good interfaces, students can better understand how to write modular software. The IDE encourages the idea that powerful software comes in monolithic chunks. Even if the IDE is modular and depends on other programs under the hood, that fact is not always obvious to the student. It is much easier to understand the power of modularity when your tools are clearly modular in nature. Consider Emacs: it is extremely modular and extremely extensible by virtue of its embedded Lisp interpreter. When you want to give Emacs the powers of an IDE, you string together a number of Elisp packages. More importantly, you are actively encouraged to write your own Elisp code to bend Emacs to your will. This flexibility is on a completely different level, even compared to modular IDEs like Eclipse.
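To make the first two precepts concrete: with small single-purpose tools, an ad hoc question like “which source files have the most TODOs?” is a one-liner (a toy example; the path is hypothetical):

grep -c TODO src/*.cpp | sort -t: -k2 -rn | head

Three small programs, none of which knows anything about the others, cooperating through plain text.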

I suppose that many of my gripes with IDEs would disappear if they really were integrated. But they’re not. I’m using a bunch of tools for my software engineering course: Umbrello for UML, Doxygen for extracting code documentation and Mantis for bug tracking. The only integration here is between Doxygen and KDevelop. This means that designing, implementing and then documenting a feature means using several different tools. Umbrello can generate code templates from my UML diagrams, but it can’t update the diagrams as I change the code, so when I need to write utility classes or methods, I have to put them in the code and then add them to the diagrams again manually. Because the bug tracker is external, I need to remember to check off the bug list when I fix something. In a truly integrated environment, the bug tracker would be part of the IDE. As a developer I could then add information to the bug report, such as a link to the specific part of the code where the bug exists. Anyone looking at my code later would be able to see that there was a bug associated with that part of the code which had been fixed (or was still active). Of course, this would mean that version control would have to be really tightly integrated, so that all the changes ever made could be pulled up and compared at a moment’s notice. If I have to play juggle-the-programs to do my work anyway, I would much rather use distinct programs and tie them together with Elisp or a BASH script. In that case a common text interface would be very useful. The UNIX philosophy wins again (to some extent).

Of course, I am picking on the tools I’m using, and they may not be a representative sample. I’m also aware that KDevelop isn’t exactly the gold standard in IDEs. At this point some open source enthusiasts will pull out the do-it-yourself card: if you don’t like the state of KDevelop, write some code to fix it. An understandable argument, and normally I’m all for it. Do-it-yourself is a lesson well worth learning if you’re in the technology field. But in this case, I feel it is somewhat beside the point. Like I said, I’d rather just use Emacs. That’s not to say that I won’t ever be swayed: I would willingly use an IDE right now if it unified the tools I’m using and actually sped things up.

Ultimately it comes down to the simple fact that, as someone who intends to write software for a livelihood (or as part of a livelihood, at any rate), I insist on using the best tools for the job. Right now those tools are Emacs + GCC (and all the supporting tools like make). If an IDE comes along that offers me the same customizability and raw power that Emacs does, I will not hesitate to give it a fair chance to prove itself. This is not to say that Emacs doesn’t have problems of its own, but it’s better than the competition for my use. On the same note, I also wish very much that my fellow students would learn to understand what really makes a tool powerful. Considering that one of my classmates commented that he doesn’t like the terminal because of how much he has to type, I don’t think that will happen anytime soon. Education in computer science deserves a good few posts all to itself, and it has its own very large set of problems, but if you’re actually reading this post, I think you already know enough to tell when your tools don’t quite cut it. Just make sure you act on that knowledge.

A week with Git

It’s been almost a week since I moved to Git for my version control instead of Subversion. It’s been an interesting experience, with a fair amount of learning involved. There are a number of extra things to keep in mind when using Git, but as time goes by I’m quickly getting used to them, and they don’t seem bothersome at all.

The first major difference is going from centralized to distributed. I use version control partly to keep my files in sync across multiple computers. With Subversion, a commit would save my changes and also push them to my remote server. With Git, those are now two separate operations. I would have liked a way to do both in a single step, but honestly, it really isn’t much of a problem. I’ve become used to doing a git push just before I leave my computer.
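My routine now looks something like this (the commit message is just illustrative; origin is the name Git gives the remote after a clone):

git commit -a -m "Fix off-by-one in the parser"   # record the changes in the local repo
git push origin master                            # then push them to the remote server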

Once I got past that, the next thing I had to deal with was the fact that even a commit to the local repo is actually two separate operations. You first have to add any changes you’ve made to something called the Index. It’s easiest to think of the Index as a staging ground for your changes. You can selectively add files to the Index, building up a coherent set of changes, and then commit them together. The Index can take some getting used to, but it is easily one of Git’s killer features. Often enough you end up editing a bunch of completely different files. They could be files in different projects, or files in the same project that do totally different things. When it comes time to commit, you’d like to be able to commit the different changes separately, so that you can track exactly what changes you made per project. The Index lets you do exactly that. In fact, Git will even let you select which of the changes to a particular file you want to add to the Index in a given add operation, via git add --patch filename. This gives you an unprecedented level of fine-grained control over how the change history of your files is saved. I haven’t had a chance to use this level of power yet, but as the semester progresses and I get involved in larger projects, I don’t think it’ll be long before Git saves the day. See this post for a more detailed discussion of how useful Git’s Index is.
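A sketch of what that looks like in practice (the file names are invented):

git add parser.cpp          # stage just the parser fix
git commit -m "Handle empty input in the parser"
git add notes.org           # now stage the unrelated change separately
git commit -m "Update project notes"
git add --patch parser.cpp  # or stage only some hunks of a file, interactively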

Even if the change history could be shaped only through the Index, that would be an incredible amount of power. But Git’s power stretches beyond the Index to the commit operation itself. The git commit --amend operation folds the changes in the Index into the previous commit, which comes in handy when you forget a change or realize that you need to change something else. And if that isn’t enough, the git rebase --interactive command basically hands you a nuclear-powered Swiss Army chainsaw with which you can go to town on your repository. Needless to say, this kind of power isn’t to be used without some amount of care and consideration.
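For instance (a minimal sketch; rewriting history like this is best kept to commits you haven’t pushed yet):

git add forgotten_file.cpp        # the change you forgot to include
git commit --amend                # fold it into the previous commit
git rebase --interactive HEAD~3   # reorder, squash or edit the last three commits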

Git’s control over committing is impressive, but equally impressive are its branching capabilities. In Subversion, a branch means copying the current files to another subdirectory of the same repository. This means that all branches are visible at all times and that anyone checking out the full repository gets all of them. That isn’t always what you want. Sometimes you don’t want everyone to see your changes until they’re ready for prime time, and you might want new developers to get the production branch first and pull additional branches later as needed. Since Git is in many ways a filesystem that supports tracking changes, its branch command works differently. A branch doesn’t require a subdirectory in the same repo; instead it creates a new tree in Git’s internal filesystem. So you can have separate branches with very different directory structures, without cluttering your visible directory hierarchy. git checkout branchname will move you to another branch and change the visible filesystem in place to match it. You can then traverse your file structure as normal without even thinking about the fact that there are other branches. This comes in very handy for automated testing or building: you don’t have to repoint your testing infrastructure at a new directory to test multiple branches. If your branches have the same layout, you can just switch branches and run the same testing scripts.
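In practice that’s just (branch name invented):

git branch experimental      # create a new branch at the current commit
git checkout experimental    # switch to it; the working tree changes in place
# ...hack and commit as usual...
git checkout master          # switch back; no second directory needed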

Merging is also made simple. Suppose you have an experimental branch that you want to merge into the master branch. First check out the master branch (Git will tell you if you’re already there) and then just pull in the experimental branch, like so:

git checkout master       # switch to the branch you're merging into
git pull . experimental   # pull from the local repo ('.'), merging experimental into master

You don’t have to worry about specifying revision numbers, and if some changes in experimental have already found their way into master (via a previous pull, maybe), Git will simply ignore them (or, if there are conflicts, let you manually choose which edits to commit). Git merging by example is an excellent article that explores branching and merging through a series of simple examples.
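When there are conflicts, the merge stops and resolution follows the usual Git cycle (file name hypothetical):

git pull . experimental      # the merge halts, reporting a conflict in parser.cpp
# edit parser.cpp to resolve the conflict markers, then:
git add parser.cpp           # mark the conflict as resolved
git commit                   # conclude the merge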

All of these features make non-linear development easy. You don’t have to think too hard about branching off the main trunk or making changes to another part of your current branch. The Index and the merging features make it easy to organize and alter changes after you’ve made them. You spend less time and effort managing your commits and can devote more energy to actually dealing with your code base, safe in the knowledge that your changes will be recorded just as you want them to be. Whether you’re a one-man powerhouse hacking on a number of different projects in your spare time or a member of a committed team working to a tight schedule, Git can be a very powerful tool in your arsenal of code-management utilities. Use it well.