Another discovery: DokuWiki

Looks like this is going to be a semester of discoveries for me. While working on my data visualization project, I discovered IRC. While working on my parallel programming project, I discovered a great piece of lightweight wiki software called DokuWiki. We’re a small team (2-4 people depending on how you count) and we needed a place where we could gather our findings and record our work. We tried to use Google Wave at the start, but it didn’t quite work out. Wave is nice when you need to collaborate in real time and what you’re doing is mostly brainstorming. But if you’re working at a slower pace and need more structure, it just doesn’t feel right. More on this later.

What we wanted was a solution that was quick and easy to set up, was resource-light and could be hosted locally. We thought about larger wiki systems like MediaWiki but we settled on DokuWiki. First off, it’s flat files: no database or SQL required. It makes a point of being lightweight, so there’s no WYSIWYG. Instead there’s a simple wiki syntax that suits us hacker types just fine. Looking at the configuration page you can easily tell that it’s focused on functionality rather than looking pretty. But that doesn’t mean that it’s short on features or good design. It tracks revisions (with short commit messages as well) and has a really nice visual diff tool. Using Git for the better part of a year has made me really love good diff tools.

The wiki syntax is also well thought out. There’s little redundancy but strong features. And CamelCase is optional. Links are enclosed in double square brackets in the form of [[url|link text]] and you can make new pages just by linking to a page that doesn’t exist yet. It supports multiple levels of headings and generates an automatic table of contents for any page with three or more headings. Though the design is simple and minimal, it does look pretty good. It’s clean and standardized so you won’t get lost and the edit controls are neatly tucked into bars at the top and bottom. It looks more polished than a default MediaWiki setup, for example. It’s easy to tell that DokuWiki is meant to be a clear documentation tool as opposed to another “build my website quickly” package and in our case, that’s a good thing.
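To make the syntax a little more concrete, here is a small Python sketch that pulls the links and headings out of a page. The sample page, the regular expressions and the function names are all my own illustration and not DokuWiki’s actual parser. The heading threshold for the table of contents is passed in as a parameter rather than hard-coded, since it is a configurable setting.

```python
import re

# Hypothetical sample page written in DokuWiki-style markup.
SAMPLE_PAGE = """====== Project Notes ======
See [[http://www.dokuwiki.org|the DokuWiki site]] and [[meeting_notes]].

===== Findings =====
Nothing yet.

===== Next Steps =====
Write more.
"""

# [[target|optional label]]; the label part may be left out.
LINK_RE = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]*))?\]\]")
# A heading is wrapped in matching runs of 2 to 6 equals signs.
HEADING_RE = re.compile(r"^(={2,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def find_links(text):
    """Return (target, label) pairs; the label falls back to the target."""
    return [
        (m.group(1).strip(), (m.group(2) or m.group(1)).strip())
        for m in LINK_RE.finditer(text)
    ]


def find_headings(text):
    """Return (level, title) pairs; six equals signs is level 1, two is level 5."""
    return [(7 - len(m.group(1)), m.group(2)) for m in HEADING_RE.finditer(text)]


def needs_toc(headings, min_headings):
    """Decide whether a page has enough headings to warrant a table of contents."""
    return len(headings) >= min_headings


links = find_links(SAMPLE_PAGE)
headings = find_headings(SAMPLE_PAGE)
```

A link without a label, like [[meeting_notes]], simply uses the page name as its text, which is what makes creating new pages so quick.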

And did I mention the source code support? It uses the GeSHi syntax highlighter, meaning that it supports a wide variety of languages (including some fairly obscure ones like Lotus Formulas and LocoBasic). You can also turn snippets of code into downloadable files by just specifying a filename. You can tell that this was a tool made for hackers by hackers. And of course I just love it. Uploading images is easy, but if you want to add files with other extensions, you need to edit a config file, which shouldn’t be a problem for most people who’ll be using this sort of software.

There are a number of other goodies in the bag that I haven’t had time or cause to investigate (including plugins). The fact that I’m not administering the installation means that I won’t be playing around with it as much as I could, but I think I can turn it into a user point-of-view experience. I wish there was some easy theming support, but I can live without it. This is the first time that I’ve used a self-hosted wiki and though I’ve had experience with other wikis in the past (I like PBworks) I think I’ll definitely turn to DokuWiki if I need a simple but strong wiki for a code-focused project in the future. I might consider hosting it on my personal server in the near future too, to get more of a chance to play with it. Right now I’m perfectly willing to keep it simple and focus on my real work (more on that later too).

The Documentation Problem

Over the past year and a half I’ve come to realize that writing documentation for your programs is important. Not only is documentation helpful for your users, it forces you to think about and explain the workings of your code. Unfortunately, I’ve found the tools used to create documentation (especially user-oriented documentation) to be somewhat lacking. While we have powerful programmable IDEs and equally powerful version control and distribution systems, the corresponding tools for writing documentation aren’t quite at the same level.

Let me start by acknowledging that there are different types of documentation for different purposes. In-code documentation in the form of comments is geared toward people who will be using your code directly, either editing it or using the API that it exposes. In this area there are actually good tools available. Systems like Doxygen, Epydoc or Javadoc can take formatted comments in code and turn them into API references in the form of HTML or other formats. With the API info right in the code, it’s easier to make sure that changes in one are reflected in the other.
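As a rough illustration of what “formatted comments” means, here is a Python function documented with Epydoc-style fields. The function itself is a made-up example, and the last few lines only show that the docstring can be read back out by a program; they are a stand-in for what a real documentation generator does, not a copy of it.

```python
import inspect


def moving_average(values, window):
    """
    Compute the simple moving average of a sequence.

    @param values: The numbers to average.
    @type values: list of float
    @param window: How many consecutive values go into each average.
    @type window: int
    @return: One average per full window, in order.
    @rtype: list of float
    @raise ValueError: If C{window} is not between 1 and C{len(values)}.
    """
    if not 1 <= window <= len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# The docstring travels with the function, so a tool can read it back out.
doc = inspect.getdoc(moving_average)
# Collect the documented parameter names from the "@param name: ..." lines.
params = [
    line.split(":")[0].split()[1]
    for line in doc.splitlines()
    if line.startswith("@param")
]
print(params)
```

If I rename a parameter and forget to update the docstring, the mismatch is sitting right there in the same diff, which is exactly the benefit of keeping the two together.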

User-oriented documentation has slightly different needs. As a programmer you want a system that is easy to learn and fast to use. You also want to be able to publish it in different formats. At the very least you want to be able to create HTML pages from a template. But you also want the actual source to be human-readable (that’s actually a side-effect of being easy to create) because that’s probably what you, as the creator, will be reading and editing the most.

Then there are documents that are created as part of the design and coding process. This is generally wiki territory. A lot of this is stuff that will be rewritten over and over as time progresses. At the same time, it’s possible that much of this will eventually find its way into the user docs. In this case, ease of use is paramount. You want to get your thoughts and ideas down as quickly as possible so that you can move on to the next thought. Version control is also good to have so that you can see the evolution of the project over time. You might also want some sort of export feature so that you can get a dump of the wiki when necessary.

Personally, I would like to see the user doc and development wikis running as two parts of the same documentation system. Unfortunately, I haven’t found tools that are quite suitable. I would like all the documentation to be part of the same repository where all my code is stored. However, this same documentation needs to be easily exported to decent looking web pages and PDFs and placed online with minimal effort on my part. The editing tools also need to be simple and quick with a minimal learning curve.

There are several free online wiki providers out there such as PBworks and WikiDot which allow the easy creation of good-looking wikis. But I’m hesitant to use any of them since there isn’t an easy way to tie them into Git. Another solution is to make use of GitHub’s Pages feature. GitHub lets you host your Git repositories online so that others can get them easily and start hacking on them. The Pages feature allows you to create simple text files with either the Textile or Markdown formatting systems and have them automatically turned into good-looking HTML pages. This is a good idea on the whole and the system seems fairly straightforward to use, with some initial setup. The engine behind Pages, called Jekyll, is also free to download and use on your own website (and doesn’t require a Git repository).

In addition to these ‘enterprise-quality’ solutions, there are also a number of smaller, more home-grown solutions (though it could be argued that Jekyll is very home-grown). There’s git-wiki, a simple wiki system written in Ruby using Git as the backend. Ikiwiki is a Git- or Mercurial-based wiki compiler, meaning that it takes pages written in a wiki syntax and creates HTML pages from them. These are viable solutions if you like to have complete control of how your documentation is generated and stored.
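The idea of a wiki compiler is simple enough to sketch in a few lines of Python. This is a toy illustration of the concept only and not how Ikiwiki or Jekyll actually work; the page template, the one-text-file-per-page layout and the function names are all my own assumptions.

```python
import html
from pathlib import Path
from string import Template

# Hypothetical page template; a real tool would load this from a file.
PAGE_TEMPLATE = Template(
    "<html><head><title>$title</title></head>\n"
    "<body>\n<h1>$title</h1>\n$body\n</body></html>\n"
)


def render_page(title, source):
    """Turn plain text into HTML, treating blank lines as paragraph breaks."""
    paragraphs = [p.strip() for p in source.split("\n\n") if p.strip()]
    body = "\n".join("<p>{}</p>".format(html.escape(p)) for p in paragraphs)
    return PAGE_TEMPLATE.substitute(title=html.escape(title), body=body)


def compile_wiki(source_dir, output_dir):
    """Render every .txt file in source_dir to a matching .html file."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for page in sorted(Path(source_dir).glob("*.txt")):
        target = out / (page.stem + ".html")
        text = page.read_text(encoding="utf-8")
        target.write_text(render_page(page.stem, text), encoding="utf-8")
        written.append(target.name)
    return written
```

Because the sources are just text files in a directory, they can live in the same repository as the code, and the HTML can be regenerated whenever it is needed.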

Though each of these is great in and of itself, I still can’t help feeling that something is missing. In particular, there is no consensus on how documentation should be created and presented. Some projects have static websites, others have wikis, a few have downloadable PDFs. Equally important, there isn’t even a moderately common system for creating this documentation. There are all the ways I’ve noted above, which seem to be the most popular. There are also more formal standards like DocBook. Finally, let’s not forget man and info pages. You can also create your own documentation purely by hand using HTML or LaTeX. Contrast this with the way software distribution works (at least in open source): there are binary packages and source tarballs and in many cases some sort of bleeding-edge repository access. There are some exceptions and variations in detail, but by and large things are similar across the board.

Personally, I still can’t make up my mind as to how to manage my own documentation. I like the professional output that LaTeX provides and DocBook seems like a well-thought-out standard, but I’d rather not deal with the formatting requirements, especially in documents that can change easily. I really like wikis for ease of use and the ability to edit from anywhere, but I must be able to save things to my personal repository and I don’t want to host my own wiki server. I’ve previously just written documentation in plain text files and though this is good for just getting the information down, it’s not really something that can be shown to the outside world. For the time being, I’ll be sticking to my plain text files, but I’m seriously considering using GitHub Pages. For me this offers the dual benefit of easy creation in the form of text files as well as decent online output for whatever I decide to share. I lose the ability to edit from anywhere via the internet, but that’s a price I’m willing to pay. I can still use Google Docs as an emergency temporary staging area. I’m interested in learning how other developers organize their documentation and would gladly hear any advice. There’s a strong chance that my system will change in some way in the future, but that’s true of any system I might adopt.