A partial solution to the format problem

One of my pet peeves is the format problem: there are too many mutually incompatible formats out there for you to choose from if you have some data you want to store and present to a large group of people. Since I don’t use sound or video (or even images) that much, the subset of this problem that I’m concerned with relates to text data.

The problem goes something like this: suppose I have a large-ish chunk of text that I want to send out to a lot of people. How should I do it? First off, we’re going to assume that we will be composing the text electronically on a computer, but we will have to hand it out both electronically and in print. Second, our goal is to have as few distinct copies of the text as possible so that we don’t have to go about editing a bunch of different files if we make a change. For the electronic copies, the end viewers should not have to install any special software to see what we send them.

The most common solution for the average computer user is to make a Word document. Like it or not, Word is a mostly universal format for exchanging documents. The chances that someone does not have Word installed are really quite slim. However, just because it’s popular doesn’t mean it’s good. Considering that I’m a Linux and Open Source enthusiast (and so are a lot of the people I communicate with), I can’t depend on them having Word, and I feel slightly guilty about using something proprietary. Also, even if Word were free, there are excellent arguments to be made against using word processors in general, and I agree with them.

Personally, I haven’t used Word for serious document creation for about 3 years now. However, the alternatives aren’t very easy to come up with. It’s taken me a few years, but I’ve finally come to a system that I can use full time. The main realization for me was that I create two main types of text documents: ones that will be printed and given out to professors and other students, and electronic documents that I put on the web and will not print myself. For a long time, I wanted a way to be able to do both in a single shot: I wanted a format where I could create good looking webpages as well as pretty print output. Unfortunately, I haven’t yet found anything that is quite so easy to use.

Instead I’ve settled for two partial solutions. For printing, the good news is that Latex is really the state of the art when it comes to preparing good looking print documents. I’ve never made documents in Word that looked as carefully put together as a Latex document. Sure there’s a learning curve, but it’s one that I’ll happily live with in exchange for great looking documents.

Technically Latex can be converted to HTML. But I’ve never done this because I’ve never really run a website in pure HTML. In the old days, I used Dreamweaver, and as I started blogging I used Blogger and now WordPress. It’s only recently that I started to keep a small static site in HTML. In the process of making this site, I’ve realized that the set of writings I print and distribute and the ones I put online are mostly independent of each other. On my website, for example, I have a page about my computers. That’s something I can’t see myself printing to give to someone else. At the same time, I’ve done a lot of writing this semester that I haven’t put online because it is in a state of flux and I’m not ready to share it with the world at large yet.

It doesn’t make sense for me to write a lot of Latex source for something that I could write using a plain-text markup and then convert to HTML. Also, the advanced typesetting used by Latex is lost in webpages. Lately I’ve started using Markdown to write my webpages in plain text and then generate HTML, which I can insert into templates. This is a very simple solution where the source is easy to read and create, but the output looks good and is simple to generate.
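To make the "plain text in, templated HTML out" idea concrete, here is a minimal sketch in Python. It is not the real Markdown converter and not the setup used for my site: it handles only a tiny subset of the syntax (headings, paragraphs, *emphasis* and links), and the template, its `$title` and `$body` placeholders, and the function names are all made up for illustration. A real pipeline would run a proper Markdown implementation and keep the template in its own file.

```python
import html
import re
from string import Template

# Hypothetical page template; $title and $body are placeholder names
# chosen for this sketch, not part of any real site.
PAGE_TEMPLATE = Template(
    "<html>\n<head><title>$title</title></head>\n<body>\n$body\n</body>\n</html>\n"
)


def render_inline(text):
    """Escape HTML special characters, then convert [text](url) and *emphasis*."""
    escaped = html.escape(text)
    escaped = re.sub(r"\[([^\]]+)\]\(([^)\s]+)\)", r'<a href="\2">\1</a>', escaped)
    return re.sub(r"\*([^*]+)\*", r"<em>\1</em>", escaped)


def markdown_to_html(source):
    """Convert a small Markdown subset: '#' headings and plain paragraphs."""
    blocks = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", source.strip()):
        heading = re.fullmatch(r"(#{1,6})[ \t]+(.+)", block)
        if heading:
            level = len(heading.group(1))
            blocks.append(f"<h{level}>{render_inline(heading.group(2))}</h{level}>")
        else:
            # Join hard-wrapped lines into a single paragraph.
            text = " ".join(line.strip() for line in block.splitlines())
            blocks.append(f"<p>{render_inline(text)}</p>")
    return "\n".join(blocks)


def build_page(title, source):
    """Insert the generated HTML fragment into the page template."""
    return PAGE_TEMPLATE.substitute(
        title=html.escape(title), body=markdown_to_html(source)
    )


if __name__ == "__main__":
    print(build_page("My computers", "# My computers\n\nI run *Linux* on\nmost of them."))
```

The point of the split is that the Markdown source stays readable on its own, while everything about the look of the page lives in the template, so changing the site layout never means touching the text.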

What I’ve settled on is to use both Latex and Markdown/HTML, but independently of each other. Things that need to get printed, such as papers and later my theses, will be typeset in Latex, where I’m guaranteed to have good looking output. Anything else that I put online (such as my site) will be Markdown automatically converted to HTML.

Of course, this isn’t a complete solution at all. For one thing, some of the things I write in Latex I might want to put online someday. Then there are things like this blog that are neither Markdown nor Latex, which is something I haven’t talked about yet. But it’s a good start, and it’s something that I find fairly comfortable using right now.

3 thoughts on “A partial solution to the format problem”

  1. Hi Shrutarshi,

    have you already looked at MultiMarkdown (http://fletcherpenney.net/multimarkdown/)?

    A quote from the FAQ: “One of my goals with MultiMarkdown was to make it even easier to create a LaTeX document, with minimal knowledge of the LaTeX syntax. In fact, you can create fairly complex documents without any understanding of how LaTeX works, as long as you have it installed correctly.”

    In essence it’s an extension to Markdown with some nice additions.

