Software Tailors

I came upon an article a few days ago relating software developers to tailors. The author (Patrick Rhone) wishes to have software tailors: people who will customize (or custom-make) software for you, for a price. It’s an interesting idea and not without practical merit. In particular, given how most software today is “mass-produced” and customization options are, in general, limited at best, a cottage industry of software tailors might not be a bad idea. But for all its merits I think the idea is somewhat short-sighted. Software is sufficiently different from physical goods (including clothes) that applying the same concepts and processes to both is fundamentally flawed.

Operationally, software is much easier to change than physical matter. There’s no physical material to alter, and if you make a mistake you can revert and go back to what you had. Undo is your friend. That means that even new programmers can create things of value, make mistakes and learn quickly by doing. However, conceptually, making software is as hard as, if not harder than, manipulating physical objects. I’m not a tailor, but I would wager that hacking on a complex piece of software requires as much skill as making alterations to a jacket. In fact, understanding million-line codebases might well be harder than tailoring a suit to custom dimensions.

The second flaw in the software tailor argument is that our software needs are far more varied than our clothing needs. Humans all have the same general body plan; you only need a handful of numbers to make a shirt fit. But our software requirements are far more varied. The text editing requirements of a novelist are different from those of a blogger, which in turn are different from those of a programmer. To really understand what each person needs, the programmer has to have a pretty thorough understanding of the problem domain. This in turn means that the programmer either needs to be using that specialized software on a regular basis or the customer needs to be able to communicate very clearly what they want. As anyone who’s written software for clients knows, getting proper requirements is often the hardest part of the project. This is why the best software is often “dogfooded”: the developers have been using what they’ve been developing. Furthermore, making changes to any part of a codebase often requires understanding more than just the component you’re changing. What might be “just a few changes” to Mr. Rhone would probably end up being hours of diving into foreign source code (or at least learning an API). Writing good software is hard; writing good custom software is harder still. I don’t want to dismiss the idea of software tailors out of hand, but I want to make it clear that the job would be nothing like the analogy Mr. Rhone provides.

If we can’t have neighbourhood software tailors, then what can we have? Customized software is good because, as I’ve said, people have very varied software needs that standardized software often falls short of accommodating. What I think we need is twofold: a technological shift where developers write extensible software by default, and a cultural shift by which users are no longer afraid to modify their own software. Programmers (especially open source developers) are used to modifying and extending their own tools; I want to see everyday software users modifying their word processors and email filters.

But, but, but, does this mean we should make our own clothes and service our cars ourselves? No, of course not. Like I said, the tailor (and the mechanic) analogy is not the right one. A better analogy is cooking. You can eat your meals at restaurants and fast food places (use standard consumer software) or you can cook your own. If you can spend an hour or two a day putting together ingredients in exact proportions and heating them at specific temperatures for specific times, you can spend half an hour a day typing some code to make your software work the way you want it to.

Now this cultural shift is only possible if the software supports it, and right now most of our software doesn’t. But that can change. Text editors like Emacs and shells like Bash and Zsh are meant to be customized, and it’s not hard to do so once someone shows you how. Browsers are also customizable, though not quite as easily. Luckily software is malleable, and with more and more people becoming software literate I think this is a feasible change in the not-too-distant future. Mr. Rhone wants developers to make it easier for software to be extended, but those same changes could just as easily make it easier for users to adapt their own software.

So with all due respect to Mr. Rhone, I don’t want a culture of software tailors. That’s not any different from the programmer-priesthood we have today. Unlike our clothes, our computers are incredibly powerful machines and we’re increasingly dependent on them in both general and very specific ways. I want a culture of citizen hackers: a generation of people who can mix and match their software just as we can develop our own dressing or cooking styles.

As programmers, our job isn’t to write code or be better craftsmen. It’s to solve problems; everything else is tangential. I believe the best way to do that is by empowering users to better solve their own problems. We fight for the users so that they can fight for themselves.

My Brain on Information

I’ve read two things recently that have made me think about and reconsider the role of information in our lives and particularly the way in which I consume and process it. We live in an information-dense era of human history. In the western world (and increasingly, the world in general) the tools to access, consume, produce and distribute vast amounts of information are available to almost everyone at a moment’s notice. In many ways, we are living in a Golden Age of Information. The problem is, this Golden Age first crept up on us stealthily and then rammed into us headlong at full speed. As a result most of us, even those considered “digital natives” (myself included), seem perpetually ill-equipped to deal with both the challenges and opportunities of an increasingly information-rich existence.

Last week I read Accelerando, a set of short stories by British science fiction author Charlie Stross. The stories start from the near future (almost the present) and extend to a distant post-Singularity future where humanity lives among the stars, but in the shadows of godlike intellects. Though the entire collection is worth reading (and available for free), the first few stories about a world not too different from our own were particularly interesting. At one point one of the main characters, a very intelligent serial entrepreneur (and “venture altruist”) named Manfred Macx, claims to consume a megabyte of text and several gigabytes of multimedia a day just to keep current.

That’s a lot of information for any person to consume in a day – a megabyte of plain text is on the order of 175,000 English words (at roughly six characters per word). Though this is science fiction, I think we’re quickly getting to the point where people who want to stay current with the pace of science and technology will be required to consume enormous amounts of information regularly. That many words a day may be too much for an unaugmented human (Macx has an array of cybernetic implants and software agents forming an “exocortex” for information processing) but I think tens of thousands of words a day will soon become par for the course. And that’s just text. I’m not including understanding diagrams, source code, operations manuals or even video or audio. If we’re supposed to be assimilating such huge quantities of information on a regular basis, how are we supposed to make sense of it all?

That brings me to a piece on The Atlantic website dramatically titled “Is Google Making Us Stupid?”. It’s about how the use of search engines and similar fast information retrieval systems is supposedly rewiring our brains. While some parts of the piece are overly sentimental and melodramatic, the core point is sound: the tools we have access to and the way we use them play a role in shaping the functionality of our brains. I’ll also grant that a habit of continually sampling little bites of information can be deeply unsatisfying. It’s easy to get hooked on a Facebook or Twitter stream, but as you stay hooked you can feel your brainpower wither as you lose the ability to concentrate for longer than 140 characters. When I get stuck on Hacker News or Reddit for hours I feel terrible by the end of the day. Though I love good stories and movies, it’s easy to get hooked on Netflix, passively consuming information but not really doing anything. But I’d like to believe that we can train our brains to be not quite so helpless in the face of endless streams of juicy tidbits.

A growing body of research is showing that the human brain is an incredibly flexible organ. Neuroplasticity is the norm, not the exception. As the amount of information we need to process increases (and our tools to do so get better) our brains change to accommodate it all. That of course raises the question: how far can we push ourselves? Can we train our brains to not just flit from hyperlink to hyperlink but actually digest and understand large amounts of interconnected material with greater efficiency and accuracy? Can we ensure that Google makes us smarter and wiser, not stupider?

Though our reading habits (and by extension our general thought patterns) might be changing, the change is neither accidental nor inevitable. Instead of bemoaning the loss of the slow reading habits of yesteryear I think we should be trying to embrace the information-dense world around us. In particular, we need to stop thinking of deep reading and skimming as antagonistic to each other. Perhaps what we need to do is not read slower, but rather separate the physical act of reading from the mental act of comprehending what we have read. I would love to be able to read text fast, look up links and references, and then let the mass of information “ferment” in my brain. I’d like to be able to train my brain to keep working on what I’ve read after I’m done looking at the text, forming connections between concepts and ideas while I’m walking down the street or taking a shower.

Perhaps this is an exceedingly computer-science-centric way of thinking about the brain and thought processes. To be honest, I’ve been writing code and processing data algorithmically far longer than I’ve been learning about how the brain works, and I do tend to think of the brain primarily as an information processor. Unlike the author of The Atlantic article I’m not nearly as attached to the so-called “human” aspect of my intelligence (but that’s a matter for another blog post). I like settling down with a cup of coffee and a good book in a nice armchair as much as the next guy, but only on the weekends. During the week I’d like to come up with six impossible things before breakfast and figure out how to make them possible through the course of the day. To do that I need to keep the information machine fed; creativity doesn’t happen in a vacuum. I’d love to know how to do that better.

Screenplays for the Web

Yesterday I sat down to put some of my old screenplays online. Screenplays have a very specific format – monospaced fonts, strict rules for margins, etc. Unfortunately all those rules are for paper, and if there’s one thing I really don’t plan on doing, it’s distributing my writing on dead trees. But I still wanted to put my work online and have it look like a screenplay.

When I was taking my creative writing class last semester I used LaTeX to output nicely formatted PDFs to submit, and I wrote directly in LaTeX. Though PDFs are great for class submissions and printing, I’m very much an HTML fanboy when it comes to publishing online. Unfortunately LaTeX doesn’t seem to export directly to HTML. That’s understandable; HTML still has a way to go before it supports all the beautiful typographic nuances that LaTeX is capable of. There are some LaTeX-to-HTML converters out there, but I couldn’t get them to compile on my MacBook. Instead of trying to debug the compile process I threw some regexes at the existing LaTeX source and turned it into fairly semantic hypertext.

HTML is a flexible markup language, but there was some abuse of existing HTML elements involved in coming up with a structure that worked for screenplays. Each piece of dialog becomes a section tag and I’ve really abused the header and paragraph tags. If you can come up with a more semantically “correct” interpretation, I’d love to see it. Anyways, the translation went quickly and with some CSS the result isn’t bad, in my opinion. I converted one of my shorter pieces and put it on my website, if you care to take a look. The whole process took about half an hour including fiddling with regexes and CSS.
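
As a rough sketch of the idea, a couple of sed substitutions can do most of the heavy lifting. The \speaker and \dialogue macros below are hypothetical stand-ins for whatever commands your screenplay class actually defines:

# Sketch only: \speaker{...} and \dialogue{...} are hypothetical macro names;
# substitute the commands your own LaTeX screenplay setup uses. Each speaker
# line opens a section with a header and each dialog line becomes a
# paragraph that closes the section.
sed -e 's|\\speaker{\([^}]*\)}|<section class="dialog"><h3 class="speaker">\1</h3>|g' \
    -e 's|\\dialogue{\([^}]*\)}|<p class="line">\1</p></section>|g' \
    screenplay.tex > screenplay.html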

So much for taking a LaTeX screenplay and translating it to HTML. But what about writing a screenplay for the web first? By way of inspiration, Stories and Novels is a site that presents complete stories and novels in a beautiful web format (as well as Kindle editions). I’d love to see something similar for screenplays. Now admittedly, people don’t usually read screenplays the same way they read novels or stories, but who’s to say that once the trend starts it won’t pick up (and it would be an interesting experiment regardless)?

Of course, writing HTML (or any form of XML) by hand is not something I would wish on my worst enemy. It’s fine when working on design and layout, but I’d rather not write entire screenplays (or stories or novels or even blog posts) in raw HTML. Recently, lightweight markup languages such as Markdown and Textile have become popular. They’re designed to be easily converted to HTML and they feel natural to write in. Maybe we could come up with something similar for screenplays? Sounds like an interesting weekend project; I’ll let you know how it goes on Monday.

Looking ahead

It is now just about a third of the way through the first month of the year. I’m not really one for resolutions so I didn’t make any. In fact, I didn’t do anything by way of preparing for the start of a new year. However, there is one thing I’ve been wanting to do for years that I hope to finally get around to: concentrating more on my writing, and in particular paying more attention to this blog.

I’ve never really wanted to be a full-time blogger, not even a technology blogger. I’ve always preferred to be someone who wrote code (or at least studied writing code) and wrote about those experiences on the side. By and large, that’s been true. The thing is, though, that I really like writing. It’s a good break from coding and thinking about computer science research, and I enjoy communicating directly with people instead of machines for a change (which is why I refuse to pander to search engines and write SEO-directed stuff). Anyways, despite my not being a very regular writer, this blog has been moving along nicely. I get around 400 hits on an average weekday and that number has been going up steadily. I’ve been on Hacker News more than once and that’s always generated a good burst of traffic.

I’ve also been discovering technologists and scientists writing interesting and very useful blogs. These are people like danah boyd (Senior Researcher at Microsoft), Matt Might (CS professor at the University of Utah) and Andrea Kuszewski (a researcher at the George Greenstein Institute). I admire their blogs and their writing, but I also admire them for being dedicated scientists and researchers. These blogs reaffirm my belief that writing on a regular basis is important (and healthy) for everyone, especially if you’re involved in research and development of new technologies.

All that is a way of saying that I would like to blog more. Looking over my archives for last year, I only made about one post a week. Ideally I would like to increase that to two or three a week, not including the Sunday Selection link posts (I doubt I could keep up the quality with anything more than that). I also want to start tackling more technical subjects. I’ve been talking a lot about the intersection of technology and productivity for a while now, but I’m starting to get tired of the productivity aspect. Long story short, I’ve found the small set of everyday tools and environments that I need to get work done. For the foreseeable future it’s more a question of being able to stick to habits and schedules than of using the right tools. When I do speak of tools I want to give concrete examples (like my post on showing Git information in your Bash prompt) rather than handwavey suggestions.

On a related note I’ve been considering moving off WordPress.com. WordPress is great if you’re using their web-based interface but is harder to use if you live in Emacs. I’m starting to itch for a writing system that integrates well with Emacs. I’d like to be able to include my own HTML, CSS and JavaScript in my posts and be able to customize things a bit more than WordPress.com allows. I haven’t given much thought to this matter, but I’m looking at alternate systems such as Jekyll and Octopress. Whatever I decide to do I’ll probably test it out at my personal website before doing anything over here.

While this blog is definitely my most serious writing project, it’s not the only one. I took a few creative writing classes in college and enjoyed them immensely. I would like to be able to continue writing fiction (and maybe even get in shape for NaNoWriMo 2012). But for now I’ll be content with just regular blogging output. Glad to have you all along for the ride.

Show Git information in your prompt

I’ve been a sworn fan of version control for a good few years now. After a brief flirtation with Subversion I am currently in a long-term and very committed relationship with the Git version control system. I use Git to store all my code and writing and to keep everything in sync between my machines. Almost everything I do goes into a repository.

When I’m working I spend most of my time in three applications: a text editor (generally Emacs), a terminal (either iTerm2 or Gnome Terminal) and a browser (Firefox or Safari). When in Emacs I use the excellent Magit mode to keep track of the status of my current project repository. However my interaction with Git is generally split between Emacs and the terminal. There’s no real pattern, just whatever’s easiest and open at the moment. Unfortunately when I’m in the terminal there’s no visible cue as to what the status of the repo is. I have to be careful to run git status regularly to see what’s going on. I need to manually make sure that I’ve committed everything and pushed to the remote server. Though this isn’t usually a problem, every now and then I’ll forget to commit and push something on one of my machines, go to another, and then realize I’ve left all my work behind. It’s annoying and kills productivity.

Over the last few days I decided to sit down and give my terminal a regular indicator of the state of the current repository. So without further ado, here’s how I altered my Bash prompt to show relevant Git information.

Extracting Git information

There are generally three things I’m concerned about when it comes to the Git repo I’m currently working on:

  1. What is the current branch I’m on?
  2. Are there any changes that haven’t been committed?
  3. Are there local commits that haven’t been pushed upstream?

Git provides a number of tools that give you very detailed information about the state of the repo. That detail is only ever a few commands away when I need it, so I don’t want to see everything there is to see at every step. I just want the minimum information needed to answer the questions above.

Since the bash prompt is always visible (and updated after each command) I can put a small amount of text in the prompt to give me the information I want. In particular my prompt should show:

  1. The name of the current branch
  2. A “dirty” indicator if there are files that have been changed but not committed
  3. The number of local commits that haven’t been pushed

What is the current branch?

The symbolic-ref command shows the branch that the given reference points to. Since HEAD is the symbolic reference for the currently checked-out branch, we can use git symbolic-ref HEAD to get the full branch reference. If we were on the master branch we would get back something like refs/heads/master. We use a little Awk magic to get rid of everything but the part after the last /. Wrapping this into a little function we get:


# Print the name of the current branch, e.g. "master" from "refs/heads/master"
function git-branch-name
{
    echo $(git symbolic-ref HEAD 2>/dev/null | awk -F/ '{print $NF}')
}
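
For example, with the master branch checked out, the raw command and the function behave like this:

$ git symbolic-ref HEAD
refs/heads/master
$ git-branch-name
master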

Has everything been committed?

Next we want to know if the branch is dirty, i.e. if there are uncommitted changes. The git status command gives us a detailed listing of the state of the repo. For our purposes the part that matters is the very last line of the output. If there are no outstanding changes it says “nothing to commit (working directory clean)”. We can isolate the last line using the Unix tail utility, and if it doesn’t match the above message we print a small asterisk (*). This is just enough to tell us that there is something we need to know about the repo and that we should run the full git status command.

Again, wrapping this all up into a little function we have:

# Print "*" if there are uncommitted changes in the working directory
function git-dirty {
    local st=$(git status 2>/dev/null | tail -n 1)
    if [[ $st != "nothing to commit (working directory clean)" ]]
    then
        echo "*"
    fi
}
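
As an aside, newer versions of Git phrase that last line differently (“nothing to commit, working tree clean”), so matching on the human-readable message is brittle. A sturdier sketch uses git status --porcelain, whose machine-readable output is stable across versions and empty when the tree is clean:

# Alternative: --porcelain output is empty when the working directory is clean
function git-dirty {
    [[ -n $(git status --porcelain 2>/dev/null) ]] && echo "*"
}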

Have all commits been pushed?

Finally we want to know if all local commits have been pushed to the respective remote branch. We can use the git branch -v command to get a verbose listing of all the local branches. Since we already know the name of the branch we’re on, we use grep to isolate the line that tells us about our branch of interest. If we have local commits that haven’t been pushed, the status line will say something like “[ahead X]”, where X is the number of commits not pushed. We want to get that number.

Since what we’re looking for is a very well-defined pattern I decided to use Bash’s built-in regular expressions. I provide a pattern that matches “[ahead X]” where X is a number. The matched number is stored in the BASH_REMATCH array. I can then print the number, or nothing if no such match is present in the status line. The function we get is this:

# Print "(N)" where N is the number of local commits not yet pushed upstream
function git-unpushed {
    local brinfo=$(git branch -v | grep "$(git-branch-name)")
    if [[ $brinfo =~ ("[ahead "([[:digit:]]*)]) ]]
    then
        echo "(${BASH_REMATCH[2]})"
    fi
}

The =~ is Bash’s regex match operator, and the pattern to match against follows it.
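
Alternatively, if your version of Git understands the @{u} upstream shorthand, here’s a minimal sketch that asks Git for the count directly instead of scraping the branch listing:

# Alternative: count commits on HEAD that the upstream branch doesn't have yet
function git-unpushed {
    local n=$(git rev-list --count @{u}..HEAD 2>/dev/null)
    if [[ -n $n && $n -gt 0 ]]
    then
        echo "($n)"
    fi
}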

Assembling the prompt

All that’s left is to tie together the functions and have them show up in the Bash prompt. I use a little function to check whether the current directory is actually part of a repo. If the git status command returns only an error and nothing else, then I’m not in a Git repo and the functions above would give nonsense results. This function checks the git status output and then either calls the other functions or prints nothing.

# Print the combined Git information, or nothing if we're not in a repo
function gitify {
    local status=$(git status 2>/dev/null | tail -n 1)
    if [[ $status == "" ]]
    then
        echo ""
    else
        echo $(git-branch-name)$(git-dirty)$(git-unpushed)
    fi
}
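
On a dirty branch named master with two unpushed commits, the output looks like this:

$ gitify
master*(2)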

Finally we can put the prompt together. Bash allows some common system information to be displayed in the prompt. I like to see the current hostname (to know which machine I’m on if I’m working over SSH) and the path to the directory I’m in; that’s what the \h and the \w are for. The Git information comes after that (if there is any), followed by a >. I also like to make use of Bash’s color support.

function make-prompt
{
    # \033 is the ANSI escape character; \[ and \] mark the color codes as
    # non-printing so Bash calculates the prompt width correctly
    local RED="\[\033[0;31m\]"
    local GREEN="\[\033[0;32m\]"
    local LIGHT_GRAY="\[\033[0;37m\]"
    local CYAN="\[\033[0;36m\]"

    PS1="${CYAN}\h\
${GREEN} \w\
${RED} \$(gitify)\
${GREEN} >\
${LIGHT_GRAY} "

}
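
To put this into effect, define all of these functions in ~/.bashrc (or a file sourced from it) and call make-prompt at the end, so PS1 gets set for every new shell:

# at the end of ~/.bashrc, after the function definitions
make-prompt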

Conclusion

I like this prompt because it gives me just enough information at a glance. I know where I am, whether any changes have been made and how far I’ve diverged from the remote copy of my work. When I’m not in a Git repo the Git information disappears. It’s clean, simple and informative.

I’ve borrowed heavily from both Jon Maddox and Zach Holman for some of the functionality. I didn’t come across anyone showing the commit count, but I wouldn’t be surprised if lots of other people have it too. There are probably other ways to get the same effect, this is just what I’ve found and settled on. The whole setup is available as a gist so feel free to use or fork it.