A blog for every purpose

I’ve been blogging in one form or another, on and off, for about 5 years now. And in that time I’ve moved across services, URLs and styles. The ByteBaker has been my most permanent endeavor and I plan on keeping it that way. I try to keep the ByteBaker focused on technology but there are also a lot of other things that I want to share and write about. I’ve also run a tumblelog called Basu:shr:weblog. It’s a list of things that I find interesting on the web, mostly videos and images, sometimes links and quotes.

On the 26th I’m heading off to Italy for three weeks to study art and the Renaissance. I expect it to be a great learning experience and an overall fun time. As part of the course I need to keep a pen-and-paper journal, but I also plan on keeping an electronic log of things I do (including photos, lots of photos). I’ve started this blog already and I’m calling it Status Quo. I’m hosting it as a free WordPress.com blog, and I’m not quite sure what to do about the photo hosting yet. WordPress already gives me 3GB of storage for media, which should be sufficient for the time being, but if not, Flickr or SmugMug are possible choices.

It’s been a while since I wrote anything of a personal nature so this is going to be an interesting experiment. I’ll also be keeping a written, pen-and-paper journal and there will be some overlap. However, since I type much faster than I can write, I expect Status Quo to be a superset of anything I put in my journal. All this is contingent on getting access to an Internet connection. I’ll be taking my netbook and since we’re supposed to be in good hotels I hope that there will be wifi, at least in some of them. If not, I hear that there are a lot of Internet cafes in Florence, so I’ll be making regular trips (maybe not daily, but probably every other day).

Here’s to looking forward to exploring Italy and sharing the experience with anyone who cares to listen.

Will Diaspora live up to the hype?

The net is ablaze with Facebook’s privacy disaster and the Diaspora project has already drummed up over $170,000 in support. And the question everyone is asking is: Is Diaspora going to be the Facebook that we all want? The answer would be complicated even if there weren’t so much money and so many high hopes involved. Going up against the incumbent is never easy, even if the incumbent is in a tight spot. Even though I love open source, I can’t help having some doubts over the Diaspora project.

An open source, community-centric alternative to Facebook would be absolutely awesome, no doubt about that. But there are problems with both the idea in general and the way Diaspora is specifically implementing it. A lot of Facebook’s usefulness comes from the fact that it offers a seamless way to do a number of different things. When it comes to sharing something with friends, Facebook probably has a way to do it. You can share text, links, music, photos and videos all through Facebook. You can also send them in public (via the Wall) or in private with the private message system. An open source solution could work by tying together open protocols to support specific parts of the user experience. However, that integration has to be smooth and very well done. In fact, users should not be able to tell that there are multiple services operating underneath instead of a single monolithic entity. If people need to sign up with five different services to do what they do with one login on Facebook, the project is dead in the water.

By allowing each user to run their own server, Diaspora is trying to make their system as open as possible. That’s a great idea, but expecting each internet user to operate their own server is not a good way to go. Opera tried that with its Unite project, which has been pretty much a failure. Users do not want to run a server. They want to talk to their friends. I think what the Diaspora team wants to do is build on the WordPress model: the actual software is fully open source and anyone can put it on their own server and run it. But there is also WordPress.com, which offers an easy-to-use setup that you can use without worrying about server administration. Diaspora can go down that path, but they will have to live with the fact that the majority of users will be using a hosted solution and not running their own server. And I’m not sure if that is something they are ready to do.

There is also the problem that Jason Fried points to: the team already has a lot of money (a lot for 4 people at least) and has nothing to show for it yet. They have also had a lot of attention turned on them and are under great pressure to deliver. I’m not saying that is necessarily a problem: I know people who thrive under pressure. But it would probably be easier if they had a smaller amount of money and could concentrate on getting things done instead of worrying about how everyone is looking at them. Without knowing the team personally, I don’t know if this is a valid concern, but it’s definitely something to keep in mind. It’s also a stark contrast to how Facebook grew: from Harvard students to college students in general and then to everyone.

No one wants Diaspora to fail. And that by itself could be a problem. If Diaspora does fail, they could take all the other open source efforts down with them. And that would mean handing identity on the web to Facebook on a silver platter. Will Diaspora work? I don’t know. In cases like this I go by Torvalds’ words: Talk is cheap, show me the code. I’m going to reserve judgment until I actually see some code. I hope they succeed, I really want them to. I love how Facebook lets me stay in touch with friends, but I hate walled gardens. However, there are issues and concerns which must be answered. So until summer ends and the Diaspora team delivers, I’m going to watch and wait. And not delete my Facebook account just yet.

What are the important problems in computer technology?

Exams are over, summer is here and it’s time to think about how I’ll be spending my time over summer and then the rest of the year. Next year is going to be my last year as an undergraduate student and I plan on working on an honors thesis. I’m really looking forward to working on it since I like doing research and being able to work independently. The problem is that I need a specific topic and I’m having a hard time focusing on one. I don’t want to do something that’s already been done, or something that’s superspecialized to death, but I don’t want to bite off more than I can chew. I’m going to have about 9 months, part time, with other courses and graduate school applications, and at the end of it I want to have something that’s a clean finished product (as cleanly finished as research products ever get).

The problem with being an undergraduate researcher is that I don’t have enough experience to know how big of a project is a good size. Again, I don’t want to think too much about this and end up limiting myself from what I can achieve. So how should I choose my project? First, I want to get my priorities straight. I’d rather do something that’s ambitious and end up with something that’s a little incomplete than do something I can neatly finish, but is mundane and unexciting. That being said, I don’t want to do something that I have no chance of finishing. I’m not looking to prove or disprove P = NP (and I’m not a theory guy anyway) but I don’t want to write yet another Python/Ruby/Java clone.

I remember reading Paul Graham’s essay on procrastination a while ago. What stuck out to me most was his summary of Richard Hamming’s essay on research. One of the core themes of that essay is asking yourself three basic questions:

  1. What are the important problems in your field?
  2. Are you working on one of them?
  3. Why not?

Hamming was talking to professional scientists and researchers and I’m just a wee young undergraduate working on an honors thesis, but it’s never too early to get started. So, in keeping with Hamming’s excellent advice, what are the important problems in computer science? Now that’s a tricky one. Of course, there are the classic ones. Prove or disprove P = NP, build true AI, build a working quantum computer etc. etc. Then there are the ones with been-there done-that written all over them. For example, the current software stack is a bit of a mess so let’s build a unified top-to-bottom hardware-software solution all in Lisp (along with a YouTube viewer).

But seriously, what’s a real, current problem that I can take a chunk of and have a hope of solving? For starters, there’s parallel computation. Engineers are piling on the cores with no sign of stopping and we have absolutely no idea what to do with all of them. Most of our software is still inherently single threaded, and dealing with concurrency, with all the ills of shared memory, is a pain. New programming languages and innovations like software transactional memory are making a valiant effort, but they still aren’t tools that you can ship off to everyday programmers and have them cranking out parallel code. I’ve been studying parallel code (mostly in terms of GPUs) for the last semester and it’s definitely something I can get into. However, parallel computing is still a wide area and I’d need to find something more specific. I’m working on a different sort of parallelization over the summer (MapReduce with Hadoop) so that’ll give me something more to think about.

I posted to the Arch Linux forum about just this issue and one of the replies that came up multiple times was network security. The internet is everywhere and a computer without a network connection in this age is pretty much useless. And when you connect your computer to dozens of other ones spread out all across the world, there’s a good chance that one or more of them don’t want to play nice. With botnets, viruses, trojans, worms and the like being a daily reality, security is important and no one seems to have a solution that’s even remotely bulletproof. Unfortunately for me, security just doesn’t interest me that much. I’d rather be building interesting stuff than holding down the fort. So on to the next one.

I’m interested in the way that people interact with their machines, not really in the up-front, HCI sense, but in a more general sense. One problem that I find interesting is how we can manage all the data we have on our machines without having to explicitly name (and then remember the name of) each and every single piece. The filesystem metaphor does not scale to thousands of songs and images or to the many tiny bits of text and video that we’re continually sharing with each other. Unfortunately, none of the popular operating systems seem to be meeting this challenge head on (iLife is going in the right direction, but the ideas need to spread to the rest of the system). I’d love to come up with an implementation of a file-less, data-centric user model, probably along the lines of what the Etoile project is trying to accomplish. This is certainly something that I’ll be thinking about more (and I need to come up with a more theoretical basis for an honors project).

There are probably a bunch of other things that could qualify for being among the most important problems in computer science and I could add another few thousand words to this post. But suffice it to say that this is still something that I need to give a lot more thought to. I still have no idea what my thesis topic could be, but I know what I’ll be doing over summer. I’m not going to be solving the multicore crisis or building a completely new operating system from scratch, but I’m definitely going to be taking my best crack at it.


Comments need to be in blog order

Most blogs on the web are in reverse chronological order: the most recent article shows up first. I think this works pretty well for a reader, because you get to see the most recent, current state of the blog and if you like what you see then you can easily dig deeper. Also, you can easily see if a blog has been inactive for a long time and move on if you don’t care about the back-issues. I’m going to be calling this ordering “blog order” for the rest of the post.

Comments on blogs are typically in the opposite order. The first comment you see will be the oldest and the most recent comment will be at the end. In some ways this makes sense. The comments are generally conversations that people are having about the post. It makes sense to have the comments in chronological order so that people can follow the conversation as it progresses and so that people read what other people have written before commenting themselves. However, this doesn’t really scale beyond a few dozen comments. There are going to be very few people who will care to follow a conversation that spans a hundred comments. The majority of people care more about expressing what they have to say than about reading in depth about the rest of the conversation. Of course this means that there is going to be repetition because these people haven’t read what’s written before.

For me there is another aspect that I think is worse than some repetition. Seeing a line of a few hundred comments, I often decide not to write a comment at all. After all, my comment is going straight to the end of the line and how many people (apart from the original post author) are seriously going to take the time to go that deep to read my comment? As a blog writer I think that is a serious issue because I don’t want people staying away from commenting because they think no one is going to read their comments.

What if blog comments were also in blog order? That way, the most recent comments are on top and readers get encouraged to write their thoughts. But this makes the repetition problem even more serious and also threatens the conversational nature of the comment system. You can’t really get in on a conversation unless you know what has been talked about before. Putting comments in blog order makes it easy to miss previous conversations and again reduces the effectiveness of comments as community builders.

Instead of making the comment the unit of organization, what if we shift focus to the conversation thread? Let’s take the most recent conversation thread and bump it to the top. Inside each thread, the individual comments are still in chronological order. This ordering has two important characteristics:

  1. The newest, most active conversations rise to the top. Readers can see what others are talking about and join right in. And if there aren’t any long conversations currently going on, the latest comment is on top, helping individual comments get read.
  2. Each conversation is chronological, encouraging readers to read through what’s been said before adding their own contribution.
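
To make the ordering concrete, here is a rough Python sketch of how it might be computed. It is only an illustration, not how any real blog engine does it: the `Comment` class, its fields (`thread_id`, `posted_at`) and the `conversation_order` function are names I made up, and I’m assuming every comment already knows which conversation thread it belongs to.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Comment:
    thread_id: int  # hypothetical: which conversation this comment belongs to
    posted_at: int  # hypothetical: timestamp, larger means more recent
    author: str
    text: str


def conversation_order(comments):
    """Return a list of threads, most recently active thread first.

    Each thread is a list of comments in chronological order.
    """
    threads = defaultdict(list)
    for comment in comments:
        threads[comment.thread_id].append(comment)

    # Inside a thread: oldest comment first, so the conversation reads in order.
    for thread in threads.values():
        thread.sort(key=lambda c: c.posted_at)

    # Across threads: rank by each thread's newest comment, newest first.
    return sorted(threads.values(), key=lambda t: t[-1].posted_at, reverse=True)
```

A lone comment is just a thread of length one, so when nothing is being discussed at length the latest comment still ends up on top.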

The idea to emphasize the conversation over the discrete message isn’t new: it’s the main distinction that separates forums from email. It’s also the reason Gmail is so awesome: excellent support for threaded conversations. There are of course challenges to be addressed, especially in terms of UI and how to deal with threads that encapsulate divergent conversations. But these problems will only get addressed as they become more common. I think it’s about time that the web moved from the flat chronological comment system that is so popular to a richer, more useful one that plays an active role in fostering conversations and community.