Exams are over, summer is here and it’s time to think about how I’ll be spending my time over the summer and then the rest of the year. Next year is going to be my last year as an undergraduate student and I plan on working on an honors thesis. I’m really looking forward to working on it since I like doing research and being able to work independently. The problem is that I need a specific topic and I’m having a hard time focusing on one. I don’t want to do something that’s already been done, or something that’s superspecialized to death, but I don’t want to bite off more than I can chew. I’m going to have about 9 months, part time, alongside other courses and graduate school applications, and at the end of it, I want to have something that’s a clean finished product (as cleanly finished as research products ever get).
The problem with being an undergraduate researcher is that I don’t have enough experience to know how big a project should be. Again, I don’t want to overthink this and end up limiting what I can achieve. So how should I choose my project? First, I want to get my priorities straight. I’d rather do something that’s ambitious and end up with something that’s a little incomplete than do something I can neatly finish, but is mundane and unexciting. That being said, I don’t want to do something that I have no chance of finishing. I’m not looking to prove or disprove P = NP (and I’m not a theory guy anyway) but I don’t want to write yet another Python/Ruby/Java clone.
I remember reading Paul Graham’s essay on procrastination a while ago. What stuck out to me most was his summary of Richard Hamming’s essay on research. One of the core themes of that essay is asking yourself three basic questions:
- What are the important problems in your field?
- Are you working on one of them?
- Why not?
Hamming was talking to professional scientists and researchers and I’m just a wee young undergraduate working on an honors thesis, but it’s never too early to get started. So, in keeping with Hamming’s excellent advice, what are the important problems in computer science? Now that’s a tricky one. Of course, there are the classic ones: prove or disprove P = NP, build true AI, build a working quantum computer, and so on. Then there are the ones with been-there done-that written all over them. For example, the current software stack is a bit of a mess so let’s build a unified top-to-bottom hardware-software solution all in Lisp (along with a YouTube viewer).
But seriously, what’s a real, current problem that I can take a chunk of and have a hope of solving? For starters, there’s parallel computation. Engineers are piling the cores on with no sign of stopping and we have absolutely no idea what to do with all of them. Most of our software is still inherently single threaded, and dealing with concurrency and all the ills of shared memory is a pain. New programming languages and innovations like software transactional memory are making a valiant effort, but they still aren’t tools that you can ship off to everyday programmers and have them cranking out parallel code. I’ve been studying parallel code (mostly in terms of GPUs) for the last semester and it’s definitely something I can get into. However, parallel computing is still a wide area and I’d need to find something more specific. I’m working on a different sort of parallelization over the summer (MapReduce with Hadoop) so that’ll give me something more to think about.
I posted to the Arch Linux forum about just this issue and one of the replies that came up multiple times was network security. The internet is everywhere and a computer without a network connection in this age is pretty much useless. And when you connect your computer to dozens of other ones spread out all across the world, there’s a good chance that one or more of them don’t want to play nice. With botnets, viruses, trojans, worms and the like being a daily reality, security is important and no one seems to have a solution that’s even remotely bulletproof. Unfortunately for me, security just doesn’t interest me that much. I’d rather be building interesting stuff than holding down the fort. So on to the next one.
I’m interested in the way that people interact with their machines, not really in the up-front, HCI sense, but in a more general sense. One problem that I find interesting is how we can manage all the data we have on our machines without having to explicitly name (and then remember the name of) each and every single piece. The filesystem metaphor does not scale to thousands of songs and images or to the many tiny bits of text and video that we’re continually sharing with each other. Unfortunately, none of the popular operating systems seem to be meeting this challenge head on (iLife is going in the right direction, but the ideas need to spread to the rest of the system). I’d love to come up with an implementation of a file-less, data-centric user model, probably along the lines of what the Etoile project is trying to accomplish. This is certainly something that I’ll be thinking about more (and I need to come up with a more theoretical basis for an honors project).
There are probably a bunch of other things that could qualify for being among the most important problems in computer science and I could add another few thousand words to this post. But suffice it to say that this is still something that I need to give a lot more thought to. I still have no idea what my thesis topic could be, but I know what I’ll be doing over the summer. I’m not going to be solving the multicore crisis or building a completely new operating system from scratch, but I’m definitely going to be taking my best crack at it.