I’ve been interested in online backups of my data for a long time. I started by simply keeping my data on a free FTP server. With the rise of easy-to-use Web 2.0 online storage, I tried a number of solutions, including Box.net. I also made a backup to my Gmail account once using Gspace. When I tried the web office suites from Zoho and Google, I also uploaded most of my documents. But time went by, and I never really managed to stick to a single solution. I’ve also made DVD backups of my data occasionally, but never with any regularity. A few months ago, I decided to drop third-party solutions and instead set up a personal Subversion server on an older Mac that I had. This system has been pretty effective at keeping my day-to-day files backed up and synced across multiple machines.
While the home Subversion system does a good job of keeping backups of my most important documents, I don’t put everything in my SVN repository. In particular, my music and my photos live mostly on a single machine (my Mac Mini). I’ve recently started looking for an efficient way to create large-scale backups of all my files at less regular intervals than my SVN commits. I considered using an external hard drive, but even if I backed up all my files, I would be using only a small amount of the space available. Plus, being a college student, I didn’t want another piece of hardware to lug around. I also wanted a backup that I could access no matter where I was: something online.
I looked at the Web 2.0 solutions, but none of them are cost-effective for the large amounts of data I plan on backing up. My music weighs in at around 16GB and will keep growing. My photos are about 2GB and will also keep growing. I would also like to back up copies of software discs I have paid for. There aren’t many of them, but they still add up to a few gigabytes. Add to that all my regular documents and other files. All told, I’m looking at something in the range of 25GB for a first-time backup, growing over time. Considering that I’m a starving college student, the cheaper the better.
Enter Amazon Simple Storage Service
Amazon S3 is an industry-standard storage solution that can easily handle many terabytes of storage and bandwidth. What’s also important is that it is very cheap. For the amount of storage I’ll be using, I’ll be spending 15 cents per GB per month for storage and 10 cents for each gigabyte uploaded. Here’s a rough calculation of what my costs will be:
Initial Backup: ($0.15 per GB * 25GB) + ($0.10 per GB * 25GB) = $3.75 + $2.50 = $6.25
Monthly Running Cost: ($0.15 per GB * 25GB) + ($0.10 per GB * 2GB) + ($0.17 per GB * 1GB) = $3.75 + $0.20 + $0.17 = $4.12
For the running cost I estimated an upload of 2GB a month and a download of 1GB, with downloads costing 17 cents per GB. Though this is probably more than what I will actually be using (especially the download), it should be a fairly good estimate, since the amount stored will gradually go up. So for less than the price of a regular lunch I’ll be able to keep all my important files backed up in a safe online location.
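If you want to plug in your own numbers, the arithmetic above can be sketched in a few lines of Python. The prices are the ones quoted in this post and will certainly change over time; the function names are just my own for illustration, not part of any Amazon tool.

```python
# Prices quoted in this post, in cents, to avoid floating-point rounding.
# These are assumptions taken from the text; check current S3 pricing.
STORAGE_CENTS_PER_GB_MONTH = 15
UPLOAD_CENTS_PER_GB = 10
DOWNLOAD_CENTS_PER_GB = 17


def initial_backup_cost(stored_gb):
    """Cost in dollars of the first month: store everything and upload it once."""
    cents = stored_gb * STORAGE_CENTS_PER_GB_MONTH + stored_gb * UPLOAD_CENTS_PER_GB
    return cents / 100


def monthly_cost(stored_gb, uploaded_gb, downloaded_gb):
    """Cost in dollars of a regular month: storage plus that month's transfers."""
    cents = (
        stored_gb * STORAGE_CENTS_PER_GB_MONTH
        + uploaded_gb * UPLOAD_CENTS_PER_GB
        + downloaded_gb * DOWNLOAD_CENTS_PER_GB
    )
    return cents / 100


print(f"Initial backup: ${initial_backup_cost(25):.2f}")
print(f"Monthly running cost: ${monthly_cost(25, 2, 1):.2f}")
```

Changing the first argument of `monthly_cost` shows how the bill grows as the music and photo collections do.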
The catch is that Amazon S3 is not meant to be a storage solution for general users. It’s an enterprise-quality system made to plug directly into a high-performance online service. As a result, S3 offers a fully functional API to write programs around, but there’s no easy-to-use interface for users to manage their uploads. Luckily, there are a number of third-party tools available that fill the need. Here’s a somewhat outdated list of some available tools. The client that I’ll be using is called JungleDisk. It’s a wonderful cross-platform tool that maps your S3 storage as a storage drive on your computer. This means that you can use it as you would any storage disk attached to your machine, and you can also run scripts that automatically back up data from other parts of your computer to S3. JungleDisk also provides its own automation facilities to back up your data regularly. No more having to remember to back up once a month.
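As a rough idea of what such a script could look like, here is a minimal Python sketch that copies new or changed files into a folder on the mapped drive. It knows nothing about JungleDisk itself; it only assumes the S3 storage shows up as an ordinary folder. The paths in the commented-out example are hypothetical, and the function name is my own.

```python
import shutil
from pathlib import Path


def backup_folder(source, destination):
    """Copy files from source to destination if they are missing or newer.

    Returns the list of destination paths that were written.
    """
    source = Path(source)
    destination = Path(destination)
    copied = []
    for src_file in source.rglob("*"):
        if not src_file.is_file():
            continue
        dest_file = destination / src_file.relative_to(source)
        # Skip files whose backup copy is at least as recent as the original.
        if dest_file.exists() and dest_file.stat().st_mtime >= src_file.stat().st_mtime:
            continue
        dest_file.parent.mkdir(parents=True, exist_ok=True)
        # copy2 also copies the modification time, which the check above relies on.
        shutil.copy2(src_file, dest_file)
        copied.append(dest_file)
    return copied


# Example call (hypothetical paths; adjust to wherever your drive is mounted):
# backup_folder(Path.home() / "Pictures", "/Volumes/JungleDisk/Pictures")
```

Note that this copies whole files, so every changed file counts in full toward the upload cost, and it never deletes anything from the backup. If the mapped drive does not preserve modification times, files may be copied again on every run.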
JungleDisk costs $20.00, but I think that’s an acceptable price, considering that you can install it on as many machines as you like (including Mac, Windows and Linux systems) and you get free lifetime updates, meaning you never pay for the software again. For a dollar a month you can get the JungleDisk Plus service, which lets you access your files via a web interface, allows resuming uploads if they are interrupted and lets you upload only the changed parts of large files (hence saving upload costs). At this point, I don’t think I’ll have a need for Plus, but it’s a good choice if you travel a lot and plan on using S3 as your primary syncing mechanism.
Starting next month…
I’ll be backing up to S3 regularly via JungleDisk. I plan on making the initial transfer this weekend (while recovering from Halloween parties). Before that I need to get my files organized and decide what I will back up and what I won’t. I’ll post a followup once I’ve been using the service for a while to see if it really is worth the cost.