Amazon’s Digital Wonderland

A few weeks ago I found myself in Seattle, WA. Contrary to popular belief, it was a rather bright and sunny few days (if somewhat chilly). Here’s an obligatory picture of the Space Needle.

Space Needle

Anyways, on the first day there I fought a mostly losing battle against travel-induced tiredness (I was up at 4:30 in the morning) and walked around downtown for a while, somewhat zombie-like. I spent most of the next day in one of Amazon’s new buildings attending their first ever PhD Symposium. I got to meet Amazon employees like Swami Sivasubramanian, one of the creators of Amazon’s Dynamo database, as well as fellow graduate students like Rahul Potharaju. The day was full of interesting presentations and the breaks in between were packed with lots of cool conversations. I presented my current project, Merlin (excuse the visuals), and got some good feedback. All in all it was a great day. I had a wonderful time and I hope Amazon keeps having more of these research symposia.

But that’s not what this post is about. Personally, I think of Amazon as a retailer first and a technology company second. In fact, I’ve even written a post about their exemplary customer service. Even though I’ve known about EC2 for years and have used both S3 and Glacier as personal backup, the idea of Amazon as a technology company has always been at the back of my mind. In fact, it was only while attending the symposium that I really thought about the full weight of Amazon as a technology services company.

After coming home I looked up the keynote from Amazon’s recent re:Invent conference. The keynote shows off some of their more interesting recent technology (including new EC2 instances) as well as client technologies built on top of it (including companies like Netflix and Vimeo). I also stumbled across Dave Winer’s post on Amazon’s support of static JavaScript applications and why that’s so interesting and important.

The more I think about it, the more I like Amazon. They make incredible technology, employ lots of really smart people and have a refreshingly honest and direct business model in an industry dominated by advertising and harvesting user data. From computation, to storage, to scalable DNS, Amazon offers a suite of services that’s just about stunning in its breadth. Though I’ve had little use for their services personally (apart from Glacier for backup), I can see myself extensively using their systems and technology if I were building any type of scalable, distributed service.

Even as I write this, I’m trying to come up with excuses for trying out more of their technology. What would I build? I honestly don’t know. But looking at the range of Amazon technologies and thinking about the possibilities reminds me of the feelings I got when I first started programming and learning about computers.

In many ways, the world has changed since I started writing code about 12 years ago. I had a lot of fun writing LOGO and BASIC programs and then hacking together little Perl scripts. Today I find myself wondering what the loosely coupled services and technologies offered by Amazon and other cloud computing services enable. I wonder if the new programmers of today, still learning on primarily single-threaded, single-box computing platforms, should be encouraged to move on to the brave new world of instantly accessible, practically unlimited computing power. I wonder what we’d achieve if we took distributed, connected computation as the starting point, rather than the state of the art.

As an ending note, let’s think about Microsoft. It’s become standard to talk about Google as today’s Microsoft, but I’m starting to wonder if that title doesn’t rightfully belong to Amazon. I’m not talking about monopolistic activities or questionable business practices, but rather their similarities in making computing more popular. Microsoft’s goal (ostensibly) was to put a computer in every household. Amazon, for its part, has commoditized high-powered computing and distributed systems and made them available to people with modest budgets. I suppose the more things change, the more they stay the same.

Sunday Selection 2013-11-17

It’s a grey and rather dreary day here in Ithaca, NY (though it was quite bright and pleasant yesterday). I’ve been reading a lot more lately (since I essentially quit watching TV). I got through a lot this week, but I still have a lot to keep me busy on cold rainy days like this. Here’s a quick selection of my reading for this week:

Around the Web

Why GitHub is not your CV

Personally, I love open source. I love having access to lots of great code and I enjoy contributing back and releasing my own work. However, I’m very aware that in many ways I’m exceptionally lucky: I have a stable job that lets me (in fact, encourages me to) open source and release my code to the public. At the same time, I know lots of people who don’t have the incentives or the opportunity to release open source code, even though they’re good engineers and most companies would hire them in a heartbeat. This is a long article that takes a good look at the expectations of open source development (and the role of particular services like GitHub) in the broader software culture.

Anyone can learn to be a polymath

I’ve always been intrigued by polymaths and the ideal of the Renaissance Man. While the sum of human knowledge (and capabilities) is too great for anyone to be a master of all of it, I think that we would do well to remember that specialization is for insects, and that humans are very versatile, adaptable creatures. However, modern society seems to incessantly push us towards specialization and narrowness. If we are to unlock our human potential, then we have to take the initiative ourselves.

The Big Data Brain Drain

Academia and industry have always had a somewhat begrudgingly respectful appreciation of each other. But what happens when the skills that are increasingly necessary to do good research and make discoveries are rewarded by industry but academia is a little slow on the uptake?

Multiplicity

Over the last weekend I played around with an interesting service called Editorially. It’s essentially a stripped down, online text editor, with support for Markdown formatting. However, its most attractive feature is its support for multiple versions, collaboration and editing. It’s an interesting project and it just added WordPress and Dropbox export (I wrote the first draft of this post in it and then exported to WordPress). As with many such services, I’d rather use a text editor and git to get the same effect. However, what interests me more than the service itself is something I read in their blog post announcing their export features:

On the web today, a single article may be published on the writer’s personal blog, collected in an ebook, syndicated on several magazine or news sites, and reblogged across different platforms and communities.

This notion of having the same piece of work shared in multiple places is not new, but is becoming increasingly popular, especially with the rise of group blogs (often with guest authors), online curation, and services like Medium. Craig Mod, whom I find to be one of the most insightful writers on the intersection of technology and publishing, started one of his recent pieces with this not-quite-disclaimer:

This was originally published in Hiut Denim’s yearbook. I’ve republished it here, over on Hi and over at Medium because, well, the beauty of the web is multiplicity. More on that later.

Multiplicity. I like that word. And yes, there is something to admire in just how easy it is to copy and share on this, the modern Web. But is it a thing of beauty? I’m much less certain than Mr. Mod, especially since this form of multiplicity is heavily dependent on third-party, often proprietary services with motives that are unclear at best and questionable at worst.

What Mr. Mod dubs “multiplicity” and calls beautiful can be explained in older, cruder terms: copy-and-paste. In many ways, the current web does us a disservice: we have been taught to accept, and have settled for, multiplicity when what we really wanted was transclusion, a concept first described by Ted Nelson, an early pioneer of applied computation.

Whereas multiplicity on the web takes the form of copy-and-paste, transclusion would take the form of reference. Instead of taking the literal text (and perhaps the styling and images) of a document and replicating it for each copy, we have a single canonical copy of a document (and by document I mean any information object) that can be referenced and transcluded from other places. For example, Mr. Mod could have published the original piece on his website and Hi and Medium would simply transclude it in their own versions. When someone visited the Hi or Medium pages, they would reach out and embed the original post’s content within their contexts.

Transclusion offers many advantages over copy-and-paste. For one, any changes to the original are automatically reflected wherever it is transcluded. Second, attribution becomes much easier. Instead of carefully maintaining references to where you found a particular piece of information or text, the transclusion machinery can manage it for you. In fact, such a system needs to keep proper source information to work properly. Transclusion also makes the job easier for automated systems like search engines. Instead of coming across multiple versions of the same text in different places, a crawler would simply follow back the transclusion links and be able to index the original authoritative copy.
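To make the difference concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `Store`, `Transclusion` and `render` names and the `example.org/essay` address are made up for illustration and don't correspond to any real system (certainly not to Nelson's own design). It only tries to show the two properties described above: a page that transcludes picks up edits to the original, and the source address travels with the content.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """A canonical document living at a single (hypothetical) address."""
    address: str
    text: str


class Store:
    """Toy stand-in for 'the web': maps addresses to canonical documents."""

    def __init__(self):
        self._docs = {}

    def publish(self, address, text):
        self._docs[address] = Document(address, text)

    def update(self, address, text):
        # Editing the canonical copy; no transcluding page is touched.
        self._docs[address].text = text

    def resolve(self, address):
        return self._docs[address]


@dataclass
class Transclusion:
    """A reference to a canonical document, not a copy of its text."""
    source: str


def render(page, store):
    """Render a page made of literal strings and Transclusion references.

    Literal strings are emitted as-is (copy-and-paste). References are
    resolved at render time, so they always show the current original,
    and the source address is attached automatically.
    """
    parts = []
    for part in page:
        if isinstance(part, Transclusion):
            doc = store.resolve(part.source)
            parts.append(f"{doc.text} [source: {doc.address}]")
        else:
            parts.append(part)
    return "\n".join(parts)
```

If the original is published, copied into one page and transcluded into another, and then edited with `store.update(...)`, rendering the copy still shows the stale text while rendering the transclusion shows the new one along with its source.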

Copy-and-paste certainly has a place, even in a hypothetical transclusion-enabled Web. One major application is of course backup and archival, which would be impossible if there were only ever a single copy. That being said, personally I would rather have transclusion than not. For one thing, it would make navigating the current morass of social media and publication startups easier.

Today, if I write something (say this blog post) and want to put it online, I have to decide where to put it. I could put it on my own website, hosted on my own server, accessible at my own domain name, where I retain full control. However, maybe I want the attention generated by publishing on a platform like Medium or Tumblr. Maybe I also want to post a link and excerpt to Facebook and Twitter and Google+. Once people read it, they might want to make posts about it on their blogs and reference it. It might get picked up and discussed on random forums and message boards, discussion sites and Q&A sites.

To do that today requires a lot of manual intervention and thought. First I’d have to copy and paste the text into all the different services. Then I’d have to copy and paste a link into the various social media services. If I wanted to add an excerpt, I’d have to do more copy-pasting and editing. If there was discussion in other places, there’s no guarantee I’d find out about it. I’d have to keep a close eye on all the discussion sites, and hope that any individuals talking about it on their own sites send me an email (or some other kind of notification) about their posts.

In a world where transclusion is the default, things become much simpler. As we’ve already discussed, the various other publishing platforms would simply transclude the content of my post. Social media services would transclude particular paragraphs (or even particular sentences). Similarly, discussion sites and other people’s blogs would transclude the particular parts of the post they want to discuss. This has a secondary benefit: I can look up the transcluders and automatically be aware of who’s talking about my post (and what parts in particular). In summary, transclusion would make sharing and discussion on the web a whole lot easier, smoother and more interactive.
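The "look up the transcluders" idea can be sketched as a simple backlink index. Again, this is a toy under stated assumptions: `SpanRef`, `BacklinkIndex` and the addresses used with it are invented names, and I'm assuming a transclusion can be described as a character range within the source. A real system would need something much sturdier than character offsets, which break as soon as the original is edited.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SpanRef:
    """A transclusion of one part of a source (hypothetical model).

    start/end are character offsets into the source text, which is an
    assumption made purely to keep the sketch small.
    """
    source: str        # address of the canonical post
    start: int
    end: int
    transcluder: str   # address of the page doing the transcluding


class BacklinkIndex:
    """Records every transclusion so authors can see who quotes what."""

    def __init__(self):
        self._by_source = defaultdict(list)

    def record(self, ref):
        self._by_source[ref.source].append(ref)

    def transcluders_of(self, source):
        # Return a copy so callers cannot mutate the index by accident.
        return list(self._by_source.get(source, []))


def excerpt(text, ref):
    """Return the part of the source text that a SpanRef points at."""
    return text[ref.start:ref.end]
```

With an index like this, finding out who is discussing a post (and which sentences) is a single lookup on the post's address, rather than a matter of watching every forum and hoping for an email.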

Unfortunately, I don’t believe that we’ll achieve transclusion any time soon. In particular, I would say most publishing and social media services have incentives to prevent transclusion — they want a unique piece of your work. Deferring to a canonical copy elsewhere that others can transclude as well is the last thing they want. That being said, we can still dream, can’t we? Perhaps, with the continuing popularity of ebooks and DIY publishing we might even start having some limited forms of transclusion. And maybe, just maybe, people like Mr. Mod and services like Editorially will start pushing for a transclusion-capable world.

Sunday Selection 2013-10-20

Around the Web

Inside GitHub’s super-lean management strategy and how it drives innovation

It’s always interesting to see how groups of people organize to do useful work, especially in the age of startups and distributed workforces. This article takes a detailed look at GitHub’s structure and how their “open allocation” strategy affects their work-style and productivity. Interestingly, it also looks at how non-product activities (like internal videos and social meetups) can be planned and executed without a strict hierarchy.

Should we stop believing Malcolm Gladwell

As a graduate student I’ve become increasingly comfortable with reading scientific papers over the last two years. As a side effect of that, I’ve become increasingly skeptical of popular science books. They’re often lacking in proper references and I’m somewhat distrusting of the layer of indirection between me and the (hopefully) rigorous scientific work. This article focuses on Malcolm Gladwell and his particular brand of scientific storytelling. It’s been a few years since I read any of his books, so I can’t comment from personal experience, but if you’re interested in knowing how much science is actually in popular science, this article is worth your time.

Scott Adams on How to be successful

I recommend this piece with a bit of caution. It’s not your typical “how to be successful” piece. There isn’t much along the lines of “find your passion” or “all your dreams will come true”. In fact, this piece is very pragmatic, very down-to-earth and just a little bit mercenary. It’s for just those reasons that I think it’s worth reading: it’s a good antidote to the cult of “follow your dreams” that seems to have become popular. There are other gems in this piece such as “goals are for losers”. If you’re looking for unconventional and refreshingly honest career advice, read this now.

Books

I’ve been cutting down on video watching in favor of more reading. This week’s recommendation is:

Getting Things Done

GTD is a bit of an obsession in the tech community, spawning an endless number of variants, apps and how-to guides. I’ve been using one of those apps for a while (OmniFocus) and I’ve been familiar with the general GTD approach, but I just started reading the book last week. Surprisingly, the book has a pretty different feel from the GTD articles and guides you’ll find around the web. David Allen doesn’t just give you organizational strategies but also takes the time to explain why particular strategies are a good idea and how they may or may not work for you. I’ve often thought that the full-blown GTD system is a bit overkill, but reading this book makes me think that at a certain level of busy-ness, it’s actually worth it. After reading this book you’ll have no doubts that GTD is a carefully thought out, well-founded system and might be worth a try even if you’re not always super-busy.

Sunday Selection 2013-10-13

Around the Web

Advice to a Young Programmer

I’ve learned a lot about system design and programming since I started grad school two years ago. I’m still learning a lot as I use new tools and techniques. This post does a good job of summarizing an experienced programmer’s advice to someone younger and newer to the craft.

Why Microsoft Word Must Die

I’m quite happy to say that I haven’t used Word in years. I don’t have a copy installed and I can’t remember the last time I needed to edit a Word document. I use LaTeX for most of my writing (everything from applications and academic papers to my resume). For the rare occasion that I need to open a Word document, Google Docs is more than adequate. Charlie Stross is one of my favorite newer science fiction authors and like most of his technology-related writing, this piece is on point about why the modern Microsoft Word is simply bad.

Less is Exponentially More

This article about why Go hasn’t attracted more C++ programmers is over a year old, but as a student of language design it’s interesting to see how language features interact with programmers’ needs. If you’re interested in programming languages or write a lot of C++ code, this is a worthwhile read.

Video

Jiro Dreams of Sushi

I’ve been meaning to watch this documentary for a long time, but finally got around to seeing it last night. It’s about Jiro Ono, an 85-year-old sushi master and owner of a tiny 3-star Michelin sushi restaurant in Japan. At its heart it’s a story of a man’s quest for perfection and devotion to his craft. Though it’s ostensibly about the art of sushi, I think there’s a lot any professional can learn from it. It reflects a way of life and devotion to purpose that we rarely see in day-to-day life. You can catch it on Netflix streaming and on Amazon Instant Video (it’s not free for Prime members though).