Amazon’s Digital Wonderland

A few weeks ago I found myself in Seattle, WA. Contrary to popular belief, it was a rather bright and sunny few days (if somewhat chilly). Here’s an obligatory picture of the Space Needle.

Space Needle

Anyway, on the first day there I fought a mostly losing battle against travel-induced tiredness (I was up at 4:30 in the morning) and walked around downtown for a while, somewhat zombie-like. I spent most of the next day in one of Amazon’s new buildings attending their first ever PhD Symposium. I got to meet Amazon employees like Swami Sivasubramanian, one of the creators of Amazon’s Dynamo database, as well as fellow graduate students like Rahul Potharaju. The day was full of interesting presentations and the breaks in between were packed with lots of cool conversations. I presented my current project, Merlin (excuse the visuals), and got some good feedback. All in all it was a great day; I had a wonderful time and I hope Amazon keeps holding these research symposia.

But that’s not what this post is about. Personally, I think of Amazon as a retailer first and a technology company second. I’ve even written a post about their exemplary customer service. Even though I’ve known about EC2 for years and have used both S3 and Glacier for personal backup, the idea of Amazon as a technology company has always sat at the back of my mind. It was only while attending the symposium that I really grasped the full weight of Amazon as a technology services company.

After coming home I looked up the keynote from Amazon’s recent re:Invent conference. The keynote shows off some of their more interesting recent technology (including new EC2 instances) as well as technologies that customers like Netflix and Vimeo have built on top of it. I also stumbled across Dave Winer’s post on Amazon’s support for static JavaScript applications and why that’s so interesting and important.

The more I think about it, the more I like Amazon. They make incredible technology, employ lots of really smart people, and have a refreshingly honest and direct business model in an industry dominated by advertising and harvesting user data. From computation to storage to scalable DNS, Amazon offers a suite of services that’s just about stunning in its breadth. Though I’ve had little use for their services personally (apart from Glacier for backup), I can see myself using their systems and technology extensively if I were building any type of scalable, distributed service.
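As a rough illustration of how low the barrier has become, here’s a minimal sketch of what a personal backup upload might look like with the AWS SDK for JavaScript (v3). The bucket name, region, file name and the Glacier storage class are placeholders standing in for my actual setup, not a recommendation:

```typescript
import { readFile } from "node:fs/promises";
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";

// Sketch only: credentials are assumed to come from the usual AWS
// configuration (environment variables or ~/.aws), and the bucket is hypothetical.
const s3 = new S3Client({ region: "us-east-1" });

async function backup(localPath: string): Promise<void> {
  const body = await readFile(localPath);
  await s3.send(
    new PutObjectCommand({
      Bucket: "my-personal-backups",   // hypothetical bucket name
      Key: `backups/${localPath}`,
      Body: body,
      StorageClass: "GLACIER",         // archive-class storage, standing in for my Glacier archives
    })
  );
  console.log(`Archived ${localPath}`);
}

backup("photos-2013.tar.gz").catch(console.error);
```

That’s the whole thing: a dozen lines between a file on my laptop and durable, off-site storage.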

Even as I write this, I’m trying to come up with excuses for trying out more of their technology. What would I build? I honestly don’t know. But looking at the range of Amazon technologies and thinking about the possibilities reminds me of the feelings I got when I first started programming and learning about computers.

In many ways, the world has changed since I started writing code about 12 years ago. I had a lot of fun writing LOGO and BASIC programs and then hacking together little Perl scripts. Today I find myself wondering what the loosely coupled services and technologies offered by Amazon and other cloud computing providers enable. I wonder whether the new programmers of today, still learning on primarily single-threaded, single-box computing platforms, should be encouraged to move on to the brave new world of instantly accessible, practically unlimited computing power. I wonder what we’d achieve if we took distributed, connected computation as the starting point, rather than the state of the art.

As an ending note, let’s think about Microsoft. It’s become standard to talk about Google as today’s Microsoft, but I’m starting to wonder if that title doesn’t rightfully belong to Amazon. I’m not talking about monopolistic activities or questionable business practices, but rather their similarities in making computing more popular. Microsoft’s goal (ostensibly) was to put a computer in every household. Amazon, for its part, has commoditized high-powered computing and distributed systems and made them available to people with modest budgets. I suppose the more things change, the more they stay the same.

Sunday Selection 2013-11-17

It’s a grey and rather dreary day here in Ithaca, NY (though it was quite bright and pleasant yesterday). I’ve been reading a lot more lately (since I essentially quit watching TV). I got through quite a bit this week, but there’s still plenty to keep me busy on cold, rainy days like this. Here’s a quick selection of my reading for this week:

Around the Web

Why GitHub is not your CV

Personally, I love open source. I love having access to lots of great code and I enjoy contributing back and releasing my own work. However, I’m very aware that in many ways I’m exceptionally lucky: I have a stable job that allows (in fact, encourages) me to open source and release my code to the public. At the same time, I know lots of people who don’t have the incentives or the opportunity to release open source code, even though they’re good engineers and most companies would hire them in a heartbeat. This is a long article that takes a good look at the expectations of open source development (and the role of particular services like GitHub) in the broader software culture.

Anyone can learn to be a polymath

I’ve always been intrigued by polymaths and the ideal of the Renaissance Man. While the sum of human knowledge (and capability) is too great for anyone to master all of it, I think we would do well to remember that specialization is for insects, and that humans are very versatile, adaptable creatures. However, modern society seems to push us incessantly towards specialization and narrowness. If we are to unlock our human potential, we have to take the initiative ourselves.

The Big Data Brain Drain

Academia and industry have always had a somewhat begrudging respect for each other. But what happens when the skills that are increasingly necessary to do good research and make discoveries are rewarded by industry, while academia is a little slow on the uptake?

Multiplicity

Over the last weekend I played around with an interesting service called Editorially. It’s essentially a stripped-down online text editor with support for Markdown formatting. However, its most attractive feature is its support for multiple versions, collaboration and editing. It’s an interesting project and it just added WordPress and Dropbox export (I wrote the first draft of this post in it and then exported to WordPress). As with many such services, I’d rather use a text editor and git to get the same effect. But more than the service itself, what interests me is something I read in their blog post announcing the export features:

On the web today, a single article may be published on the writer’s personal blog, collected in an ebook, syndicated on several magazine or news sites, and reblogged across different platforms and communities.

This notion of having the same piece of work shared in multiple places is not new, but it is becoming increasingly popular, especially with the rise of group blogs (often with guest authors), online curation, and services like Medium. Craig Mod, whom I find to be one of the most insightful writers on the intersection of technology and publishing, started one of his recent pieces with this not-quite-disclaimer:

This was originally published in Hiut Denim’s yearbook. I’ve republished it here, over on Hi and over at Medium because, well, the beauty of the web is multiplicity. More on that later.

Multiplicity. I like that word. And yes, there is something to admire in just how easy it is to copy and share on this, the modern Web. But is it a thing of beauty? I’m much less certain than Mr. Mod, especially since this form of multiplicity is heavily dependent on third-party, often proprietary services with motives that are unclear at best and questionable at worst.

What Mr. Mod dubs “multiplicity” and calls beautiful can be explained in older, cruder terms: copy-and-paste. In many ways, the current web does us a disservice: we have been taught to accept, and have settled for, multiplicity when what we really wanted was transclusion, an idea first described by an early pioneer of applied computation, Ted Nelson.

Whereas multiplicity on the web takes the form of copy-and-paste, transclusion would take the form of reference. Instead of taking the literal text (and perhaps the styling and images) of a document and replicating it for each copy, we would have a single canonical copy of a document (and by document I mean any information object) that can be referenced and transcluded from other places. For example, Mr. Mod could have published the original piece on his website, and Hi and Medium would simply transclude it into their own versions. When someone visited the Hi or Medium pages, those pages would reach out and embed the original post’s content within their own contexts.
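To make the mechanics concrete, here’s a hypothetical sketch of transclusion-by-reference in a browser, using standard Web APIs. A <trans-clude> element doesn’t exist; the tag name, the example URL, and the assumptions that the canonical server serves its content cross-origin and that a real system would sanitize it are all mine, purely for illustration:

```typescript
// Hypothetical <trans-clude> element: a page embeds a *reference* to the
// canonical URL instead of a pasted copy, and resolves it at view time.
class Transclude extends HTMLElement {
  async connectedCallback(): Promise<void> {
    const src = this.getAttribute("src"); // canonical URL of the original document
    if (!src) return;
    try {
      const response = await fetch(src);        // reach out to the single authoritative copy
      // A real system would sanitize before injecting; skipped here for brevity.
      this.innerHTML = await response.text();   // embed it in the local context
      this.setAttribute("data-canonical", src); // attribution travels with the content
    } catch {
      this.textContent = `Content unavailable; the canonical copy lives at ${src}`;
    }
  }
}
customElements.define("trans-clude", Transclude);

// Usage on a transcluding page (Hi, Medium, a forum post, ...):
//   <trans-clude src="https://craigmod.example/essays/multiplicity"></trans-clude>
```

The transcluding page never owns the text; it owns a pointer to it, and any edit to the original shows up everywhere the pointer is resolved.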

Transclusion offers many advantages over copy-and-paste. For one, any changes to the original are automatically reflected wherever it is transcluded. Second, attribution becomes much easier. Instead of carefully maintaining references to where you found a particular piece of information or text, the transclusion machinery can manage it for you; in fact, such a system has to keep accurate source information just to function. Transclusion also makes the job easier for automated systems like search engines. Instead of coming across multiple versions of the same text in different places, a crawler would simply follow the transclusion links back and index the original, authoritative copy.

Copy-and-paste certainly has a place, even in a hypothetical transclusion-enabled Web. One major application is of course backup and archival, which would be impossible if there were only ever a single copy. That being said, I would personally rather have transclusion than not. For one thing, it would make navigating the current morass of social media and publication startups easier.

Today, if I write something (say this blog post) and want to put it online, I have to decide where to put it. I could put it on my own website, hosted on my own server, accessible at my own domain name, where I retain full control. However, maybe I want the attention generated by publishing on a platform like Medium or Tumblr. Maybe I also want to post a link and excerpt to Facebook, Twitter and Google+. Once people read it, they might want to write posts about it on their own blogs and reference it. It might get picked up and discussed on random forums and message boards, discussion sites and Q&A sites.

To do that today requires a lot of manual intervention and thought. First I’d have to copy and paste the text into all the different services. Then I’d have to copy-and-paste a link into the various social media services. If I wanted to add an excerpt, I’d have to do more copy-pasting and editing. If there was discussion in other places, there’s no guarantee I’d find out about it. I’d have to keep a close eye on all the discussion sites and hope that any individuals talking about it on their own sites send me an email (or some other kind of notification) about their posts.

In a world where transclusion is the default, things become much simpler. As we’ve already discussed, the various other publishing platforms would simply transclude the content of my post. Social media services would transclude particular paragraphs (or even particular sentences). Similarly, discussion sites and other people’s blogs would transclude the particular parts of the post they want to discuss. This has a secondary benefit: I can look up the transcluders and automatically be aware of who’s talking about my post (and which parts in particular). In summary, transclusion would make sharing and discussion on the web a whole lot easier, smoother and more interactive.
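The “who’s talking about my post” part is mostly bookkeeping. The registry below is entirely hypothetical (no such service exists today); it’s just a sketch of how little machinery the question would need once transclusion links are first-class:

```typescript
// Hypothetical registry of transclusion links: when a page transcludes a
// document (or a fragment of one), the link is recorded so the original
// author can ask "who is quoting me, and which part?"
interface TranscludeRecord {
  canonicalUrl: string;    // the authoritative copy being transcluded
  fragment?: string;       // e.g. a paragraph id, when only part of the post is embedded
  transcluderUrl: string;  // the page doing the transcluding
}

class TransclusionRegistry {
  private records: TranscludeRecord[] = [];

  register(record: TranscludeRecord): void {
    this.records.push(record);
  }

  // Every blog, forum thread or social post referencing a canonical URL,
  // without the author having to go hunting for mentions.
  transcludersOf(canonicalUrl: string): TranscludeRecord[] {
    return this.records.filter((r) => r.canonicalUrl === canonicalUrl);
  }
}

// Usage sketch:
const registry = new TransclusionRegistry();
registry.register({
  canonicalUrl: "https://example.com/posts/multiplicity",
  fragment: "para-3",
  transcluderUrl: "https://forum.example.net/thread/42",
});
console.log(registry.transcludersOf("https://example.com/posts/multiplicity"));
```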

Unfortunately, I don’t believe that we’ll achieve transclusion any time soon. In particular, most publishing and social media services have incentives to prevent transclusion — they want a unique piece of your work. Deferring to a canonical copy elsewhere, one that others can transclude as well, is the last thing they want. That being said, we can still dream, can’t we? Perhaps, with the continuing popularity of ebooks and DIY publishing, we might even start seeing some limited forms of transclusion. And maybe, just maybe, people like Mr. Mod and services like Editorially will start pushing for a transclusion-capable world.

Sunday Selection 2013-10-20

Around the Web

Inside GitHub’s super-lean management strategy and how it drives innovation

It’s always interesting to see how groups of people organize to do useful work, especially in the age of startups and distributed workforces. This article takes a detailed look at GitHub’s structure and how their “open allocation” strategy affects their work-style and productivity. Interestingly, it also looks at how non-product activities (like internal videos and social meetups) can be planned and executed without a strict hierarchy.

Should we stop believing Malcolm Gladwell

As a graduate student, I’ve become increasingly comfortable with reading scientific papers over the last two years. As a side effect, I’ve become increasingly skeptical of popular science books. They’re often lacking in proper references and I’m somewhat distrustful of the layer of indirection between me and the (hopefully) rigorous scientific work. This article focuses on Malcolm Gladwell and his particular brand of scientific storytelling. It’s been a few years since I read any of his books, so I can’t comment from personal experience, but if you’re interested in knowing how much science is actually in popular science, this article is worth your time.

Scott Adams on How to be successful

I recommend this piece with a bit of caution. It’s not your typical “how to be successful” piece. There isn’t much along the lines of “find your passion” or “all your dreams will come true”. In fact, this piece is very pragmatic, very down-to-earth and just a little bit mercenary. It’s for just those reasons that I think it’s worth reading: it’s a good antidote to the cult of “follow your dreams” that seems to have become popular. There are other gems in this piece, such as “goals are for losers”. If you’re looking for unconventional and refreshingly honest career advice, read this now.

Books

I’ve been cutting down on video watching in favor of more reading. This week’s recommendation is:

Getting Things Done

GTD is a bit of an obsession in the tech community, spawning an endless number of variants, apps and how-to guides. I’ve been using one of those apps for a while (OmniFocus) and I’ve been familiar with the general GTD approach, but I just started reading the book last week. Surprisingly, the book has a pretty different feel from the GTD articles and guides you’ll find around the web. David Allen doesn’t just give you organizational strategies but also takes the time to explain why particular strategies are a good idea and how they may or may not work for you. I’ve often thought that the full-blown GTD system is a bit overkill, but reading this book makes me think that at a certain level of busy-ness, it’s actually worth it. After reading this book you’ll have no doubts that GTD is a carefully thought out, well-founded system and might be worth a try even if you’re not always super-busy.


Sunday Selection 2013-10-13

Around the Web

Advice to a Young Programmer

I’ve learned a lot about system design and programming since I started grad school two years ago. I’m still learning a lot as I use new tools and techniques. This post does a good job of summarizing an experienced programmer’s advice to someone younger and newer to the craft.

Why Microsoft Word Must Die

I’m quite happy to say that I haven’t used Word in years. I don’t have a copy installed and I can’t remember the last time I needed to edit a Word document. I use LaTeX for most of my writing (everything from applications and academic papers to my resume). For the rare occasion that I need to open a Word document, Google Docs is more than adequate. Charlie Stross is one of my favorite newer science fiction authors, and like most of his technology-related writing, this piece is on point about why modern Microsoft Word is simply bad.

Less is Exponentially More

This article about why Go hasn’t attracted more C++ programmers is over a year old, but as a student of language design I find it interesting to see how language features interact with programmers’ needs. If you’re interested in programming languages or write a lot of C++ code, this is a worthwhile read.

Video

Jiro Dreams of Sushi

I’ve been meaning to watch this documentary for a long time, but I finally got around to seeing it last night. It’s about Jiro Ono, an 85-year-old sushi master and owner of a tiny, three-Michelin-star sushi restaurant in Japan. At its heart it’s the story of a man’s quest for perfection and devotion to his craft. Though it’s ostensibly about the art of sushi, I think there’s a lot that any professional can learn from it. It reflects a way of life and a devotion to purpose that we rarely see in day-to-day life. You can catch it on Netflix streaming and on Amazon Instant Video (though it’s not free for Prime members).

Uncertainty about the future of programming

I finally got around to watching Bret Victor’s “The Future of Programming” talk at DBX. It did the rounds of the Intertubes about two months ago, but I was having too much fun at the Oregon Programming Languages Summer School to watch it (more on that later). Anyway, it’s an interesting talk, and if you haven’t seen it already, here it is:

You really should watch it before we continue. I’ll wait.

Done? Great. Moving on.

While the talk generated a lot of buzz (as all of Bret Victor’s talks do), I’m not entirely sure what to take away from it. It’s interesting to see the innovations we achieved 40 years ago. It is a tad depressing to think that maybe we haven’t really progressed all that much since then (especially in terms of programmer-computer interaction). While I’m grateful to Mr. Victor for reminding us of the wonderful power of computation combined with human imagination, at the end of the talk I was left wondering: “What now?”

The talk isn’t quite a call to arms, but it feels like that’s what it wants to be. Victor’s four points about what will constitute the future of programming show us both how far we’ve come and how far we have left to go. However, as with his other talks, I can’t help but wonder if he has really thought through all the consequences of the points he’s making. He talks about direct manipulation of information structures, spatial and goal-directed programming, and concurrent computation. His examples seem interesting and even inspiring. But how do I translate the broad strokes of Mr. Victor’s brush into the fine keystrokes of my everyday work? And does that translation even make sense for more than small examples?

For my day-to-day work I write compilers that generate code for network devices. While I would love to see spatial programming environments for networks and compilers, I have no idea what they would look like. If I’m building sufficiently complex systems (like optimizing compilers or distributed data stores), the spatial representations are likely to run to hundreds of pages of diagrams. Is such a representation really any easier to work with than lines of code in plain text files?

While I’m all for more powerful abstractions in general, I’m skeptical about building complicated systems with the kinds of abstractions that Bret Victor shows us. How do you patch a binary kernel image if all you have and understand is some kind of graphical representation? Can we build the complex computation that underlies graphical design systems (like Sketchpad or CAD tools) without resorting to textual code that lets us get close to the hardware? Even the Smalltalk system had significant chunks of plain text code front and center.

Plain text, for all its faults, has one big thing going for it: uniformity. We might have thousands of different languages and dozens of different paradigms, models and frameworks, but they’re all expressed as source code. The tools of software development (editors, compilers, debuggers, profilers) are essentially similar from one language to the next. For programmers trying to build and deploy real systems, I believe there is a certain benefit in such uniformity. Spatial programming systems would have to be incredibly more powerful to be seriously considered as an alternative.

I am doing the talk some injustice by focusing on spatial programming, but even in the other areas I’m not quite sure what Mr. Victor is getting at. With regards to goal-directed programming, while I understand and appreciate the power of that paradigm, it’s just one among many. Concurrent and multicore computation is great, but a lot of modern computation happens across clusters of loosely connected machines. Where does the “cloud” fit into this vision? Mr. Victor talks about the Internet and machines connected over a network, but there’s no mention of actually computing over that network.

I believe that Mr. Victor’s point in this talk was to remind us of an era of incredible innovation in computer technology, of the good ideas that came out of it, and of how many of those ideas were never realized (and that there hasn’t really been such an era of imagination since). I can appreciate that message, but I’m still asking: “So what?”

Building complicated systems is difficult, and even with plain old text files we’ve managed to do a pretty good job so far. The innovations between then and now have been less flashy but, for me at least, more helpful. Research into distributed systems, machine learning and programming languages has given us powerful languages, tools and platforms which may look mundane and boring but let us get the job done fairly well. Personally, I’m more interested in type systems and static analyses that let me rule out large classes of bugs at compile time than in interfaces where I connect boxes to link functionality. While I hope that Mr. Victor’s vision of more interactive and imaginative interactions with computers becomes a reality, it’s not the most important one to me, and it’s not the future I’m working towards.
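To make “rule out large classes of bugs at compile time” a little more concrete, here’s a tiny sketch in TypeScript rather than the network-compiler setting I actually work in; the packet types are invented for illustration. A discriminated union plus an exhaustiveness check turns a forgotten case into a compile error instead of a production bug:

```typescript
// Invented packet types, purely for illustration.
type Packet =
  | { kind: "arp"; targetIp: string }
  | { kind: "ipv4"; src: string; dst: string }
  | { kind: "ipv6"; src: string; dst: string };

// If this is ever reachable, the type checker complains at the call site.
function assertNever(x: never): never {
  throw new Error(`Unhandled packet kind: ${JSON.stringify(x)}`);
}

function route(p: Packet): string {
  switch (p.kind) {
    case "arp":
      return `flood ${p.targetIp}`;
    case "ipv4":
      return `forward ${p.src} -> ${p.dst}`;
    case "ipv6":
      return `forward ${p.src} -> ${p.dst}`;
    default:
      // Add a new packet kind without handling it above and this line
      // stops compiling: the bug never makes it past the type checker.
      return assertNever(p);
  }
}
```

It’s not flashy, and it certainly wouldn’t make for a good conference demo, but it’s the kind of quiet, everyday leverage I mean.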

Pacific Rim is a work of art

Over the weekend I went with some fellow graduate students to see Pacific Rim. It’s not a particularly complicated movie: there are some gaping plot holes, the technobabble reaches facepalm levels, and it fails the Bechdel Test. All that being said, Pacific Rim was one of the most enjoyable science fiction movies I’ve seen in a long, long time. It’s much better than the current crop of superhero flicks (with the possible exception of The Avengers and the Batman movies), and the last movie I liked this much was probably District 9.

So why do I like this movie so much? It’s hard to put my finger on it, exactly. The concept is simple, but interesting: giant monsters rise out of the depths of the Pacific Ocean and humanity assembles giant robots, piloted via a neural link, to fight them. Crucially, each machine must be piloted by two pilots at a time, leading to interesting interactions between the characters who share the neural link. Over time, humanity grows complacent, the robot program gets scrapped and the defenses are left to rot, until finally the apocalypse is nigh and only a handful of fighters stand between us and oblivion. Not the most novel premise in existence, but the magic is in the details.

The characters are at once both larger than life and fatally flawed. The imagery is classic Guillermo del Toro: beautifully detailed while being rough and gritty, resulting in something that is clearly imaginary yet strangely believable. The giant robots have been neglected for years: they’re banged up, dented, rusty, constantly being repaired (think more Matrix Revolutions and less Oblivion). The last line of defense is a small, cramped base in Hong Kong. Everyone lives in cramped, mostly dirty conditions. This is not humanity’s finest hour. And with that as the background, humanity’s last line of defense is provocatively international: American, Russian, German, Australian, Chinese, Japanese and more. At the end of the day, scientists and engineers prove to be just as important as the gunslingers and military commanders. A smuggler and gangster helps put in place a core piece of the puzzle. Families are broken, important characters fall and fathers live to see their sons die. Accept the premise and forgive the technological stumbles, and the movie is oddly human compared to your standard sci-fi flick.

And then there are the fight scenes. They are, to say the least, interesting. Though we see giant robots battling sea monsters, the battles are more martial arts than technological warfare. It’s as much about the pilots in the machines as it is about the machines themselves. Through it all, the anime influence is clear. The robots, termed Jaegers, are equipped with plasma cannons and missiles as well as swords and spinning blades. The camera angles are often imperfect and the lens is often wet or scratched. Crazy? Yes. Fun to watch? Absolutely.

In many ways, Pacific Rim stands out because of what it is not. It’s not your run-of-the-mill action hero story; the characters and actors aren’t well known, and are hence open to both interpretation and evolution. You know there’s going to be an epic battle at the end, but there’s enough unknown in between to keep you from getting bored. It has more in common with an old western than with a modern superhero or sci-fi movie. It’s a reminder that people are still capable of coming up with an original screenplay that’s good and worth watching.

Should you go watch it? Absolutely. If you’re a sci-fi buff, keep in mind that it takes the word “science” very liberally and definitely doesn’t take itself very seriously. If you’re not, then don’t worry: the science isn’t really a key part of the movie. The characters, their histories and their interactions carry the movie as much as the action sequences do. Pacific Rim definitely earns a place on my list of science fiction that I’ll recommend to people looking for something new.

(PS: If you’re interested in getting some insight into what went into the movie, this interview with Guillermo del Toro is definitely worth reading.)