Google and Wikipedia as the gatekeepers of the Internet

In January of last year, I read a post on Coding Horror about how Google was gradually becoming the starting point of the Internet. Though Jeff Atwood’s points were certainly well made and valid, I really didn’t think much of it at the time. In the year and a half since then, some things have changed. Google has been moving away from pure search while still using its position as a focal point on the Internet. Chrome and Android both increase Google’s prominence in the computing world. However, I hadn’t quite realized that riding on Google’s pre-eminence is another Internet powerhouse: Wikipedia.

My story begins like this. I was doing research on IBM’s original Personal Computer and how its BIOS was reverse engineered by Compaq to produce clones. I was focusing on the ethical issues of reverse engineering and, of course, I turned to Google to find online sources. Here are some of the search queries I typed in:

  • IBM PC
  • Reverse Engineering
  • Utilitarianism
  • Kant
  • Categorical Imperative
  • Social Contract
  • Compaq

In all but the last search, the first result was the corresponding Wikipedia article. I found it interesting and somewhat unnerving that even IBM’s own website was only the second result when searching for IBM’s most famous product.

The duopoly that is beginning to form is quite interesting. Google is gradually positioning itself as the chief filter and navigator for the web. Competitors like Yahoo! and MSN would require a massive, perhaps combined, effort to take on Google Search in any serious way. Newcomers like the much-hyped Cuil are simply not good enough. With Google’s multipronged effort to cut into both business computing (Google Apps) and the common man’s net experience (Chrome and Android), it’s unlikely that this trend will reverse itself anytime soon.

Wikipedia is taking the form of the most easily accessible content provider for an increasing range of not-too-specific information (and a fair amount of specific information too). How many school or college projects today don’t involve Wikipedia in some way? Even though professors and teachers might wholeheartedly (and maybe with good reason) insist that Wikipedia cannot be cited as a reputable source, the fact remains that for many students (and for many other people) research about a topic begins (and in many cases ends) with Wikipedia. Wikipedia is becoming the Wal-Mart of the Internet. Its information may not be great, but it’s good enough and it’s cheap, in terms of money, time and effort.

So What?

Well, perhaps nothing at all. After all, Google hasn’t shown any signs of taking over yet. I trust Google enough to give Gmail all my email. Even my college email is routed through Gmail because it’s the most efficient solution out there. I use Google Reader to gather information from around the web. I’m a fairly regular user of Google Calendar and Google Docs. I trust that no human eyes are viewing my information or reading my documents and that the software running on Google’s server farms does its job well. Their ads are discreet and unobtrusive.

Google makes money. Lots of it. Billions of dollars every year. And not just for itself. There are thousands of businesses that make millions off Google Ads, and many popular businesses get some 70% of their business via Google search. A lot of Google’s money goes to supporting various open source projects in a number of ways, as well as funding university research. I would rather have Google in control of that cash flow than not have it at all.

However, it cannot be denied that Google is quickly and surely becoming the Internet. As Jeff Atwood tells us, if your website is not on Google, it might as well not exist. Rich Skrenta, creator of one of the first personal computer viruses, is not exaggerating by much when he tells us that the Internet is essentially a single point marked G connected to 10 billion destination pages. If we were to follow the more conventional analysis of the Internet as a weighted graph, the weights given to Google’s outgoing links far outweigh those given to any other site’s (with the possible exception, in some cases, of outgoing links from Wikipedia).
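
To make the weighted-graph picture a little more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the node names, the link weights and the helper `outgoing_weight_share` are made up, not measurements of real traffic. It simply adds up the weight on each site’s outgoing links and reports what fraction of the total each site accounts for.

```python
from collections import defaultdict


def outgoing_weight_share(edges):
    """Return each source node's share of the total outgoing link weight.

    `edges` is a list of (source, target, weight) tuples describing a
    directed, weighted graph.
    """
    totals = defaultdict(float)
    for source, _target, weight in edges:
        totals[source] += weight
    grand_total = sum(totals.values())
    return {node: weight / grand_total for node, weight in totals.items()}


# Hypothetical toy web: the weights are invented, standing in for how much
# traffic follows each link.
edges = [
    ("G", "wikipedia", 40.0),
    ("G", "ibm.com", 20.0),
    ("G", "blog", 20.0),
    ("wikipedia", "ibm.com", 10.0),
    ("wikipedia", "blog", 5.0),
    ("blog", "ibm.com", 5.0),
]

shares = outgoing_weight_share(edges)
for node, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{node}: {share:.0%} of all outgoing link weight")
```

In this toy graph the node marked G carries most of the outgoing weight, with the Wikipedia node a distant second. That is the shape of the argument, not evidence for it.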

And into this, Wikipedia fits perfectly. In the free world of the Internet, it’s hard for businesses to make money by selling pure information. But pure information is the lifeblood of the Internet, its first cause for existence. And it’s this information need that is served by Wikipedia. What you can’t buy, you can get for free on Wikipedia (mostly). Wikipedia cannot survive alone. It needs efficient search to make proper use of all its information. In exchange, it acts as a sort of secondary filter: after Google’s search filters away the cruft and deadwood that litters the Internet in the form of spam, porn and obsolete webpages, Wikipedia steps in to provide a mostly reliable core from which to branch off to other points or from which to draw inspiration for more searches. Sure, you can decide not to use Wikipedia, and pay the consequent price of having to sort through the mass of knowledge by yourself (though perhaps with Google search to help). But would you really want to?

If you want to fight Goliath …

You’ll need more than a slingshot. Google and Wikipedia are both fairly well entrenched in their respective areas. And the tasks you’ll have to accomplish to shake them are certainly Herculean, if not harder. To beat Google, you’ll have to start with an equally good algorithm. New competitors like Cuil aren’t all that bad, but they’re not good enough. With the rise of rich media, your search engine will have to be able to get to pictures, videos, perhaps even Internet radio stations. Of course, now that Google has branched away from search, you’ll have to take that into account, or at least team up with someone who can. People are more comfortable using a unified interface and a single way of doing things than a bunch of smaller ones. Google still has some work to do in that area. After that, you need a way to make people money. Breaking into Google’s Ad empire might be harder than making a dent in the search market.

As for Wikipedia, you would need to find a way to collect a vast amount of information on a variety of topics and also keep it up to date. That’s hard and expensive to do with a proprietary model, and with an open system there are problems with abuse. Then there’s the question of actually getting people to use your resource. Giving it away for free is no longer good enough. You’ll have to offer something that Wikipedia doesn’t. And Wikipedia offers a lot.

Neither task is for small players. So who could do it? Someone with deep pockets, for one thing. The battle for the Internet isn’t going to be over in a flash; it’ll be a long, protracted war lasting years (if it’s ever fought, that is). Speaking of Flash, Adobe and the Flash platform are another strong player in the arena, though in a slightly different way. Adobe and Google have mostly non-overlapping interests. However, a partnership between Adobe and another strong player, such as Yahoo!, Microsoft or Amazon, might just tip the balance. A combination of online software built with Flash running on backends from Yahoo! or Microsoft would be a serious contender to Google’s AJAX web platforms. At the same time, it might be more beneficial for Adobe to join hands with Google, especially since Flash is already YouTube’s backbone. Tightly integrating Flash with Chrome might cement Flash’s position as the rich content platform of the Internet (with Google Ads thrown in the mix).

Of course, I’m probably getting far ahead of myself. Any serious competition to Google would involve a concerted effort by a number of interests over an extended period. That seems unlikely to happen with the current mess of competing interests, standards and technologies. For the time being at least, the Internet is still a point labeled G. The 10 billion connections are purely coincidental.

Published by

Shrutarshi Basu

Programmer, writer and engineer, currently working out of Cornell University in Ithaca, New York.

2 thoughts on “Google and Wikipedia as the gatekeepers of the Internet”

  1. I’ve been noting with concern much of the same thing. My main concern is that Google depends on algorithms and Wikipedia depends on administrators that always seem to be mired in some form of debacle.

    Google gives its version of truth dependent on page rank. Wikipedia gives its version of truth based on a convoluted and subjective set of rules to create ‘objective’ truth, where the truth itself can only exist if the administrators deem it is the truth… and they are subject to their own interpretations of the rules of truth.

    In essence, I wonder if the Internet is becoming less true by virtue of the truths we cling to. 🙂

  2. Quote: “Tightly integrating Flash with Chrome might cement Flash’s position as the rich content platform of the Internet.”

    Well, considering that Mozilla has been busy integrating Adobe’s ActionScript VM source into the upcoming ActionMonkey engine in the Mozilla 2 platform, and that FF already has 20% browser market share, it’s unlikely that Google can surpass Mozilla in this respect, but who knows. $0.02

    meh.
