I just finished participating in a Web seminar by Stephen Wolfram about his upcoming Wolfram|Alpha search engine. Wolfram|Alpha has shaken me to my intellectual core like almost nothing else I have ever seen. Alpha isn’t really a search engine in the commonly understood sense of the word. The website calls it a “computational knowledge engine”, and that is a very accurate, if not very intuitive, description of what it is. You really have to see it in action to understand what it does. Its defining quality is that instead of just looking across the internet for information relevant to a query, it actually attempts to compute the answers. I’ll give some examples later on to show what I mean, but for now I’ll start by walking through the seminar while it’s still fresh in my head.
The very first thing that Wolfram said (and reiterated a number of times) was that Alpha was based on Mathematica and his New Kind of Science. While this probably isn’t surprising, it’s certainly not something that most software companies or even computer science researchers would actively think about doing. However, both of them are very powerful tools and Alpha is a testament to that. Later on I had the chance to ask if cellular automata (the foundation of NKS) were a core part of Alpha. Wolfram was very emphatic that they were. He went on to say that he had always wondered what the first killer app for CA would be and that Alpha was it, even if it was a ‘prosaic’ application of NKS. He also said that NKS methodology was used heavily in the construction of Alpha. He hoped that Alpha would help people to actively create new types of science and scientific models by exploiting computational analysis.
Wolfram showed a few examples of how Alpha could be used. He used it to look up data on Springfield, MA and showed how Alpha was capable of understanding queries and of computing and intelligently displaying relevant data. For example, searching for a chemical compound showed its structure and information about its physical and chemical properties, as well as how to create it. Given a specific amount of a compound (4 molar sulphuric acid in this case), Alpha gave precise amounts of the other chemicals needed to create the given amount. Another interesting sample was when he typed in a DNA sequence and Alpha showed a possible human gene that matched it, as well as the relevant amino acids it encoded. That example almost blew me away.
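To give a feel for the "amino acids it encoded" part of the DNA example, here is a minimal Python sketch of translating a DNA sequence into a protein string using the standard genetic code. This is only my own illustration and not how Alpha does it: the function name `translate` and the sample sequences are made up for the example, and the gene-matching step (finding which human gene a sequence belongs to) is not attempted at all.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G codon order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acid codes.

    Hypothetical helper for illustration: reads from the first base in
    frame, stops at the first stop codon, and ignores a trailing
    incomplete codon.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino = CODON_TABLE[dna[i:i + 3]]
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)


print(translate("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"))
```

The sequence passed to `print` is just a made-up test string, not the one Wolfram typed in. A real tool would also have to deal with reading frames, the reverse strand and ambiguous bases, which this sketch leaves out.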
Alpha has 4 major components:
- Data curation: Alpha doesn’t feed off the entire web but rather works off a managed database and certain trustworthy sources (Alexa and US Census info being among them). Data which does not change is managed and categorized whereas the sources are polled regularly for relevant, up-to-date information.
- Computation: 5-6 million lines of Mathematica spread across lots of parallel processors (10,000 in the production version) make up the heart of Alpha. They collectively encode a large segment of the algorithms and computer models known to man. They can be applied to theoretical problems (e.g., integration, series creation, airflow simulation) or to specific data (weather prediction, tide forecasts, etc.).
- Linguistic components: The demonstration makes it clear that there is a very powerful (though far from perfect) natural language processing system at work. This freeform linguistic analysis is essential to Alpha because without it, a manual for making proper use of Alpha would be thousands of pages long (according to Wolfram).
- Presentation: Alpha is very pleasing to look at. The information is shown in a way that makes it very easy to get a good grasp of what’s being displayed without being overwhelming. Though there is a standard overall format (individual data segments are arranged into ‘pods’ on the page), the actual display is closely tailored to the specific query. It is simple enough for a child to use.
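To make the four components concrete for myself, here is a toy Python sketch of how they might fit together for a single kind of question (converting moles of a compound to grams). Everything in it is my own assumption for illustration: the names `MOLAR_MASS`, `parse`, `compute`, `present` and `answer`, the regular expression, and the pod titles are all made up, and nothing here reflects Alpha's actual Mathematica implementation.

```python
import re

# Data curation: a tiny hand-maintained table (molar masses in g/mol).
MOLAR_MASS = {"water": 18.015, "sulfuric acid": 98.079}


def parse(query):
    """Linguistics: pull an amount and a compound out of free text."""
    match = re.search(
        r"(\d+(?:\.\d+)?)\s*mol(?:es?)?\s+(?:of\s+)?(.+)", query.lower()
    )
    if match is None:
        return None
    amount = float(match.group(1))
    compound = match.group(2).strip(" ?")
    if compound not in MOLAR_MASS:
        return None
    return amount, compound


def compute(amount, compound):
    """Computation: mass in grams = moles * molar mass."""
    return amount * MOLAR_MASS[compound]


def present(amount, compound, mass):
    """Presentation: arrange the answer into titled 'pods'."""
    return [
        ("Input interpretation", f"{amount} mol of {compound}"),
        ("Result", f"{mass:.2f} g"),
    ]


def answer(query):
    """Run the whole toy pipeline for one query."""
    parsed = parse(query)
    if parsed is None:
        return [("Result", "No computable answer found")]
    amount, compound = parsed
    return present(amount, compound, compute(amount, compound))


for title, text in answer("How many grams is 2 mol of water?"):
    print(f"{title}: {text}")
```

The point of the sketch is only the separation of concerns: the data table, the language parsing, the calculation and the layout can each be improved independently, which is presumably why Wolfram described them as distinct components.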
Wolfram has very clear and powerful ideas about what Alpha should achieve once it goes live. His main recurring theme is that it should open up computation and data analysis to everyone. Over human history we have learned to calculate, compute and algorithmically manipulate across a very wide range of topics and data. However, gaining access to these powerful tools requires considerable training and resources. Wolfram wants Alpha to let everyone become a personal scientist (that’s close to the actual words he used) just like search engines allowed everyone to have a personal reference librarian.
Alpha focuses on questions that have definite answers or answers that can be computed directly. In cases where there is confusion or dispute, or where Alpha cannot compute a sufficient answer, there will be the option of sidebar links to additional resources (like Wikipedia). Speaking of Wikipedia, Alpha won’t be open for everyone to contribute to. However, Wolfram said that there would be a smooth process for experts to contribute to Alpha’s knowledge base.
Talking about Alpha’s actual deployment, Wolfram said the free version would be open to everyone and would allow some amount of customization (like defining specific fields to perform specific operations). Alpha would have a set of APIs allowing data retrieval on multiple levels. Whole pages of results, or just certain sections, could be obtained, as well as the underlying data and mathematical abstractions used to obtain those results. There would also be APIs to leverage the language processing infrastructure. Commercial offerings will be available whereby the knowledge base could be augmented by a company’s internal information and Alpha would then apply its computational analysis to that knowledge base. I think this is going to be very useful for companies, large and small alike.
Throughout the webinar, Wolfram showed lots of examples of what Alpha could do, some of which were just plain neat and others which were awe-inspiring. Wolfram himself seemed very interested in finding the limits of the system and would get somewhat distracted by bugs popping up when they weren’t supposed to. I suppose that’s a good thing considering how important and successful Alpha could be. He was also very good about answering questions.
My personal thoughts about Alpha are a bit hard to describe. I think it is a wonderful piece of technology that goes a long way toward making computation meaningful in people’s lives. Most people use computers like glorified typewriters and record players, but Alpha might just change that. From a computer scientist’s point of view, it is certainly a very interesting application of computer technology. The natural language processing that’s available seems considerably more capable than what is seen in most search technologies. I hope that as Alpha launches, more details of its implementation come to light. As an engineer, I’d love to know more about how Alpha’s computational and data-management systems are structured and how the massive parallelism is handled. Most importantly, I hope Alpha causes a fundamental change in how computers are used and what people’s expectations of software are. Make no mistake about it, Alpha is important. I won’t say it’s the best thing since sliced bread, but it could be. A lot depends on how people actually use Alpha and how open Wolfram makes it. If there is enough data made available (or if there is an easy way for people to supply their own data), I can see it becoming a powerful tool for real scientific endeavor. Here’s wishing Alpha and Wolfram the best of luck for the future.