We all know that you can get some funny/interesting responses by typing the first part of a question into a major search engine’s search box and letting it suggest the remainder. The NYT has gone so far as to investigate those suggestions themselves. I particularly enjoyed their description of search engines as “modern confessionals”:
This labor-saving device — part fortuneteller, part shrink? — has opened a window into our collective soul. With millions of people pouring their hearts into this modern-day confessional, we get a direct, if mysterious, glimpse into the heads of our fellow Web surfers.
And some nice visualizations of the questions people are asking don’t hurt, either:
I’d love to see an interactive tool for creating these diagrams.
Straight from GigaOm, emphasis mine:
Despite all the hype and excitement around the real-time web, access to real-time information online is hardly a new phenomenon. That fact stuck with me after talking to Chris Cox, Facebook’s product director, last week at the social networking company’s headquarters. As he noted, “Real time has been around since [the launch of] Technorati,” referring to the blog search engine founded by Dave Sifry in 2002 that aggregates hot stories from across the web. Yet seven years later, we still haven’t figured out how to handle the inundation of real-time information.
At the risk of redundancy, real-time search isn’t the next challenge; data organization is. The upwelling of “real time” sites (the new social media, if you will) has resulted in a new form of data – non-contextual, rapid-fire and fleeting. The banner, therefore, has become “we need a faster search engine” or “real time search” – but that’s not it at all. What we need is a different search engine, not a faster one.
Right now, our best search engines work by determining relevance or authority based on the number of people who indicate preference for an item, usually by linking to it. This has the obvious drawback of being dependent on links, which means it cannot possibly rank something before it receives its first link. Thus, this paradigm doesn’t work for “real time” data.
On the other end of the spectrum, sites like Twitter determine relevance purely by time – if your tweet was published more recently, it is more important (and positioned more prominently in results) than other tweets. But this is a broken paradigm as well – it can’t possibly deliver the most relevant information unless, by construction, each published piece of information is increasingly relevant to me as time goes by!
Where does that leave us? Mostly at the whim of a few intrepid machine learning entrepreneurs. A hybrid approach is necessary at the least; a brand new approach is preferable.
And this question is worth well more than a million dollars.
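For what it's worth, here is a minimal sketch of what the simplest possible hybrid might look like: a link-based authority term multiplied by a freshness term that decays with age. Everything in it is an assumption made up for illustration (the function names, the log damping, the six-hour half-life); it is not how any real search engine scores results.

```python
import math

def hybrid_score(inbound_links, age_hours, half_life_hours=6.0):
    """Blend link-based authority with time-based freshness (illustrative only)."""
    # Authority: more links help, with diminishing returns.
    # The leading 1.0 lets an item with zero links still score above zero.
    authority = 1.0 + math.log1p(inbound_links)
    # Freshness: halves every `half_life_hours` (an arbitrary assumed value).
    freshness = 0.5 ** (age_hours / half_life_hours)
    return authority * freshness

def rank(items):
    """Sort (name, inbound_links, age_hours) tuples by hybrid score, best first."""
    return sorted(items, key=lambda it: hybrid_score(it[1], it[2]), reverse=True)

# Hypothetical items: (name, inbound links, age in hours)
items = [
    ("old_linked", 500, 48.0),
    ("new_unlinked", 0, 0.1),
    ("recent_linked", 20, 2.0),
]
for name, links, age in rank(items):
    print(name, round(hybrid_score(links, age), 3))
```

With these made-up numbers, the two-hour-old item with a handful of links outranks both the brand-new item nobody has linked to and the heavily linked item from two days ago, which is roughly the behavior neither pure paradigm can give you.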
Wolfram Alpha is live, albeit a little unstable. A couple of times I was told “I’m sorry Dave, I’m afraid I can’t do that” and shown a live view of the W|A command center, a quiet room filled with monitors of colorful maps and charts and, yes, a dais in the back upon which Stephen Wolfram himself sat at what looked like an old CRT monitor.
It’s easy to find queries the system can’t process, but it has a surprising amount of functionality – even comparing apples and oranges (with nutritional information!)
One thing that surprised me was that the system displays projected stock paths – see here for the S&P 500 and here for IBM, AAPL and MSFT together. Seems a little bit of a gimmicky thing to feature so prominently… the paths are disclaimed as simulated log-normal random walks based on “historical parameters”, which I assume means drift/volatility.
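For reference, a simulated log-normal random walk can be sketched in a few lines. This assumes the standard geometric Brownian motion setup with an annualized drift `mu` and volatility `sigma`; I don't know exactly how W|A generates its paths, and the function name, the 252-trading-day convention and the numbers below are my own illustrative choices.

```python
import math
import random

def simulate_path(s0, mu, sigma, days, seed=None, steps_per_year=252):
    """Simulate one log-normal random walk (geometric Brownian motion).

    s0: starting price; mu: annualized drift; sigma: annualized volatility.
    Returns a list of days + 1 prices, starting with s0.
    """
    rng = random.Random(seed)  # private generator so results are reproducible
    dt = 1.0 / steps_per_year
    path = [s0]
    for _ in range(days):
        z = rng.gauss(0.0, 1.0)  # standard normal shock
        # Log-returns are normal, so the prices themselves are log-normal.
        log_return = (mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z
        path.append(path[-1] * math.exp(log_return))
    return path

# Made-up parameters: 7% drift, 20% volatility, one trading year.
path = simulate_path(100.0, mu=0.07, sigma=0.20, days=252, seed=42)
print(len(path), round(path[-1], 2))
```

In practice the "historical parameters" would presumably be estimated from the mean and standard deviation of past log-returns, which is why such a projection says little beyond "more of the same, plus noise".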
I enjoyed the Black-Scholes page, full of graphs of all the different greeks, and a working option calculator.
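If you want to check the calculator's numbers yourself, here is a minimal sketch of the textbook Black-Scholes formula for a European call on a non-dividend-paying stock, along with three of the greeks. The function names and the sample inputs are mine, not W|A's.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    """Standard normal probability density function."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def black_scholes_call(s, k, t, r, sigma):
    """Price and greeks of a European call (no dividends).

    s: spot price, k: strike, t: years to expiry,
    r: risk-free rate, sigma: annualized volatility.
    """
    sqrt_t = math.sqrt(t)
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * sqrt_t)
    d2 = d1 - sigma * sqrt_t
    return {
        "price": s * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2),
        "delta": norm_cdf(d1),                       # sensitivity to spot
        "gamma": norm_pdf(d1) / (s * sigma * sqrt_t),  # sensitivity of delta to spot
        "vega": s * norm_pdf(d1) * sqrt_t,           # sensitivity to volatility
    }

# Made-up example: at-the-money call, one year out, 5% rate, 20% volatility.
result = black_scholes_call(s=100.0, k=100.0, t=1.0, r=0.05, sigma=0.20)
for name, value in result.items():
    print(name, round(value, 4))
```

For these inputs the price comes out at roughly 10.45 with a delta of about 0.64, which is a handy sanity check against any option calculator.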
Another nice feature is that clicking any graph or table will pop up a plain text version, convenient for copying data.
The full list of categories it can act on is impressive: 4 twenty-sided dice, or the number of home runs hit by the Mets in 1993. I was surprised when I asked who wrote The Time Traveler’s Wife and was given data about the upcoming movie – although it does claim that August 2009 is a quarter year ago rather than in the future.
On the whole, an extremely powerful calculation engine, as promised. It remains to be seen if this becomes part of my daily use as opposed to a novelty toy/bet settler, but I have a feeling it will. Certainly, the mathematical component and historical data are exceptional; the incredible amount of additional functionality is a bonus.
Around the web, early reviewers seem to be faulting the system for either 1) its inability to draw non-structured information from the web or 2) its inability to give superspecific data subsets, like zip-code level detail. It seems silly to fault a system for being “less than perfect” while it has already achieved a higher standard than any other, with much room for continued improvements. Where is the praise that the system can do what it is currently able to do? Why are we looking this gift horse in the mouth?
That said, my one hope for improvement is the site’s internal search engine – it does not have a good disambiguation system, so unless you type exactly what you mean, it’s unlikely you’ll get it. A display of topics potentially related to your input would be immensely useful.
Congratulations to the Wolfram Alpha team on a successful launch and excellent product!