It appears 2008 might well be shaping up to be the year that semantic technology kicks off: Semantic search engine Hakia has begun licensing its technology, the intelligent organizer Twine is readying for launch, and now natural language search engine Powerset is also considering a near-term launch, as TechCrunch recently noted.

I’ve met with Powerset twice recently, and their progress even over that short timespan appears to have been considerable. A month ago two of Powerset’s founders, Barney Pell and Lorenzo Thione, were showing me how a new index (the data and rules that determine results for search engines) dramatically improved search results within Wikipedia, which Powerset uses as its testing ground.

They’re now mostly happy with the relevance of their search results and are working to build the features and interfaces that will determine how users interact with the engine. The San Francisco-based company has gone from refining its search indexing abilities to building out some fascinating tools that can parse, chop, mash-up and re-display the sentences and paragraphs that are crawled by its engine.

In our most recent meeting, Pell laid out Powerset’s new unofficial motto: “We’re not a search engine.” That’s not a surprising assertion, considering that all of the semantic startups have been trying to dodge the hyped-up “Google Killer” label since their inception. But it’s worth explaining exactly how Powerset, a company that wants to be used to search for information on the Internet, is not a search engine.

Say you’re searching on Google to learn about naval history. Here’s your problem: When Google returns its thousands of results, you actually have to go to web pages individually to see if they’re what you want. If they are, you then have to search through those pages for the information you want. If you want to know about a particular naval battle but can’t remember its name, the search could quickly become frustrating. A search for naval battles during the Civil War will be helpful, but it requires some effort to hunt through the results (try it yourself).

Powerset’s technology, however, can provide sets of results based either on entire web pages — as Google does — or on specific sections of those pages, which is helpful if they’re long, like Wikipedia entries. But where the company is headed is toward reading through pages for you and arranging or condensing the information it finds to just tell you an answer.


That means I could potentially ask Powerset, “What were the major naval battles of the Civil War?”, and immediately find in a list of results what I was looking for, the bizarre fight between the Monitor and the Merrimac. If I had to go to a page to search through the information, Powerset might have some tricks to aid in the search, pointing out sentences that seemed to be likely matches.

Figuring out the exact form those tricks will take is Powerset’s immediate challenge. I got to see a handful of tools and features, but only after swearing secrecy, in part because there’s no certainty as to what will be included in the company’s public release. In that respect, Powerset faces the same challenge as Twine and Hakia: because it’s trying to create new ways of navigating the Internet, there’s no tried-and-true model to copy.

Before launching, Powerset will have to settle on a way to return data it thinks is of the most use in a search. And of course, there’s the challenge of processing millions of these requests at a time; whatever it does has to scale up for public use. Overcoming those challenges could drag out the launch date. (Some of Powerset’s team have vowed not to shave their moustaches until their public release, a project they’re calling Powerstache.)

In the meantime, Powerset will keep stressing that it’s no Google, simply because the team doesn’t want to invite derision by making grandiose claims. Yet that raises the question of whether the company could ever approach Google’s value. After all, it hasn’t yet shown that it can expand beyond Wikipedia. And what’s the point of getting excited about a company that’s basically a glorified research tool?

The answer lies in the possibilities. Google is still the best way to hunt through vast numbers of silos (web pages) containing information when you’re looking for a specific fact. No new technology will seriously challenge that ability for a year or two, at least. But a technology like Powerset could short-circuit Google’s process by just giving you the damn fact already, instead of listing relevant websites.

We like Google for its speed and efficiency. We’ll like Powerset, or something similar, even more. A decade from now, we’ll look back and wonder how we survived with just Google; it’s a whole new ball game, one that will give us a whole new suite of tools indispensable for navigating the Internet. It remains to be seen how well Powerset will play into that future.


7 Trackbacks

  1. April 11th, 2008
    10:37 am

    ResourceShelf » PowerSet: Don’t Call Us a Search Engine said:

    [...] From the article: It appears 2008 might well be shaping up to be the year that semantic technology kicks off: Semantic search engine Hakia has begun licensing its technology, the intelligent organizer Twine is readying for launch, and now natural language search engine Powerset is also considering a near-term launch, as TechCrunch recently noted. [...]

  2. Alt Search Engines » Blog Archive » Can Powerset Unseat Google in Web Search? said:

    [...] true catch-22! Of course, the Powerset folks realize that this is not an easy battle to win. In a recent interview with VentureBeat, Powerset founder Barney Pell laid out Powerset’s new unofficial motto: “We’re not a search [...]

  3. Nodalities » Blog Archive » This Week’s Semantic Web said:

    [...] Powerset: Don’t call us a search engine [...]

  4. May 12th, 2008
    11:21 am

    Powerset Launches, Verdict: Meh. | 20bits said:

    [...] might be repudiating the Google-killer label now, but here’s an excerpt from a February, 2007 press release: ”The time is right to tell the [...]

  5. May 13th, 2008
    6:55 pm

    Tech, How to, Software Reviews, Linux, Dog, Make Money Online with AhTim said:

    Instant Domain Name Search Engine…

    If you still remember, I have posted 7 steps to choose a good domain name. Have you got your own first level (.com, .net, .org) domain name? I still remember when I search for my own domain name, it is not easy to find suitable one.
    Now with Domize, th…

  6. July 7th, 2008
    8:56 am

    Deal Radar 2008: Powerset - Sramana Mitra on Strategy said:

    [...] words and of the relationships between them; similar words; and categories to find better results. VentureBeat discusses how Powerset’s results focus not only on web pages but also portions of these pages [...]

4 Comments

  1. Kait said:

    So, search seems to be the only real hot app for semantic technology. Are there any other ideas out there in the VB readership for semantic technology apps?
    Kai

  2. Michael Belanger said:

    Yes Kai,

    Semantics is about expanding graph theory to create and be the evolving index of the entire human experience. Each of our lifetime’s human experience can benefit. Semantics done right will make us “always aware” without information overload. Semantics in our evolving digital world will be providing enough new value in many dimensions that it will embrace all grass roots communities of practice (academic) and communities of interest (social). Because sub-domain experts want to be known and their wisdom to be findable, they will all develop their own digital knowledge bases of graphical representations of their rules, tangible resources, intangible resources and expectations of behavior. Highly granular abstractions (complex graphs) of each sub-domain’s inventory of knowledge objects will be exposed to semantic index crawlers that will periodically examine their graph fragments and add updates to the central meaning-based distributed index of human experience (IHE). The abstractions’ fragments within the IHE comprise abstracts of what is taught within each original sub-domain file as expressed by its own domain knowledge base. The IHE contains the fragmented essence of each domain’s current knowledge “for the world to be aware”. Each of us can benefit by many paths. One path is highly articulate natural language search. The most important will be posting highly articulated persistent queries to the IHE to detect new information you seek as soon as the semantic crawlers post it to the index. Any topic of interest, course of study or monitoring of your own health and wellness will be “current.” The IHE is not The Semantic Web. The IHE will evolve at an accelerating rate over the next few years.

    Michael

  3. Just a guy said:

    “We are not a search engine” but please invest in us as 1% of the search market is worth a Billion Dollars… But we are not a search engine…but come to our site when you need information from the web…but we are not a search engine.. They are right they are a hype engine. Shut up and show us what you are.

  4. May 12th, 2008
    7:32 am

    Bibokz said:

    Building a search engine with the intention of selling ads is like building a castle in the sand.
