Powerset opens to everyone — now, what’s next?

Today, natural language search engine Powerset is finally opening its doors to everyone, as well as unveiling a set of tools that have previously been seen by only a handful of people. This marks the end of a long phase for Powerset — since we first outed the stealth-mode company in 2006, it has kept its product under wraps for over a year and a half.

Powerset is, at the moment, essentially a search technology that has been developed into a set of tools that help dig through large amounts of written information (although it’s currently restricted to Wikipedia and Freebase). As I suggested about a month ago, Powerset’s current setup is great for research, giving users the ability to sift through lots of data without reading a great deal more than necessary.

Now I could create more written information of my own about Powerset, but a few screenshots will do a better job of showing the engine’s strengths and weaknesses. First up, here’s a shot of me asking Powerset what the hormone melatonin is. Note the amount of information that’s shown in a compact space: A summary of the hormone, what its effects on the body are, and links to several articles (including an alternate choice, in case you were looking for the music album).



The above example shows Powerset’s greatest strength: When asking it short questions, it generally “gets” what you’re looking for, or at least seems to (in reality, it doesn’t “understand” your query any more than Google would). To show off the middle portion of the page a little more, I’ve done a second query, this time on what Einstein did. Note that the output is arranged by verbs: Einstein published, developed, used, etc. I’ve clicked on “theory” under the “developed” heading, and come up with a list of his most famous theories.



Finally, here’s a snippet of a Powerset result page. What Powerset has done is reach into the semantic data storage at Freebase, pull out details about Barack Obama, and arrange them in a list. Wikipedia has more or less the same information, but human editors had to take time to put it together. There’s an obvious advantage to using Powerset here:



For an example of a question that can make Powerset choke, just take a look at this: “Who was in the cast of Breakfast at Tiffany’s?” Here, Powerset can’t yet do any better than Google would, because the question is too long. For an experienced user, that’s not a problem; they’ll know how to ask the right questions. However, assuming a level of proficiency is also a weakness. One of Google’s advantages is its drop-dead simplicity; enter a single term, and it does a decent job of returning information. Most people don’t know that Google has more advanced search tools, and if they did, they likely still wouldn’t take the time to learn them.

But for someone looking for information with an eye toward finding it quickly, Powerset does a good job. Wikipedia and other information portals were built around Google’s weakness in addressing that audience — so when you search for Einstein on Google, instead of really going out and searching for a great answer, Google is likely to just return Wikipedia as the top result, simply because Wikipedia has proved itself reliable. Powerset has more potential to dig through information intelligently, and choose its results based on quality, not just relevance and reliability.

Of course, for now Powerset is returning information almost entirely from Wikipedia, which is already pretty well-structured for finding information, so most people still aren’t likely to switch away from Google. That means the company’s next step is to broaden its sources. It’ll likely start out with large, well-structured websites like the movie database imdb.com, and from there branch out to smaller sites.

On a broader level, there’s a question of what some dark horse competitors will end up launching. One company to keep an eye on is the stealth startup Cuill, which just raised a $25 million round, bringing its total to $34 million — well over twice Powerset’s $14.5 million. Blekko is another stealth search startup that some are speculating will operate in the same space, while Hakia is busy aiming at search verticals.

However, the profusion of search sites might not be a bad thing, for once. While dozens of companies (at least) have eyed Google’s lofty perch, only to burn out without significant results, an alternate future could involve room for multiple approaches to search. An open platform that returned the results from the most applicable search engine could tie the various competitors together, as I suggested in my article on Viewzi. With several serious, differentiated contenders to Google search, such a platform could open the way for new search offerings, while still keeping Google as the leader.

By the way, there’s a rumor floating around that Powerset wants to sell, because it has hired an investment bank to check around. The latest bid, says CNET, is around $100 million, from Microsoft. Here’s the thing: Powerset’s investors probably want a decent return on their money, and $100 million likely won’t cut it (although the company isn’t commenting). Maybe twice that amount would, and Microsoft does have a few extra billion dollars lying around, but everyone will likely want to sit back and see how Powerset does in the wild first.

About the Author, Chris Morrison

Chris Morrison writes about cleantech and environmental issues for VentureBeat, with occasional forays into gaming and semantic technology. He got his start writing about tech for Business 2.0 magazine, but quickly realized new media was the ticket when that institution closed its doors in 2007. Chris has also covered public equities and regulatory issues. He originally hails from southern Virginia, graduated from Evergreen State College in Washington, and now lives in San Francisco.

  • asdfasdf
    Cognition.com's NLP is much further along than Powerset. Cognition's Semantic Natural Language Processing (NLP) technologies add word and phrase meaning and understanding to computer applications, providing a technology and/or end-user with actionable content based upon semantic knowledge. This understanding results in simultaneously much higher precision and recall of salient data within the universe of possible results. Cognition's Semantic NLP™ makes technologies and applications more human-like in their understanding of language, thereby resulting in more robust applications, greater user satisfaction and new capabilities available for exploitation. On the Web in particular, powering applications with Cognition's semantic understanding technology drives these applications ever closer to Web 3.0 (the semantic Web).

    Cognition - Giving technologies new meaning.™

    Introduction
    Cognition Technologies, Inc. ("Cognition") is a next generation Semantic Natural Language Processing (NLP) company, based in Culver City, CA.

    What is Semantic NLP?

    Semantics is the sub-field of linguistics that is devoted to the study of meaning, as expressed by words, phrases, sentences, and even larger units of speech or text.
    Natural Language Processing (NLP) is a sub-field of artificial intelligence and computational linguistics. It studies the problems of automated generation and understanding of natural human languages by computers.
    Cognition's Semantic NLP™ is technology that "understands" word and phrase meanings within context in modern computer applications. Cognition's mission is to make its clients' technologies and applications more human-like in the understanding of language and more profitable.
    Cognition's Semantic NLP has been in development for over 23 years by Dr. Kathleen Dahlgren, Cognition's co-founder and CTO, and a team of linguists and computer scientists. Cognition's technology employs a mix of linguistics and mathematical algorithms which has, in effect, taught the computer the meanings of virtually all the words and frequent phrases within the common English language. Semantic Natural Language Processing is superior to common pattern matching that is found in most search engines and text-interaction tools because it focuses on the understanding of word and phrase meanings within context. No other commercially available natural language processing technology comes close to Cognition in its breadth and depth of understanding the English language.

    Statistics
    Cognition's Semantic NLP technology contains one of the world's largest computational dictionaries. It includes:

    506,000 Word Stems (the base forms of a word)
    536,000 Concepts
    17,000 Ambiguous Words - the most frequently used words in English language
    191,000 Phrases
    Over 4 million semantic contexts
    76,000 synonym sets

    Cognition's place in the world related to the "Semantic Web" (Web 3.0) and Google
    Cognition employs semantic technology to delve into the meaning of words and phrases, and unlike others who are trying to make the Semantic Web a reality through hand-tagging, such as Web Search, Cognition applies its Semantic NLP to other technologies to give these products and services a differentiation and competitive edge.

    "We look at what we're doing as a significant component to the Semantic Web," said Scott Jarus, Cognition's CEO, "Our focus on semantically enhancing other technologies means we're not competing with Google, Yahoo! or other consumer Search engines. Indexing the entire World Wide Web ourselves is not currently on our business roadmap. However, we might become a semantic component of someone else's application which may index deep content on the Web similar to the examples you can see on our Website."

    Management
    Scott Jarus
    Chief Executive Officer

    Scott joined Cognition Technologies in 2006 as an investor and then as its CEO. Mr. Jarus has more than 25 years of management experience in the telecommunications and Internet industries, beginning with a company that built one of the world's first public packet-data switching networks. Prior to joining Cognition, Scott was President and chief executive of j2 Global Communications, Inc. (NASDAQ: JCOM), a profitable billion dollar market cap company whose signature product, eFax®, served more than 9.5 million customers with a local presence in more than 1,500 cities in 25 countries on 5 continents. Preceding j2 Global, Mr. Jarus was President and Chief Operating Officer for OnSite Access, the premier building-centric Integrated Communications Provider (voice, data, Internet and enhanced services) serving businesses in 22 markets throughout North America. In addition, he served in various senior management positions at RCN Telecom, Multimedia Medical Systems (which he co-founded) and Metromedia Communications.

    Mr. Jarus serves on the Board of Directors of FreeConference.com and Ironclad Performance Wear [ICPW.OB]. In 2005, Mr. Jarus was named the National Ernst & Young Entrepreneur Of The Year for Media/Entertainment/Communications (and Los Angeles Entrepreneur Of The Year for Technology in 2004). He holds a Bachelor of Arts degree in Psychology and a Master of Business Administration degree from the University of Kansas.

    Kathleen Dahlgren, PhD
    CTO / Founder

    Dr. Kathleen Dahlgren is the Founder and Chief Technology Officer of Cognition Technologies. She began her career as a professor of computational linguistics at Pitzer College of the Claremont Colleges and then worked for IBM at their Los Angeles Scientific Center, focusing on building a "natural language understanding system." Dr. Dahlgren has a Ph.D. in Linguistics and a post-doctorate in Computer Science from the University of California, Los Angeles. She has published a number of scholarly articles on the subjects of linguistics and computer science, and is the author of Naive Semantics for Natural Language Understanding. She is the co-author of Cognition's seminal patent (1998), and she received the Small Business Innovation Award from the U.S. Army in 1995. Currently, she is also an adjunct professor of Linguistics at the University of California, Los Angeles.
  • Jim
    We have no idea why they take such a load of VC money. They did a bunch of things wrong...
    Powerset reminds me of "Waterworld," the movie. They can't beat Google. Waste....