New search engine designed to interpret plain English queries

By Michael Liedtke

SAN FRANCISCO – The entrepreneurs gathered for an exclusive high-tech conference here Monday all hope to dazzle the crowd with their ingenuity.

But one startup, Powerset, is pursuing a particularly challenging goal: It’s aiming to outshine the Internet’s brightest star with a new search engine built to outsmart Google.

After nearly two years of hushed development, Powerset is finally providing a peek at a “natural-language” technology that is supposed to make it easier to communicate with search engines.

Powerset’s algorithms are programmed to understand search requests submitted in plain English, a change from the “keyword” system used by Google Inc., Yahoo Inc., Microsoft Corp. and the owners of the other leading engines.

The distinction means Web surfers will theoretically be able to get more meaningful results by typing more precise search requests in the form of straightforward questions like “What did Steve Jobs say about Apple?” instead of entering an ungrammatical mishmash like “Apple Steve Jobs said.”

Barney Pell, Powerset’s co-founder and chief executive, likens the hit-and-miss process of searching with keywords to talking to a 2-year-old.

“In one sense, you are happy you can talk to it (at) all, but you still really want it to grow up so you can hold a real conversation,” he said.

This isn’t the first time a search engine has tried to understand simple English, but Powerset has drawn more attention because its natural-language technology is being licensed from the Palo Alto Research Center.

Better known as PARC, the Xerox Corp. subsidiary is renowned for hatching breakthroughs – like the computer mouse and the graphical interface for personal computers – that were later commercialized by other companies.

PARC’s top natural-language specialist, Ronald Kaplan, is now Powerset’s chief technology and scientific officer.

“We have the best natural-language search technology that has ever been developed,” Pell, an artificial intelligence expert, boasted in an interview last week.

Backed by $12.5 million in venture capital, Powerset offered its first public preview Monday at the conference hosted by TechCrunch, a blog widely read by venture capitalists.

Powerset is gradually opening its testing ground, dubbed Powerlabs, to 16,000 people who signed up to get an early glimpse at the search engine. During this test phase, Powerlabs is only indexing material from Wikipedia, a popular Web encyclopedia.

The San Francisco-based startup is so confident that its methods are superior to Google’s that Powerlabs will present some answers alongside what its rival returns when asked the same questions. Powerset is requiring its users to vote on which engine produced better results before they are allowed to enter another search request.

“Google is the king,” Pell said. “Their system does an amazing job, given what they have to go on. But we think they have plateaued.”

Much larger companies – all relying on keyword search – haven’t been able to knock Google from its pedestal.

Despite huge investments in search by Yahoo and Microsoft, Google has steadily expanded its market share during the past three years and now processes more than half of all search requests on the Internet.

But even Google’s executives acknowledge that today’s search technology doesn’t do as good a job as it should in divining what people are looking for on the Internet.

That’s one reason Google has hired thousands more workers and spent nearly $2.2 billion on research and development since 2005.

Other search engines have previously promised to understand conversational English, with little success.

In the 1990s, Ask Jeeves was founded on the premise that Internet search requests should be presented as simple questions. It frustrated users with too many irrelevant answers.

After nearly failing in the dot-com bust, the company embraced the keyword approach to search and abandoned its mascot – a cartoon butler named Jeeves – to distance itself from the days when it relied on natural-language algorithms.

It is now known simply as Ask.com.

More recently, New York-based Hakia has been tackling natural-language search requests without making much of a dent in the market.

Industry analyst Charlene Li of Forrester Research is skeptical about Powerset’s prospects, too.

She doubts Powerset will be able to comprehend all the different ways that people seeking the same type of information can phrase their questions.

For instance, the questions “What caused the collapse of Enron?” and “What caused the downfall of Enron?” typically produce different search results even though they are essentially asking the same thing, Li said. That’s because computers have trouble recognizing synonyms and other subtle nuances in language.

“Understanding the meaning of many words is difficult without people involved,” she said.