Qi Lu, who left Yahoo after a decade to run Microsoft’s Online Services Division, is talking on-stage at the Web 2.0 Summit in San Francisco today about Bing’s just-announced Twitter-search capability.

Here’s the core news:

  • Microsoft has a non-exclusive deal with Twitter to incorporate all the public information in the Twitter stream into search results in real-time. It’s going live shortly. They’re calling it “Bing Wave 2.” Financial terms are not disclosed.
  • Microsoft also has a deal with Facebook (but there are fewer details here).
  • Lu says he thinks about search “holistically.” Search used to be about crawling web pages and making sites accessible to the user. But now it’s really become about evaluating a user’s intent over time computationally. Tim O’Reilly, who is hosting the conference, and other audience members keep referring to this ambition as building a “mind-reader.” But Lu chooses not to use these words.

Here are more details of how Bing’s Twitter search will look: It can show tweets by how recently they’ve been published, but Bing can also show the most popular tweets and collapse the copycat tweets into the result.

They’ll assign higher result rankings to tweets that have been retweeted more or that contain a link to another information source. The search engine will also show a tag cloud of trending topics.
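To make the ranking description concrete, here is a minimal Python sketch. It is not Bing’s actual method, which Microsoft did not describe in detail. The `Tweet` fields, the `link_bonus` weight, and the rule for spotting copycat tweets (matching text once any leading “RT @user:” prefix is stripped) are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass

URL_RE = re.compile(r"https?://\S+")


@dataclass
class Tweet:
    text: str
    retweets: int
    timestamp: int  # seconds since epoch


def normalize(text):
    """Strip leading 'RT @user:' prefixes, lowercase, collapse whitespace."""
    text = re.sub(r"^(rt\s+@\w+:?\s*)+", "", text.strip(), flags=re.IGNORECASE)
    return " ".join(text.lower().split())


def score(tweet, link_bonus=5.0):
    """Hypothetical popularity score: retweet count plus a bonus for a link."""
    return tweet.retweets + (link_bonus if URL_RE.search(tweet.text) else 0.0)


def rank(tweets, by="popular"):
    """Collapse copycat tweets, then order by popularity or by recency."""
    groups = {}
    for tweet in tweets:
        key = normalize(tweet.text)
        best = groups.get(key)
        # Keep the highest-scoring tweet as the representative of each group.
        if best is None or score(tweet) > score(best):
            groups[key] = tweet
    collapsed = list(groups.values())
    if by == "recent":
        return sorted(collapsed, key=lambda t: t.timestamp, reverse=True)
    return sorted(collapsed, key=score, reverse=True)
```

Calling `rank(tweets)` gives the “most popular” view, and `rank(tweets, by="recent")` gives the chronological one. A production system would presumably use fuzzier duplicate detection and many more signals.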

It can also pull out the most popular links embedded in tweets. Bing also expands any full URLs that have been compressed into Bit.ly or other shortened links. (URL shorteners have been a point of concern because they hide the destination and are an easy vehicle for spam or links to malicious code.)
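As a rough illustration of surfacing popular links behind shortened URLs, the Python sketch below counts link destinations across a batch of tweets. How Bing does this was not explained on stage. The function names are hypothetical, and the `expand` callback is an assumption: it stands in for whatever resolves a short link to its destination. `expand_via_http` shows one plausible resolver that simply follows HTTP redirects using the standard library.

```python
import re
from collections import Counter
from urllib.request import Request, urlopen

URL_RE = re.compile(r"https?://\S+")


def expand_via_http(url, timeout=5.0):
    """Follow redirects and return the final URL (needs network access)."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return response.geturl()


def most_popular_links(texts, expand, top_n=3):
    """Count expanded link destinations across tweets, most common first."""
    counts = Counter()
    for text in texts:
        # Use a set so each destination counts at most once per tweet.
        destinations = {expand(url) for url in URL_RE.findall(text)}
        counts.update(destinations)
    return counts.most_common(top_n)
```

Passing `expand_via_http` as the callback resolves links over the network; for offline experiments, a dictionary lookup such as `lambda url: known.get(url, url)` works as a stand-in.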

And here’s a transcript of the conversation (as best as I can type):

Lu, on why he left Yahoo: The main reason I took the job was the opportunity to have a larger, enduring impact. That’s been my career motivation. We launched Bing over five months ago and have seen some good traction.

We think about search holistically. We believe search in a broader sense is about computationally understanding user intent. User intent means the purpose the user is trying to accomplish — and incorporates their interests and needs.

One part is explicit. Nobody forces you to type anything. But intent can also be implicit, and it can be less specific, and it can be latent or recurrent. We need to think about the form of capturing and understanding intent computationally to fulfill those needs. Today’s form of search is a box. We are systematically pursuing really rising to a level of understanding user intent.

Tim O’Reilly: Your overall goal is to get better than anybody else in understanding user intent?

Lu: In the beginning, you had a bunch of web sites linking to each other. In the early days of search, there was this idea of navigational intent — people were looking for a site and were trying to remember the URL. At the time I was at IBM research lab, you’d look at the anchor text and links. We’ve pretty much solved that need. By and large, you can find the sites.

O’Reilly: So you think there’s a lot of opportunity still? You’ve got some good momentum out of the blocks, so what’s next?

Lu: If you look at the history, because I think it’s important, a few years later there was commercial intent. People are looking for things to buy. Instead of crawling and indexing the web, why not create a marketplace? Today we have a large repository of imagery with sites like Flickr. It’s far better to use images to fulfill that intent. And then we have YouTube. We just need to build in more technologies to tap the richness of the web repositories. Then we have things like Facebook and Twitter. Particularly, things like Twitter. It’s an emerging communications platform. The salient features are still evolving. In its early stages, you can see vibrance in it. It will enable people to find information.

(Microsoft now demos Bing 2 with Twitter.)

O’Reilly: Is this a windfall for Twitter? Is this their business model?

Lu: We do not disclose financial terms.

O’Reilly: What’s the length of time on the deal?

Lu: I do not know the specific terms. This is a start. There will be more opportunities between us and Twitter and other parties.

O’Reilly: If your goal is to really build a mind-reader, you’re going to have to know a lot more about us. How do you think about the privacy concerns?

Lu: Privacy is one very serious problem we need to think about as an industry as a whole. There are a few key principles we need to have — full disclosure. More importantly, the purpose of having the data and building the model is to create value for the user. The beauty is you can actually model user intent by looking at the web corpus. If you’re doing web search for a fully developed economy like the U.S., there’s a lot of data and vibrancy. How do we need to know a person’s commercial intent? You can look at the web corpus.

Jeremiah Owyang: Will this real-time Twitter feed influence actual Bing results?

Lu: A great question. There are a couple of things. One is that a real-time corpus contains a lot of important velocity signals. Particularly when things are trending up, you can use those signals for a variety of purposes. You can use these to augment today’s search experience.

I’m not a heavy user of Twitter, but my daughter goes to Saratoga High and there are people who are tweeting about sport events. Now I can find that information. Anytime you fundamentally lower the cost of consuming and producing information, you open up so many possibilities. Search can unlock a lot of that value.

O’Reilly: Are you keeping the Twitter stream? Are you archiving the firehose?

Lu: I don’t want to answer that specific question because I want to be accurate.

Question: Can you talk about the data-sharing between you and Yahoo? And lastly, with Twitter, is it non-exclusive?

Lu: It’s non-exclusive. With Yahoo, there’s a set of core principles. There’s a complete set of privacy protections in how the MSFT-YHOO partnership will work.

