Here's what Facebook Graph Search is doing next

When you take a billion people and throw in a few trillion places, things, concepts, attributes, and connections, you get a data set of mind-boggling scope. So when Facebook decided to launch Graph Search last month, it emphasized that the keyword-search paradigm wasn't gonna cut it this time.

Graph Search was born, then, as object-oriented search, where each person, place, or thing is a node with hundreds or even thousands of attributes, defined as "edges" or connections to other nodes.

Right now, Graph Search is still just five weeks old and has rather limited functionality; Facebook says it's about one percent complete. For now, you can perform simple queries to find friends of yours who live in a certain place or like certain things.

What Facebook has coming up, however, is much more interesting, complex, and useful.

Let's take a computer-science detour into the weeds. Edge identifiers -- those attributes that show what you, a Facebook node, are connected to -- are just numbers. There's a unique number (between one and nineteen digits long) assigned to each attribute: people who are your friends, people who live in San Francisco, people who like Kanye West, and so on. Add a little "and" operator in Graph Search, and you can search for your friends who live in San Francisco and like Kanye West.
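To make that concrete, here is a minimal sketch of an "and" query as a set intersection. It is not Facebook's code: the index layout, the readable string keys (real edge identifiers are numbers), the user IDs, and the `and_query` helper are all made up for illustration.

```python
# Hypothetical edge index: each attribute maps to the set of user IDs
# that carry that edge. Real edge identifiers are numbers; readable
# strings are used here only to keep the example easy to follow.
edge_index = {
    "friend:me": {101, 102, 103, 104},
    "lives-in:san-francisco": {102, 103, 105},
    "likes:kanye-west": {103, 104, 105},
}


def and_query(index, *terms):
    """Return the IDs that appear under every term (set intersection)."""
    posting_sets = [index.get(term, set()) for term in terms]
    if not posting_sets:
        return set()
    # Start from the smallest set so each intersection step stays cheap.
    posting_sets.sort(key=len)
    result = set(posting_sets[0])
    for ids in posting_sets[1:]:
        result &= ids
    return result


matches = and_query(
    edge_index, "friend:me", "lives-in:san-francisco", "likes:kanye-west"
)
print(sorted(matches))  # [103]
```

Only user 103 sits in all three sets, so that is the one friend who both lives in San Francisco and likes Kanye West.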

If you've read the Facebook Engineering blog post by Lars Rasmussen on how Graph Search works, you'll recognize this as the backbone of Unicorn, the technology behind all Facebook search.

This kind of search is simple. It shows off things you're already connected to, narrowing down by common attributes or edges -- what people like, where they work, whom they know -- and homing in on specific individuals. It's like taking a metal detector to a haystack to find the one needle you want.

"But what would be really cool is if we could help people find entirely new haystacks," said Facebook engineer Mike Curtiss in a meeting today at Facebook's Menlo Park, Calif., headquarters, "to search for things that are not directly connected to them."

More complex queries require the "apply" operator. This operator "is what makes Graph Search possible," said Curtiss. It's not just keyword inputs and outputs; it's object-oriented search that takes entire nodes in a web as inputs and outputs. Thinking algebraically, it acts as parentheses, allowing you to sort information in very specific ways, filtering and filtering again until you get the single gold nugget you're panning for.

For example, imagine you're a high school student in your junior year, starting on college applications, and you want to find friends of friends who went to Harvard to ask for helpful pointers. Or you're an HR recruiter and want to find photos of your friends (i.e., potential new hires) who like Burning Man (i.e., have a drug and/or performance art problem). Or you want to find Russian restaurants in San Francisco that are liked by people who list Moscow as their hometown.
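The friends-of-friends query can be sketched in a few lines. This is only an illustration of the idea of feeding one query's output nodes into another lookup; the toy graph, the names, and the `apply_edge` helper are assumptions, not Unicorn's actual operator.

```python
# Hypothetical toy graph: who is friends with whom, and who attended where.
friends = {
    "me": {"ana", "ben"},
    "ana": {"me", "cara", "dev"},
    "ben": {"me", "dev", "eli"},
    "cara": {"ana"},
    "dev": {"ana", "ben"},
    "eli": {"ben"},
}
attended = {"harvard": {"ana", "cara", "eli"}}


def apply_edge(edge_map, nodes):
    """Follow edge_map from every node in `nodes` and union the results."""
    reached = set()
    for node in nodes:
        reached |= edge_map.get(node, set())
    return reached


# Inner query: my friends. Outer query: their friends.
my_friends = friends["me"]
friends_of_friends = apply_edge(friends, my_friends) - my_friends - {"me"}

# Final filter: keep only the ones who went to Harvard.
harvard_contacts = friends_of_friends & attended["harvard"]
print(sorted(harvard_contacts))  # ['cara', 'eli']
```

Ana went to Harvard too, but she is a direct friend, so she is dropped from the friends-of-friends set before the filter runs.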

"We can answer a virtually infinite number of queries, we can answer queries we didn't even know people would do," said Curtiss. "There are some limitations, but there's nothing architecturally preventing it from going really far.

"The biggest problem is the result set sizes tend to increase exponentially. If a node is connected to 100 other nodes ... you can get 10,000 output nodes. In another round of execution, you get one million output nodes. This is literally an exponential problem, a difficult problem to scale. ... You can't just solve it by throwing more machines at it."

That being said, certain Open Compute projects like the radly named Dragonstone server were created with projects like Graph Search in mind.
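The growth Curtiss describes is easy to check with a little arithmetic. The sketch below assumes the worst case, where every node has the same number of connections and none of them overlap; the `fanout_sizes` helper is hypothetical.

```python
def fanout_sizes(degree, rounds):
    """Upper bound on output nodes after each round of execution,
    assuming every node has `degree` connections and no two overlap."""
    return [degree ** r for r in range(1, rounds + 1)]


print(fanout_sizes(100, 3))  # [100, 10000, 1000000]
```

Real graphs overlap heavily, so actual result sets are smaller, but the shape of the curve is the same: each extra round multiplies the work by the average number of connections.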

And until the hardware is perfectly capable of crunching billions of nodes with the greatest of ease, Facebook is using social ranking to show the most relevant or interesting results. Think of it as a combination of Unicorn and the apply operator and EdgeRank (the street name for Facebook's News Feed-sorting algorithm). "It's like we have a custom search engine for each user," Curtiss said.

Once Facebook dumps all the Open Graph data into the mix -- way beyond what pages you like, including what you bought, sites you've commented on, your online game scores, etc. -- the computations get even more complex, the filters for relevance even more clever.

For now, Graph Search is still very much a work in progress, being used by just a few hundred thousand English-speaking users. What is live on Facebook.com today is a baby demo of what Graph Search may eventually become.

Top image courtesy of Jolie O'Dell