Under the hood: How Facebook built Trending topics with natural language processing

Today, Facebook launched Trending, a new feature that shows you relevant-to-you topics that are spiking in popularity.

It’s like Twitter’s trending topics feature, except that every person on the network sees a different list of topics based on their own personal interests, Likes, friends, location, etc.

In a conversation with Chris Struhar, a software engineer on News Feed, we learned a bit about what makes Trending tick.

First, let’s dispel the myth that Trending is in any real way linked to hashtags, which the company introduced last year.

“Hashtags and topics are two different ways of grouping and participating in conversations,” said Struhar. In other words, Facebook can recognize a string as a topic even when there is no hashtag in front of it.

Rather, it’s all about NLP: natural language processing. Ain’t nothing natural about a hashtag, so Facebook instead parses strings and figures out which strings are referring to nodes — objects in the network.

“We look at the text, and we try to understand what that was about,” said Struhar.

“We’re separate from the Graph Search team, but both products want to give you more control over what you see on Facebook, to slice and dice the graph and get different pieces of information.”

Graph Search, which is basically a database query language for dummies, uses a lot of NLP to populate queries for Facebook users. But more interesting is how Facebook engineers have worked incredibly hard to process the natural language lying all around the network in form field entries, status updates, Notes, and comments.

All those strings get parsed into what Facebook calls entities — nodes in the network — including people, places, things, events, topics, etc. And each node has many edges, such as Likes, checkins, hashtags, comments, etc. And then there’s the junk data.

Graph Search depends on a thorough, NLP-derived understanding of these nodes and edges. So does Trending.

“Both [trending and Graph Search] do use the technology that takes a string of text and tries to understand the node in the graph you’re referring to,” said Struhar.
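To make the string-to-node idea concrete, here is a minimal sketch of dictionary-based entity linking. It is not Facebook's implementation, which Struhar did not describe in detail; the alias table, the node IDs, and the `link_entities` function are all hypothetical, and a production system would presumably need disambiguation and far more data.

```python
import re

# Hypothetical alias table: lowercase surface form -> node ID.
# None of these IDs come from Facebook; they are illustrative only.
ALIASES = {
    "nelson mandela": "person:nelson_mandela",
    "mandela": "person:nelson_mandela",
    "baltimore orioles": "team:baltimore_orioles",
    "orioles": "team:baltimore_orioles",
    "lunch": "topic:lunch",
}

# Longest alias, in words, so we know how far ahead to look.
MAX_ALIAS_WORDS = max(len(alias.split()) for alias in ALIASES)


def link_entities(text):
    """Return node IDs mentioned in text, in order of appearance.

    Greedy longest-match lookup: at each token, try the longest
    candidate phrase first, so "baltimore orioles" wins over "orioles".
    """
    # Tokenizing on letters and digits drops "#", so "#orioles" and
    # "orioles" resolve to the same node.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    found = []
    i = 0
    while i < len(tokens):
        for n in range(min(MAX_ALIAS_WORDS, len(tokens) - i), 0, -1):
            node = ALIASES.get(" ".join(tokens[i:i + n]))
            if node is not None:
                found.append(node)
                i += n
                break
        else:
            # No alias starts at this token; move on.
            i += 1
    return found
```

Because the hashtag symbol is discarded during tokenization, a post resolves to the same node with or without it, which is the distinction Struhar draws between hashtags and topics.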

“Some of the more interesting problems involved ironing out the algorithms. For example, we saw that lunch was trending every single day right around noon. It makes sense, but it’s not the kind of product experience we want to create. So we compared the number of people that are talking about that topic now to the number of people that were talking about that topic a day ago.”
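The day-over-day comparison Struhar describes can be sketched roughly as follows. The smoothing constant, the `min_ratio` threshold, and the function names are assumptions made for illustration; the article does not say how Facebook actually scores or thresholds a spike.

```python
def spike_ratio(count_now, count_day_ago, smoothing=1.0):
    """Ratio of people discussing a topic now versus 24 hours ago.

    The smoothing term (an assumption, not from the article) avoids
    division by zero for topics nobody mentioned yesterday.
    """
    return (count_now + smoothing) / (count_day_ago + smoothing)


def trending_topics(now_counts, day_ago_counts, min_ratio=2.0):
    """Return (topic, ratio) pairs that spiked, highest ratio first.

    now_counts and day_ago_counts map topic -> number of people
    talking about it in the same time-of-day window.
    """
    scored = []
    for topic, count_now in now_counts.items():
        ratio = spike_ratio(count_now, day_ago_counts.get(topic, 0))
        if ratio >= min_ratio:
            scored.append((topic, ratio))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored


if __name__ == "__main__":
    # Illustrative numbers only: lunch is popular at noon every day,
    # so its ratio stays near 1 and it is filtered out.
    now = {"lunch": 5000, "mandela": 900}
    yesterday = {"lunch": 4900, "mandela": 50}
    print(trending_topics(now, yesterday))
```

Comparing against the same window a day earlier cancels out daily cycles, which is why a topic that is merely popular every noon no longer registers as a spike.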

In addition to looking for minute-by-minute spikes in overall popularity, Facebook also has to personalize Trending for each end user.

“We do look at location,” said Struhar, also reiterating that Likes and friendships play a huge role in what topics show up for each person.

“And we try to personalize content based on what you’re going to be interested in,” he continued. In some cases, your location might send a false positive signal as to your interests. For example, you might live in Baltimore but not care about Orioles news. Conversely, when Nelson Mandela dies, people around the globe care, regardless of location.
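One way to picture how those signals might combine is the sketch below. Everything in it is an assumption: the article names the signals (Likes, friends, location) but not how they are weighted, so the `User` and `Topic` classes, the weights, and the scoring formula are purely illustrative. Location gets the smallest weight here to reflect the false-positive concern.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class User:
    city: str
    liked_topics: set = field(default_factory=set)
    friend_topics: set = field(default_factory=set)  # topics friends discuss


@dataclass
class Topic:
    name: str
    spike: float                 # global spike score, e.g. a day-over-day ratio
    city: Optional[str] = None   # None means the topic is not tied to a place


def personal_score(user, topic, w_like=2.0, w_friend=1.0, w_local=0.5):
    """Scale a topic's global spike by the user's affinity for it.

    The weights are hypothetical. A topic with no affinity still keeps
    its global spike, so a worldwide story can outrank local news.
    """
    affinity = 0.0
    if topic.name in user.liked_topics:
        affinity += w_like
    if topic.name in user.friend_topics:
        affinity += w_friend
    if topic.city is not None and topic.city == user.city:
        affinity += w_local
    return topic.spike * (1.0 + affinity)


def rank_for_user(user, topics):
    """Return topic names ordered by personalized score, best first."""
    ranked = sorted(topics, key=lambda t: personal_score(user, t), reverse=True)
    return [t.name for t in ranked]
```

With these weights, a Baltimore resident who has never Liked the Orioles sees a large global spike ahead of a modest local one, while an Orioles fan in the same city would see the order tilt toward the team.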

Graph Search-like filtering options may come in later iterations of the feature, said Struhar, and content across all topics will follow Facebook’s overarching guidelines for content, including what minors are permitted to see.

Ultimately, said Struhar, “This is just one piece in a larger puzzle for where we want to take News Feed. We want to turn it into a personal newspaper.

“Up until now, we’ve been focused on connecting you to your friends — and they will always be the epicenter of your Facebook. But there’s lots of other stuff that’s happening in the world that’s interesting to me, and we want to get better at showing you that, too.”

And as the network evolves in that direction, the company’s engineers will get ever better at NLP, predictive computing, machine learning, and all the tricky parts of computer science that bring us closer to a true artificial intelligence.

(Did you just get goosebumps? Don’t get too excited. Remember: At the end of the day, it’s all about selling ads. That’s what I have to tell myself when I get too hyphy about the genius nerds in Facebook Engineering.)

