Web 2.0: Facebook's challenges in scaling to 300 million users

When Facebook’s main site takes an extra fraction of a second to load, it can be hard to appreciate the job the company’s done in building a unique home page for each of its 300 million monthly active users.

Furthermore, it’s not a homepage that’s siloed or separate from the experiences of others. Every time you log on, Facebook has to pull data and updates from hundreds or thousands of friends and condense them into the 45 most interesting updates that appear in your news feed. Today at the Web 2.0 Summit in San Francisco, the company’s vice president of engineering, Mike Schroepfer, talked about the sheer amount of data Facebook’s system has to wade through.

By the numbers, here’s the data Facebook has to parse through:

  • People spend 8 billion minutes online on Facebook every day.
  • Users upload 2 billion photos each month, for a total of 20 billion so far.
  • There are 15,000 partners accessing Facebook data through its Connect service.
  • Yesterday, the company serviced over 5 billion API calls. Facebook Platform is growing at a much faster rate than the web site.

Schroepfer said Facebook’s architecture is fundamentally different from its predecessors.

“The way most traditional web sites scale is a really well-understood pattern: more people show up, I add more servers and I split up my databases,” he said. “It’s a perfectly horizontally scalable problem. You can literally throw more machines at the problem.”

He said Facebook fit that traditional pattern when it first launched: the company ran separate college networks, and people could only friend others at their own university. That changed when the company started letting people find friends at other universities, and changed again in 2006, when anybody could join the social network.

“Doing this on a traditional infrastructure is not feasible,” he said.
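
To make that contrast concrete, here is a minimal Python sketch. It is not Facebook's code: the shard count, the `shard_for` helper and the sample network names are all hypothetical. It counts how many database shards a single friend-list read touches when users are partitioned by network.

```python
import zlib
from typing import Iterable

NUM_SHARDS = 8  # hypothetical number of database shards


def shard_for(network: str) -> int:
    """Map a network name to a shard using a stable hash (illustrative only)."""
    return zlib.crc32(network.encode("utf-8")) % NUM_SHARDS


def shards_touched(friend_networks: Iterable[str]) -> int:
    """Count the distinct shards needed to read data for every friend."""
    return len({shard_for(network) for network in friend_networks})


# Early model: every friend attends the same university, so one shard is enough.
same_school = ["harvard"] * 200

# Open model: friends are spread across many networks (names are made up).
mixed = [f"network-{i}" for i in range(200)]

print(shards_touched(same_school))  # 1
print(shards_touched(mixed))        # up to NUM_SHARDS
```

With closed college networks, one page view stays on one shard, so adding shards adds capacity. Once friendships cross networks, a single page view can fan out to most of the shards.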

Facebook responded by building a multi-feed system that can perform 50 million operations per second. “It’s custom-tuned to allow you to see what’s happening in your friend network right now and give that result in between zero and 20 milliseconds,” Schroepfer said.
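
The talk did not go into how the feed system works internally. As a rough illustration of the task described here (gather friends' updates, keep only the most interesting), a Python sketch might look like the following. The `build_feed` function, the numeric scores and the sample data are assumptions made for this example; only the feed size of 45 comes from the figures above.

```python
import heapq
from typing import Dict, List, Tuple

FEED_SIZE = 45  # number of updates shown in a news feed, per the article

# An update is (score, author, text); the score is a hypothetical
# stand-in for however "interesting" an update is judged to be.
Update = Tuple[float, str, str]


def build_feed(friends: List[str],
               updates_by_user: Dict[str, List[Update]],
               limit: int = FEED_SIZE) -> List[Update]:
    """Gather every friend's updates and keep the highest-scoring `limit`."""
    candidates = (update
                  for friend in friends
                  for update in updates_by_user.get(friend, []))
    return heapq.nlargest(limit, candidates, key=lambda update: update[0])


updates = {
    "alice": [(0.9, "alice", "new photo"), (0.2, "alice", "status")],
    "bob": [(0.5, "bob", "shared a link")],
}
feed = build_feed(["alice", "bob", "carol"], updates, limit=2)
print([text for _, _, text in feed])  # ['new photo', 'shared a link']
```

Even in this toy form, the work grows with the number of friends and the volume of their updates, which is why answering in a few milliseconds at Facebook's scale calls for a purpose-built system.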

The company took memcached, an open-source distributed memory caching system, and customized it, scaling it up more than fivefold. Schroepfer said Facebook’s culture, which emphasizes rapid development and big, broad thinking among small teams, was key to this. Today the company has 1.2 million users for every engineer.
