Facebook lets select researchers access 'privacy-protected' data

Facebook today announced the recipients of a grant that offers access to "privacy-protected" data from a fraction of the network's billions of monthly active users around the world. The more than 60 researchers from 30 institutions across 11 countries were selected by two partner organizations, Social Science One and the Social Science Research Council (SSRC). Facebook says it won't play a role in directing their findings in order to ensure the independence of the research.

This program has been several months in the making. Facebook announced last spring that it would seek to promote studies on social media's role in elections by enabling academics to define research agendas and solicit proposals for investigations into various topics, and it said that it would offer access to select data sets. In the intervening months, the company has begun building a data-sharing infrastructure that funnels samples from these corpora in what it says is "a secure manner" that "protects people's privacy," in part by removing personally identifiable information and allowing only approved parties access through a portal that leverages two-factor authentication and a virtual private network.

Facebook says it has been testing differential privacy techniques that inject statistical noise into raw data to make sure users can't be re-identified -- without impacting the reliability of research findings. Additionally, it says that its tools limit the number of queries a researcher can run, preventing them from circumventing privacy protections.
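Facebook has not published the details of its mechanism, so the following is only a minimal Python sketch of the general idea: a counting query answered with Laplace noise, plus a hard cap on how many queries a researcher may run. The class name `PrivateCounter`, the parameters `epsilon` and `max_queries`, and the sample records are all hypothetical and are not taken from Facebook's tools.

```python
import random


class QueryBudgetExceeded(RuntimeError):
    """Raised when a researcher has used up the allowed number of queries."""


class PrivateCounter:
    """Hypothetical sketch: noisy counting queries with a fixed query limit."""

    def __init__(self, records, epsilon, max_queries, seed=None):
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        self._records = list(records)
        self._epsilon = epsilon          # privacy parameter applied to each query
        self._remaining = max_queries    # queries left before access is cut off
        self._rng = random.Random(seed)

    def _laplace(self, scale):
        # The difference of two independent exponential draws with mean `scale`
        # follows a Laplace distribution centered at 0 with that scale.
        rate = 1.0 / scale
        return self._rng.expovariate(rate) - self._rng.expovariate(rate)

    def count(self, predicate):
        """Return a noisy count of the records matching `predicate`."""
        if self._remaining <= 0:
            raise QueryBudgetExceeded("no queries left for this researcher")
        self._remaining -= 1
        true_count = sum(1 for record in self._records if predicate(record))
        # Adding or removing one person changes a count by at most 1,
        # so the noise scale is 1 / epsilon.
        return true_count + self._laplace(1.0 / self._epsilon)


if __name__ == "__main__":
    # Made-up records: (url, number of public shares)
    shares = [("a.example", 120), ("b.example", 450), ("c.example", 101)]
    counter = PrivateCounter(shares, epsilon=0.5, max_queries=2, seed=7)
    print(counter.count(lambda r: r[1] > 110))  # noisy value near 2
    print(counter.count(lambda r: r[1] > 400))  # noisy value near 1
    # A third call would raise QueryBudgetExceeded.
```

The query limit matters because noise alone can be averaged away: repeating the same question many times and taking the mean would converge on the true count. In differential privacy more generally, the privacy loss of successive queries adds up, which is why such systems typically track a cumulative budget.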

The researchers chosen by Social Science One and SSRC (in addition to Social Science One commission members) will be able to use CrowdTangle's API to track the popularity of news items and other public posts across Facebook and Instagram (including public pages, public groups, and verified profiles). They'll also gain access to Facebook's Ad Library API, which provides information about ads related to politics or issues in the U.S., U.K., Brazil, India, Ukraine, Israel, and the European Union, and to a separate data set of URLs (along with their total shares, text summaries of content, engagement stats, and fact-checking ratings) that were shared on Facebook by at least 100 unique users who've posted them with public privacy settings.

Facebook says researchers will be required to attend a June training session about the URL data set and the research tool in order to use it.

"We hope this initiative will deepen public understanding of the role social media has on elections and democracy and help Facebook and other companies improve their products and practices," wrote vice president for special projects Elliot Schrage and strategic initiatives manager Chaya Nayak in a blog post. "This initiative will deepen our work with universities around the world as we continue to improve our ability to address current threats and anticipate new ones ... Over the coming months, we will continue to explore ways to expand the scope of the data we make available to researchers in line with our commitment to privacy."

The rollout of Facebook's research program comes nearly a year after a report by the Guardian revealed that data analytics company Cambridge Analytica improperly obtained the data of up to 87 million Facebook users through a paid personality quiz. Facebook suspended Cambridge Analytica and SCL Group, its parent company, from the platform in mid-March of 2018, after the former used the data to create "psychological profiles" of U.S. voters for ad targeting.

Following the controversy, Facebook instituted new policies substantially limiting the amount of data app developers can collect.
