In a paper published on the preprint server arXiv.org, researchers affiliated with Microsoft and Arizona State University propose an approach to detecting fake news that leverages a technique called weak social supervision. They say that by enabling the training of fake news-detecting AI even in scenarios where labeled examples aren’t available, weak social supervision opens the door to exploring how aspects of user interactions indicate that news might be misleading.

According to the Pew Research Center, approximately 68% of U.S. adults got their news from social media in 2018 — which is worrisome, considering misinformation about the pandemic, for instance, continues to go viral. Companies from Facebook and Twitter to Google are pursuing automated detection solutions, but fake news remains a moving target, owing to its topical and stylistic diversity.

Building on a study published in April, the coauthors of this latest work suggest that weak supervision — where noisy or imprecise sources provide data labeling signals — could improve fake news detection accuracy without requiring fine-tuning. To this end, they built a framework dubbed Tri-relationship for Fake News (TiFN) that models social media users and their connections as an “interaction network” to detect fake news.

Interaction networks describe the relationships between entities like publishers, news stories, and users. Given an interaction network, TiFN aims to embed different types of entities, based on the observation that people tend to interact with like-minded friends. In making its predictions, the framework also accounts for the fact that connected users are more likely to share similar interests in news pieces, that publishers with a high degree of political bias are more likely to publish fake news, and that users with low credibility are more likely to spread fake news.
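
To make the idea of weak social supervision concrete, the sketch below turns two of the signals described above (publisher bias and user credibility) into noisy labels and combines them by majority vote. It is an illustration only, not the researchers' method: the framework itself learns embeddings from the interaction network, whereas this example uses simple thresholds. The `NewsItem` class, the score ranges, the threshold values, and all function names are assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import List

# Label values used by the hypothetical weak labelers below.
FAKE, REAL, ABSTAIN = 1, 0, -1


@dataclass
class NewsItem:
    # Assumed score in [0, 1]; 1.0 means a highly partisan publisher.
    publisher_bias: float
    # Assumed credibility scores in [0, 1] for the users who shared the item.
    sharer_credibility: List[float] = field(default_factory=list)


def label_from_publisher_bias(item: NewsItem, threshold: float = 0.7) -> int:
    """Weak label: strongly biased publishers are assumed likelier to publish fake news."""
    if item.publisher_bias >= threshold:
        return FAKE
    if item.publisher_bias <= 1 - threshold:
        return REAL
    return ABSTAIN


def label_from_user_credibility(item: NewsItem, threshold: float = 0.4) -> int:
    """Weak label: items shared mostly by low-credibility users are assumed likelier to be fake."""
    if not item.sharer_credibility:
        return ABSTAIN
    mean = sum(item.sharer_credibility) / len(item.sharer_credibility)
    if mean <= threshold:
        return FAKE
    if mean >= 1 - threshold:
        return REAL
    return ABSTAIN


def combine_weak_labels(votes: List[int]) -> int:
    """Majority vote over non-abstaining votes; abstain on a tie or when nobody votes."""
    fake, real = votes.count(FAKE), votes.count(REAL)
    if fake > real:
        return FAKE
    if real > fake:
        return REAL
    return ABSTAIN


def weak_label(item: NewsItem) -> int:
    """Apply both weak labelers to one item and combine their votes."""
    return combine_weak_labels(
        [label_from_publisher_bias(item), label_from_user_credibility(item)]
    )
```

Labels produced this way are noisy by design. In a weak supervision setup they would typically stand in for, or supplement, hand-labeled examples when training a detector, and items where the signals disagree are left unlabeled rather than guessed.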

To test whether TiFN’s weak social supervision could help detect fake news effectively, the team validated it against a Politifact data set containing 120 true news pieces and 120 verifiably fake pieces shared among 23,865 users. The researchers report that, compared with baseline detectors that consider only news content and some social interactions, TiFN achieved between 75% and 87% accuracy, even with a limited amount of weak social supervision (within 12 hours of the news being published).

In another experiment, involving a separate custom framework called Defend, the researchers sought to use news sentences and user comments explaining why a piece of news is fake as a weak supervision signal. Tested on a second Politifact data set consisting of 145 true and 270 fake news pieces with 89,999 comments from 68,523 users on Twitter, Defend achieved 90% accuracy, they say.

“[W]ith the help of weak social supervision from publisher-bias and user-credibility, the detection performance is better than those without utilizing weak social supervision. We [also] observe that when we eliminate [the] news content component, user comment component, or the co-attention for news contents and user comments, the performances are reduced. [This] indicates capturing the semantic relations between the weak social supervision from user comments and news contents is important,” wrote the researchers. “[W]e can see [that] within a certain range more weak social supervision leads to a larger performance increase, which shows the benefit of using weak social supervision.”