Facebook claims it proactively detected 95% of hate speech removed in Q2 2020

About 22.5 million pieces of content published to Facebook were removed for violating the company's hate speech policies in Q2 2020. The metric comes from Facebook's latest Community Standards Enforcement Report covering April 2020 through June 2020, which suggests the company's AI detected 95% of hate speech taken down in Q2. That's up from 88.8% in the previous quarter, 80.2% in Q4 2019, and 0% as recently as four years ago.

Facebook attributes the uptick to an expansion of its AI technologies in languages such as Spanish, Arabic, and Indonesian during Q1, complementing improvements to English-language detection. In Q2, further enhanced automation capabilities enabled swifter takedowns of posts in English, Spanish, and Burmese, according to the company.

On Instagram, Facebook says the proactive detection rate of its automated hate speech systems rose from 45% to 84% as the amount of content it took action on increased from 808,900 pieces in Q1 2020 to 3.3 million in Q2. Those increases were driven by expanding the detection technologies in English and Spanish, the company claims.

It's worth noting this latest report comes with a number of caveats. While many of the content moderators Facebook sent home in March to mitigate the spread of COVID-19 have since been brought back online, the company says the metrics "show the impact" of the pandemic on the moderation team: the number of appeals was lower in Q2 because Facebook couldn't always offer them. In instances where Facebook believed there was a moderation mistake, it let users opt for a manual review, after which moderators restored content where appropriate.

Facebook also says that because it prioritized removing harmful content in Q2, it was unable to determine the prevalence of things like violent and graphic content, adult nudity, and sexual activity on its platform. Facebook anticipates it will be able to share metrics around those areas in the next quarter.

Alongside today's report, Facebook says it's working internally to assess how the metrics it publishes can be audited "most effectively." In addition, this week, the company is issuing a request for proposal to external auditors to conduct an independent audit of its Community Standards Enforcement Report metrics. It plans to begin the audit in 2021 and to publish the results sometime that year.

Facebook's efforts to offload content moderation to AI and machine learning algorithms have been historically uneven. In May, Facebook's automated system threatened to ban the organizers of a group hand-sewing masks from commenting or posting on the platform, informing them that the group could be deleted altogether. It also marked legitimate news articles about the pandemic as spam.

There's also evidence that objectionable content regularly slips through Facebook's filters. In January, Seattle University associate professor Caitlin Carlson published results from an experiment in which she and a colleague collected more than 300 posts that appeared to violate Facebook's hate speech rules and reported them via the service's tools. Only about half of the posts were ultimately removed.

More damningly, a recent NBC report uncovered thousands of groups and pages, with millions of members and followers, that support the QAnon conspiracy theory. A separate NBC investigation revealed that on Instagram in the U.S. last year, Black users were about 50% more likely to have their accounts disabled by automated moderation systems than those whose activity indicated they were white.

NBC alleges signs of algorithmic bias were ignored at the company. Internal researchers were told not to share their findings with coworkers or conduct further investigatory work. Instagram ended up implementing a slightly different moderation algorithm but declined to let the researchers test an alternative.

Civil rights groups including the Anti-Defamation League, the National Association for the Advancement of Colored People, and Color of Change claim that Facebook fails to enforce its hate speech policies, and they organized an advertising boycott in which over 1,000 companies reduced spending on social media advertising. A July civil rights audit of Facebook's practices found the company failed to enforce its voter suppression policies against President Donald Trump, and while CEO Mark Zuckerberg has defended the company's hands-off approach, Facebook's own employees have pushed back by staging a series of virtual walkouts.

During a briefing with members of the media today, Guy Rosen, Facebook's VP of integrity, said Facebook is now relying on AI to create a ranking system that prioritizes critical content for moderation teams to review. The AI evaluates how severe the threat in a piece of content might be -- for example, a video with someone expressing suicidal intention -- and flags it for expedited review. "The AI ranks the content regardless of whether it was reported by users or detected proactively," said Rosen. "This enables our teams to spend their time on cases where we need their expertise."
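Facebook has not published how this ranking works. As a rough illustration only, the idea Rosen describes can be sketched as a priority queue in which a severity score decides review order and the source of a flag does not. Everything in the sketch below is an assumption made for illustration: the `ReviewQueue` class, the 0-to-1 severity scale, and the `"user_report"` and `"proactive"` labels are hypothetical and are not drawn from Facebook's systems.

```python
import heapq
from itertools import count


class ReviewQueue:
    """Hypothetical review queue: highest severity first, first-in first-out among ties."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # tie-breaker that preserves arrival order

    def add(self, content_id, severity, source):
        # `source` is stored for the reviewer but deliberately not used for
        # ranking, mirroring the "regardless of whether it was reported by
        # users or detected proactively" description.
        # heapq is a min-heap, so severity is negated to pop the largest first.
        heapq.heappush(
            self._heap, (-severity, next(self._arrival), content_id, source)
        )

    def next_for_review(self):
        neg_severity, _, content_id, source = heapq.heappop(self._heap)
        return content_id, -neg_severity, source


# Assumed severity scores; a real system would get these from a model.
queue = ReviewQueue()
queue.add("post-1", 0.20, "user_report")
queue.add("video-7", 0.95, "proactive")
queue.add("post-3", 0.60, "user_report")

review_order = [queue.next_for_review()[0] for _ in range(3)]
print(review_order)
```

In this sketch the highest-severity item reaches a reviewer first even though it arrived after a user-reported post, which is the behavior the quote describes.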

Facebook also said it's tweaking its community standards to ban "implicit hate speech" on its platforms, including blackface and anti-Semitic stereotypes, and will soon take down content in violation of this new policy. After consulting with 60 outside organizations and experts, the company says it will no longer allow depictions of "caricatures of black people in the form of blackface" or "Jewish people running the world or controlling major institutions such as media networks, the economy, or the government."

The ban on Jewish stereotypes goes into effect today, Monika Bickert, Facebook VP of content policy, said during the briefing. Enforcement of the ban on blackface will start later this month.

In a separate effort to bolster its moderation efforts, Facebook recently made available an image corpus of 10,000 "hateful memes" scraped from public Facebook groups in the U.S. It's a part of the Hateful Memes challenge, which will offer $100,000 in prizes for teams developing AI systems to identify photos targeting race, ethnicity, gender, and negative stereotypes as mean-spirited, with a final competition scheduled to take place at the NeurIPS 2020 AI and machine learning conference in December.

Facebook also announced today it will begin limiting the reach of U.S.-based publishers with overt and often misleading connections to political groups. The new policy defines political outlets as ones owned by a political person or entity, led by a political person, or as an organization that shares proprietary information gathered from its Facebook account with a political person or entity. While they'll still be allowed to register as news organizations and advertise on Facebook, they won't be included in Facebook's News tab.