Bots are not a cure-all for hate speech on social media

Facebook's recent struggles with data and user privacy come at a terrible time for the company. While the privacy of its users should be of utmost importance, it has also had to work on new technology that addresses concerns around brand and user safety when it comes to hateful content.

In fact, the E.U. has already considered measures and legislation to crack down on hate speech on online platforms and communities such as Facebook, YouTube, and Twitter. This raises an important question: How do you police a platform with over two billion users who generate 510,000 comments every 60 seconds?

During his congressional testimony, Mark Zuckerberg offered this simple answer to the question of how Facebook would regulate hateful content: bots. His comment pointed to a challenge AI and machine learning engineers have been trying to tackle -- using AI to crack down on hate speech. But as we look toward the limits of technology, how realistic is that?

A primer on neural networks

Designing a neural net for filtering hate speech is not hard in principle. As an engineer, I would go about this by first showing the AI labeled examples of hate speech and non-hate speech during training, then letting it get to work. This attempt to teach the AI would only work in the domain I trained it in, so as new political and social issues came up, the hate-speech filter would start to fall short.
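To make the training step concrete, here is a minimal sketch of that idea. It uses a tiny word-count (naive Bayes) classifier rather than a neural net, purely to keep the example short, and the training sentences, the placeholder token `group_a`, and the function names `train` and `hate_probability` are all assumptions made up for illustration. The point it shows is the domain limit: on a message built from words the model never saw in training, it has no opinion at all.

```python
import math
from collections import Counter

LABELS = ("hate", "ok")

def train(examples):
    """Count words per label from (text, label) pairs."""
    counts = {label: Counter() for label in LABELS}
    docs = Counter()
    for text, label in examples:
        counts[label].update(text.lower().split())
        docs[label] += 1
    vocab = set(counts["hate"]) | set(counts["ok"])
    return counts, docs, vocab

def hate_probability(model, text):
    """Naive Bayes estimate of P(hate | text) with add-one smoothing."""
    counts, docs, vocab = model
    log_score = {}
    for label in LABELS:
        total = sum(counts[label].values())
        score = math.log(docs[label] / sum(docs.values()))
        for word in text.lower().split():
            if word not in vocab:
                continue  # words never seen in training carry no signal
            score += math.log((counts[label][word] + 1) / (total + len(vocab)))
        log_score[label] = score
    top = max(log_score.values())
    weights = {label: math.exp(s - top) for label, s in log_score.items()}
    return weights["hate"] / sum(weights.values())

# Hypothetical, hand-labeled training set (placeholder wording only).
model = train([
    ("group_a people are vermin", "hate"),
    ("i despise group_a people", "hate"),
    ("group_a people make great food", "ok"),
    ("i love my neighbours", "ok"),
])

in_domain = hate_probability(model, "group_a people are vermin")
new_topic = hate_probability(model, "those zorps should vanish")
print(f"in-domain: {in_domain:.2f}, new vocabulary: {new_topic:.2f}")
```

The in-domain message scores well above one half, while the message built from unfamiliar vocabulary falls back to the prior. A real system would be far larger, but it would still need fresh labeled examples every time the language of hate moved to a new topic.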

The other option would be to let the bot keep learning from user feedback instead of a fixed, hand-labeled training set, an approach loosely described as unsupervised learning. With this approach, the bot would learn to find and censor hate speech based on users' reactions online. An important caveat is that this avenue would be vulnerable to coordinated attacks. For example, if you and your friends started tagging articles and comments about healthy eating as "hate speech," then a feedback-trained AI would start learning to filter out any mention of vegetables as "hateful."
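The vegetable scenario is easy to reproduce in miniature. The sketch below is a toy filter that learns only from user reports; the class name `FeedbackFilter`, the threshold values, and the sample posts are assumptions chosen for illustration, not a description of how any real platform works.

```python
from collections import Counter

class FeedbackFilter:
    """Toy filter that blocks words most users have flagged as hateful."""

    def __init__(self, threshold=0.5, min_reports=5):
        self.threshold = threshold      # flagged share needed to block a word
        self.min_reports = min_reports  # ignore words seen too rarely
        self.seen = Counter()
        self.flagged = Counter()

    def observe(self, text, flagged_as_hate):
        """Update word statistics from one post and its user feedback."""
        for word in set(text.lower().split()):
            self.seen[word] += 1
            if flagged_as_hate:
                self.flagged[word] += 1

    def is_blocked(self, text):
        """Block the post if any word in it is mostly flagged."""
        for word in set(text.lower().split()):
            if (self.seen[word] >= self.min_reports
                    and self.flagged[word] / self.seen[word] > self.threshold):
                return True
        return False

bot = FeedbackFilter()

# Ordinary traffic: nobody flags posts about vegetables.
for post in ("vegetables are good for you",
             "roasted vegetables for dinner",
             "fresh vegetables at the market"):
    bot.observe(post, flagged_as_hate=False)
before = bot.is_blocked("eat more vegetables")

# A coordinated group flags the same harmless post ten times.
for _ in range(10):
    bot.observe("eat your vegetables", flagged_as_hate=True)
after = bot.is_blocked("eat more vegetables")

print(before, after)
```

Ten coordinated reports are enough to outvote three honest posts here. Raising the thresholds only raises the price of the attack; it does not remove the weakness.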

We watched this scenario play out with Microsoft's experimental bot, Tay, which started spouting racist and offensive dialogue as a result of many users feeding her offensive source material.

Virtual whack-a-mole

A confounding factor in designing hate-speech-filtering bots is humans themselves. As a platform starts to filter certain types of messages, people find ways to get around the filters, and new avenues for hate speech appear. This is similar to how Google PageRank changed the way we write content online and bred its own field called search engine optimization, or SEO. Broad and general filters across social media will cause creators of hate speech to change their messaging to sidestep these filters.
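A small sketch shows how quickly this game of whack-a-mole starts. It assumes a deliberately simple keyword blocklist; the placeholder term `badword`, the substitution table, and both function names are invented for the example.

```python
BLOCKLIST = {"badword"}  # placeholder standing in for a banned term

# Character swaps the patched filter knows how to undo (illustrative only).
SUBSTITUTIONS = str.maketrans({"4": "a", "@": "a", "0": "o",
                               "3": "e", "1": "i", "$": "s"})

def naive_filter(text):
    """Round one: block a post only on an exact blocklist match."""
    return any(token in BLOCKLIST for token in text.lower().split())

def normalized_filter(text):
    """Round two: undo known character swaps before matching."""
    cleaned = text.lower().translate(SUBSTITUTIONS)
    return any(token in BLOCKLIST for token in cleaned.split())

original = "you are a badword"
swapped = "you are a b4dw0rd"
dotted = "you are a b.a.d.w.o.r.d"

print(naive_filter(original))     # the original message is caught
print(naive_filter(swapped))      # a trivial respelling slips through
print(normalized_filter(swapped)) # the patch catches that respelling
print(normalized_filter(dotted))  # and the next variant slips through
```

Each patch closes one gap and invites the next workaround, which is why a filter of this kind is never finished.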

By now, you may be asking, "Can't AI overcome that?" Maybe, someday. This is a possibility we're exploring by using adversarial neural networks -- exposing bots to an AI-powered "adversary." It's a promising area of research, but it still has a long way to go, especially for language and sentiment detection. It's quite interesting to see how we can fool neural nets because they fail in ways humans just don't understand.

In many ways, the metrics Facebook uses to determine if something is hateful or not will come to define the behaviors of people using the platform. Peddlers of hate will appropriate new words and language to convey their messages and bypass the filters. We need to remain true to a set definition of hate speech as we work through implementing bots to curb it on social platforms. There's a very human element to this that we cannot ignore.

When Zuckerberg says we'll have much better bots to deal with this problem in 5-10 years, he isn't wrong. Technology is changing and evolving very quickly and researchers will make advances during that timeline. The problem is, we don't know how good our technology will be at doing these things in five years. We especially don't know how the problem of hate speech will change as we apply these filters to social media.

Bots and AI will provide a huge leg up on detecting and filtering hate speech online in the future, but they're not a silver-bullet solution. Even without AI, humans haven't done a great job at defining what hate speech is and what it isn't. Assuming that AI will answer these philosophical questions for us is naive. Bots won't be able to clearly define the boundaries between hate speech and free speech until we humans can as well.

Eric Moller is the CTO of Atomic X, a Toronto-based firm that offers AI consulting and custom development.
