Presented by Two Hat Security 

The current COVID-19 pandemic has reshaped our offline and online worlds in record time.

More than ever before, people are turning to online channels to conduct business, communicate with friends and family, keep abreast of the latest news, play games, order food and much more.

They are forming online communities to find entertainment, solace, and connection.

With the increase in online chat, I set out to learn how AI can help ensure user-generated content is moderated so that all users are safe from abusive and negative behavior.

To answer that question, I sat down with Liza Wood, Director of Data and Production at Two Hat, and Carlos Figueiredo, Director of Community Trust & Safety at Two Hat.

Justin: Thanks for your time today, guys. Let’s kick it off by covering the topic of diversity. 

How can AI be an ally in increasing diversity and inclusion in online spaces? How does that help companies create a more positive and welcoming user experience?

Carlos: Companies need to start by designing products and experiences with diversity and inclusion in mind, leveraging features that by their very nature encourage prosocial behavior, fostering healthy behavior norms, and crafting community guidelines and moderation strategies that take into consideration all voices in their community, including underrepresented ones. Provided that groundwork is in place, artificial intelligence can be an incredible ally when it comes to scaling online operations.

It can proactively block the worst of the worst types of content (for example, blatant harassment, sexism, misogyny, and hateful/dangerous speech) so it never reaches community members in the first place.

Furthermore, it can reduce overhead and spare content moderators from exposure to this type of content.

Oftentimes companies forget this: a healthy, welcoming, and positive community starts within your own company.

Liza: Once those community guidelines are crafted, they need to be consistently applied so a community understands what behavior is acceptable and not acceptable.

Imagine being a content moderator who has to deal with the good, the bad, and the ugly that happens online every day. Content moderators work hard to be fair and consistent, but it’s impossible to do that all the time, especially in the face of extreme behavior.

A well-trained, balanced AI will be consistent in what it was trained to do.

Justin: I’d like to explore that a bit more. It’s been said that AI can’t fix content moderation and that AI systems can’t necessarily make decisions the same way someone like you or me can. What are the current limitations of AI-only systems?

Liza: Training a fair, balanced AI requires tens of thousands to millions of lines of data, collected from a wide diversity of sources. Even within a single language, there are differences in the words we use and the impact they have, depending on the country or region and the demographics of the specific online community.

Also, language is rapidly evolving. Even a continuously learning AI would need to see thousands of examples of a new word or phrase — and know whether it is acceptable or unacceptable — before it can respond to it.

A responsive moderation system needs a way to respond to a trend — either positive or negative — as soon as it’s detected. In the positive case, it is to ensure the conversation continues to flow. In the negative case, it is to address bad actors before they affect the entire community.

Justin: That’s a great segue into my next question about the role that AI and humans can play in content moderation, and the relationship that exists between the two.

What aspects of content moderation can be auto-moderated? Which elements need humans in the processes?

Carlos: In our data, we see that approximately 85% of online chat, usernames, and comments (around 85 billion messages per month!) are what we classify as low risk, essentially the normative behavior we see online: greetings, people sharing positive experiences, conversations about hobbies, invites to game together or to be friends, etc.

For the most part, this content is automatically approved. A small portion of that has mild forms of the topics we identify, like cyberbullying.

For example, the phrase “you are ugly” might be the type of chat that platforms for younger audiences will auto-moderate.

On the flip side, approximately 5% of the content we process every month (5 billion messages, give or take) is classified as high risk in our system. Typically, those are very clear examples of harassment, racial slurs, explicit sexual conversations, and others. Most of our clients automate that work, which removes the need for human moderators to spend their time reviewing harmful content and protects their wellbeing.

Even though I’m talking about technology here, our incredible team of language and culture experts offers invaluable cultural insights that make our automation smarter and highly accurate, which brings me to my final point:

“What about the remaining percentage, which can vary between 10% and 15% or more depending on factors such as audience, platform type, and comfort levels?”

This is where AI can’t replace human expertise, empathy, and nuanced knowledge. The human element of moderation decisions is critical.

A classic example of this is the picture of Phan Thị Kim Phúc, a South Vietnamese-born Canadian woman who’s known as the nine-year-old child depicted in the photograph taken at Trảng Bàng during the Vietnam War on June 8, 1972.

It’s a well-known photo by photographer Nick Ut, with critical historic relevance. It shows Phan at nine years of age as she runs naked on a road after a napalm attack. If the decision is left completely to a machine learning model, the model is likely to block the photo after classifying it as child nudity.

While that classification is true in a strict sense, the model will miss a very important layer of understanding that only a trained moderator can offer.
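
The three-way split Carlos describes (auto-approve the low-risk bulk, auto-block the clear high-risk cases, and send the rest to people) can be illustrated with a minimal sketch. This is not Two Hat’s implementation: the 0.0 to 1.0 risk score, the threshold values, and the names `route`, `Thresholds`, and `monthly_volumes` are all hypothetical, and the 100 billion monthly total is only inferred from the interview’s “85% is around 85 billion” figure.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    APPROVE = "approve"            # low risk: published automatically
    BLOCK = "block"                # high risk: filtered automatically
    HUMAN_REVIEW = "human_review"  # everything in between: queued for a moderator


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical cut-offs on a 0.0-1.0 risk score.
    # A real community would tune these to its audience and comfort levels.
    low_risk_max: float = 0.2
    high_risk_min: float = 0.9


def route(risk_score: float, thresholds: Thresholds = Thresholds()) -> Action:
    """Map a (hypothetical) classifier risk score to a moderation action."""
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be between 0.0 and 1.0")
    if risk_score <= thresholds.low_risk_max:
        return Action.APPROVE
    if risk_score >= thresholds.high_risk_min:
        return Action.BLOCK
    return Action.HUMAN_REVIEW


def monthly_volumes(total_messages: int, low_share: float, high_share: float) -> dict:
    """Split a monthly message count into the three buckets by share."""
    low = round(total_messages * low_share)
    high = round(total_messages * high_share)
    return {
        "approve": low,
        "block": high,
        "human_review": total_messages - low - high,
    }


# Shares quoted in the interview; the 100 billion total is an inference.
volumes = monthly_volumes(100_000_000_000, low_share=0.85, high_share=0.05)
```

With those shares, the human-review bucket works out to roughly 10 billion messages a month, which is why the thresholds matter: moving either cut-off shifts work between automation and moderators.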

Justin: You just touched on a really interesting point: That AI, while it can be trained, cannot understand the context of what’s being shared.

What are some issues created by unchecked biases being introduced to machine learning models? How can they be minimized?

Liza: That’s a great question. AI will consistently do whatever it was trained to do. So, if the AI is trained on data that has a blind spot, it will consistently have a problem with that blind spot. That is why it is so important to train AI on data that is pulled from a wide diversity of sources.

Before training a model, it is important to check and challenge those sources and any assumptions. Having a diverse team, collaborating with academia, and working with international clients and agencies help to ensure there is a diverse point of view when developing, testing, and deploying AI.

Carlos: From a human perspective, which ultimately informs the technology being built, we all have many blind spots and biases. Let’s take one type of bias that certainly creates gaps in the way companies set up content moderation practices: the availability heuristic, also called availability bias. This is a mental shortcut that relies on immediate examples that come to our minds when evaluating a specific topic or decision.

Let’s imagine a moderator who needs to determine if a certain piece of content is a breach of their community guidelines, specifically related to the part about hateful/dangerous speech.

If the moderator or team doesn’t take a balanced, diverse, and inclusive view of how they make decisions and even craft their community guidelines, they might think only in terms of their own experiences within the product, based on their backgrounds, gender, and experiences, inadvertently leaving out the perspective of underrepresented voices and arriving at imbalanced decisions.

Justin: I think we can agree then, that AI is a solution, but it’s not the solution. With that in mind, what is Two Hat working on to help bridge the gap between automation and understanding context?

Carlos: Dialogue is one way to create some equity. This is one of the reasons why we collaborate with academia, partner with our clients and international groups, and work across different fields. This work ensures we’re always evolving our practices to incorporate multiple voices and perspectives.

For example, we have been working on a taxonomy project that’s going deep into the key topics we classify, making sure that our work is based on the latest studies and knowledge surrounding cyberbullying and other key areas of prevention.

This work is critical to inform better classification and detection, especially as it relates to having a common/shared language so we can better collaborate with our clients and partners.

This helps us ensure we are all using the same concepts, understanding the behaviors we want to identify, and measuring the outcomes that matter the most to online communities. Typically, that involves encouraging positive/normative behavior while preventing disruptive behaviors from taking away from the intended online experiences.

Justin: Sounds like challenging but necessary work if we want to help AI get it right. Liza, Carlos, thank you for your insights!

Liza & Carlos: Our pleasure!

At Two Hat, we believe that intentional, thoughtful technology paired with human expertise can change the world for the better.

To know more about how we are protecting and enabling digital spaces to reach their full potential, you can reach us at or request a demo of Two Hat’s Community Sift.

Justin Kozuch is Digital Content Marketing Specialist at Two Hat Security.

Sponsored articles are content produced by a company that is either paying for the post or has a business relationship with VentureBeat, and they’re always clearly marked. Content produced by our editorial team is never influenced by advertisers or sponsors in any way. For more information, contact