3 tips to reduce bias in AI-powered chatbots

AI-powered chatbots that use natural language processing are on the rise across all industries. A practical application is providing dynamic customer support that allows users to ask questions and receive highly relevant responses. In health care, for example, one customer may ask "What's my copay for an annual check-up?" and another may ask "How much does seeing the doctor cost?" A smartly trained chatbot will understand that both questions have the same intent and provide a contextually relevant answer based on available data.

What many people don't realize is that AI-powered chatbots are like children: They learn by example. Just like a child's brain in early development, AI systems are designed to process huge amounts of data in order to form predictions about the world and act accordingly. AI solutions are trained by humans and synthesize patterns from experience. However, there are many patterns inherent in human societies that we don't want to reinforce -- for example, social biases. How do we design machine learning systems that are not only intelligent but also egalitarian?

Social bias is an increasingly important conversation in the AI community, and we still have a lot of work to do. Researchers from the University of Massachusetts recently found that the accuracy of several common NLP tools was dramatically lower for speakers of "non-standard" varieties of English, such as African American Vernacular English (AAVE). Another research group, from MIT and Stanford, reported that three commercial face-recognition programs demonstrated both skin-type and gender biases, with significantly higher error rates for women and for individuals with darker skin. In both of these cases, we see the negative impact of training a system on an unrepresentative data set. AI can learn only from the examples it is exposed to -- if the data is biased, the machine will be as well.
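One way to surface the kind of gap these studies describe is to report accuracy separately for each group of users, rather than as a single overall number. The Python sketch below is only an illustration, not the method used in either study: the function name `accuracy_by_group`, the group labels, and the evaluation records are all hypothetical.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return accuracy per group.

    `records` is an iterable of (group, predicted_label, true_label)
    tuples. The group labels are hypothetical placeholders for whatever
    demographic or dialect categories an evaluation set records.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, predicted, actual in records:
        total[group] += 1
        if predicted == actual:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Hypothetical evaluation results: (group, predicted intent, true intent)
results = [
    ("group_a", "copay", "copay"),
    ("group_a", "copay", "copay"),
    ("group_a", "find_doctor", "find_doctor"),
    ("group_a", "copay", "find_doctor"),
    ("group_b", "copay", "find_doctor"),
    ("group_b", "find_doctor", "find_doctor"),
]

scores = accuracy_by_group(results)
overall = sum(1 for _, p, a in results if p == a) / len(results)

for group, score in scores.items():
    print(f"{group}: {score:.2f}")
print(f"overall: {overall:.2f}")
```

In this made-up data, the overall figure sits between the two group scores and hides the fact that one group is served noticeably worse than the other, which is why a per-group breakdown is worth checking.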

Bots and other AI solutions now assist humans with thousands of tasks across every industry, and bias can limit a consumer's access to critical information and resources. In the field of health care, eradicating bias is critical. We must ensure that all people, including those in minority and underrepresented populations, can take advantage of tools that we've created to save them money, keep them healthy, and help them find care when they need it most.

So, what's the solution? Based on our experience of training with IBM Watson for more than four years, you can minimize bias in AI applications by considering the following suggestions:

Be thoughtful about your data strategy;
Encourage a representational set of users; and
Create a diverse development team.

1. Be thoughtful about your data strategy

When it comes to training, AI architects have choices to make. The decisions are not only technical, but ethical. If our training examples aren't representative of our users, we're going to have low system accuracy when our application makes it to the real world.

It may sound simple to create a training set that includes a diverse set of examples, but it's easy to overlook if you aren't careful. You may need to go out of your way to find or create datasets with examples from a variety of demographics. At some point, we will also want to train our bot on data examples from real usage, rather than relying on scraped or manufactured datasets. But what do we do if even our real users don't represent all the populations we'd like to include?

We can take a laissez-faire approach, allowing natural trends to guide development without editing the data at all. The benefit of this approach is that we can optimize performance for our general population of users. However, that may come at the expense of an underrepresented population that we don't want to ignore. For example, if the majority of users interacting with a chatbot are under the age of 65, the bot will see very few questions about medical services that apply only to an over-65 population, such as osteoporosis screenings and fall prevention counseling. If a bot is trained only on real interactions, with no additional guidance, it may not perform as well on questions about those services, which disadvantages older adults who need that information.

To combat this at my company, we create synthetic training questions or seek another data source for questions about osteoporosis screenings and fall prevention counseling. By deliberately enforcing a more even, representative distribution in our training data, we allow our bot to learn a wider range of topics, without unfair preference for the interests of the majority user demographic.
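As a rough sketch of what that kind of rebalancing might look like in code, the Python example below counts training questions per topic, flags topics that fall below a chosen minimum, and tops them up from a pool of synthetic questions. It is not our production pipeline: the function names, the `min_count` threshold, the intent labels, and the sample questions are all assumptions made for illustration.

```python
from collections import Counter


def find_gaps(examples, intents, min_count):
    """Return {intent: number of examples still needed} for each intent
    with fewer than `min_count` training examples.

    `examples` is a list of (question, intent) pairs. `intents` lists every
    topic the bot should cover, so topics with zero examples are caught too.
    """
    counts = Counter(intent for _, intent in examples)
    return {
        intent: min_count - counts[intent]
        for intent in intents
        if counts[intent] < min_count
    }


def augment(examples, intents, synthetic_pool, min_count):
    """Return a new training list, topped up with synthetic questions
    for underrepresented intents (as far as the pool allows)."""
    augmented = list(examples)
    for intent, needed in find_gaps(examples, intents, min_count).items():
        for question in synthetic_pool.get(intent, [])[:needed]:
            augmented.append((question, intent))
    return augmented


# Hypothetical questions collected from real usage
real_examples = [
    ("What's my copay for an annual check-up?", "copay"),
    ("How much does seeing the doctor cost?", "copay"),
    ("What do I pay for a specialist visit?", "copay"),
    ("Is there a copay for urgent care?", "copay"),
    ("Are bone density scans covered?", "osteoporosis_screening"),
]
intents = ["copay", "osteoporosis_screening", "fall_prevention"]

# Hypothetical synthetic questions written by the team
synthetic_pool = {
    "osteoporosis_screening": [
        "Do I need an osteoporosis screening?",
        "How often can I get a bone density test?",
    ],
    "fall_prevention": [
        "Is fall prevention counseling covered?",
        "Where can I get help with preventing falls?",
    ],
}

gaps_before = find_gaps(real_examples, intents, min_count=3)
training_set = augment(real_examples, intents, synthetic_pool, min_count=3)
gaps_after = find_gaps(training_set, intents, min_count=3)

print("Needed before:", gaps_before)
print("Still needed after:", gaps_after)
```

Any gap that remains after augmentation shows where the synthetic pool ran out, which is a signal to write more questions or look for another data source for that topic.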

2. Encourage a representational set of users

We don't have complete control over who interacts with a chatbot or any AI system, but we can make it equally available to all populations and remove any barriers that keep some groups from using it.

The earlier example imagined that the majority of chatbot users were under the age of 65. Perhaps we could also add design options (such as larger text) that would make it easier for older adults to use the tool. By designing our AI solutions with inclusivity in mind, we are likely to find many other ways to tweak content, user experience, marketing, and basic capabilities to reach the full range of people the chatbot will serve.

Even with a diverse set of real users, however, companies may run the risk of introducing a bot to the unconscious biases of the training and development team. To minimize this risk, it's critical to allow a wide range of perspectives into the design process. This leads to my final suggestion ...

3. Create a diverse development team

If a well-rounded and diverse team informs your decision-making, you are less likely to introduce new biases into the system. But diversifying development teams still proves to be a challenge, especially in the AI field. How do we break down the barriers of entry into the field?

In my experience, the greatest barriers I've faced were of my own making. I didn't grow up coding, and until I was in the middle of it, I had convinced myself that AI was for other people -- smarter people, more talented people. Some of my team members have traditional AI backgrounds in data science, programming, and language technologies, but some of the most critical members of our team come from non-technical backgrounds. With a willingness to learn from each other, we routinely find that each of our perspectives contributes something unique to our bot. You don't have to be a machine learning expert to be valuable to an AI team. And if you want to become a machine learning expert, then you can become one.

It's important to create an environment that encourages growth and empowers people from all walks of life to participate in development. In the competitive business world, it sometimes goes against all of our natural instincts, but the truth is that sharing knowledge is the quickest path to success, both for the product and for the people creating it. AI is a field that can and should be accessible to everyone.

IBM design researcher Ellen Kolstø noted in an article on Epic People that "while [AI] is certainly about machines, the building of AI is very much about humans." When we are designing an artificially intelligent system, we are often making very human choices. If our bot learns by example, then we are responsible for setting a good example. By creating representational training sets, diversifying our development teams, and making our bot available to people of all backgrounds and demographics, we can reduce built-in biases and foster a new wave of egalitarian AI.

Allison Langley is an applied AI scientist at Welltok, a company that develops SaaS-based consumer health solutions for the health care industry.
