Word embedding — a language modeling technique that maps words and phrases onto vectors of real numbers — is a foundational part of natural language processing. It’s how machine learning models “learn” the significance of contextual similarity and word proximity, and how they ultimately extract meaning from text. There’s only one problem: datasets tend to exhibit gender stereotypes and other biases. And predictably, models trained on those datasets pick up, and can even amplify, those biases.

In an attempt to solve the problem, researchers from the University of California developed a novel training solution that “preserve[s] gender information” in word vectors while “compelling other dimensions to be free of gender influence.” They describe their model in a paper (“Learning Gender-Neutral Word Embeddings”) published this week on the preprint server arXiv.org.

“[P]rior studies show that … [machine learning] models learned from human-generated corpora are often prone to exhibit social biases, such as gender stereotypes,” the team wrote. “For example, the word ‘programmer’ is neutral to gender by its definition, but an embedding model trained on a news corpus associates ‘programmer’ closer with ‘male’ than ‘female.’ Such a bias substantially affects downstream applications.”
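The “programmer” example the researchers give can be made concrete with cosine similarity, the standard measure of association between word vectors. The toy 4-dimensional vectors below are invented purely for illustration (real GloVe embeddings have hundreds of dimensions); they are constructed so that “programmer” leans toward “male,” mimicking the bias the paper describes.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two word vectors (1 = identical direction)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical embeddings, invented for illustration only.
vectors = {
    "male":       np.array([0.9, 0.1, 0.0, 0.3]),
    "female":     np.array([-0.9, 0.1, 0.0, 0.3]),
    "programmer": np.array([0.5, 0.7, 0.2, 0.1]),
}

sim_male = cosine(vectors["programmer"], vectors["male"])
sim_female = cosine(vectors["programmer"], vectors["female"])
# In these toy vectors, as in the biased models the paper describes,
# "programmer" sits closer to "male" than to "female".
print(sim_male > sim_female)
```

A downstream system ranking candidates or completing analogies would inherit exactly this skew, which is why the bias “substantially affects downstream applications.”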

Their learning scheme, which they call Gender-Neutral Global Vectors (GN-GloVe), identifies gender-neutral words while concurrently learning word vectors. The team claims it’s superior to prior approaches because it can be applied in any language, doesn’t remove gender information from words that genuinely carry it, and avoids misclassifying words in a way that would degrade the model’s performance.
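The core idea can be sketched in a few lines. In the paper, gender information is confined to a designated sub-vector while the remaining dimensions are trained to be gender-free; the simplified sketch below assumes the last dimension is the reserved gender component, so a downstream task that should be gender-blind can simply drop it. All names and vectors here are illustrative, not the authors’ actual data.

```python
import numpy as np

# Simplifying assumption for illustration: the final dimension of each
# vector is the sub-vector reserved for gender information.
embedding = {
    "doctor": np.array([0.4, 0.6, 0.1, 0.02]),   # near-zero gender component
    "he":     np.array([0.2, 0.1, 0.3, 0.95]),   # strong gender component
    "she":    np.array([0.2, 0.1, 0.3, -0.95]),
}

def drop_gender(vec):
    """Strip the reserved gender dimension for gender-blind downstream use."""
    return vec[:-1]

# With gender confined to one dimension, "he" and "she" become
# indistinguishable once that dimension is removed.
neutral_he = drop_gender(embedding["he"])
neutral_she = drop_gender(embedding["she"])
print(np.allclose(neutral_he, neutral_she))
```

This is what distinguishes the approach from projection-based debiasing: the gender signal is preserved where it belongs rather than erased everywhere.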

Compared to GloVe and Hard-GloVe, two commonly used models, GN-GloVe proved far better at distinguishing gender-stereotyped words in a newly annotated dataset. Where GloVe stereotyped words like “doctor” and “nurse,” GN-GloVe didn’t. Moreover, it exhibited less prejudice overall — in the researchers’ testing, GloVe tended to associate occupations with a specific gender, a bias GN-GloVe reduced by 35 percent.

In the future, the team plans to extend the approach to model other properties of words, such as sentiment.

A bias problem

Broadly speaking, biased datasets plague all fields of AI research.

In a pair of studies commissioned by the Washington Post in July, smart speakers made by Amazon and Google were 30 percent less likely to understand non-American accents than the accents of native-born speakers. And corpora like Switchboard, a dataset used by companies such as IBM and Microsoft to gauge the error rates of voice models, have been shown to skew measurably toward users from particular regions of the United States.

It’s not limited to language. A study published in 2012 showed that facial recognition algorithms from vendor Cognitec performed 5 to 10 percent worse on African Americans than on Caucasians. More recently, it was revealed that a system deployed by London’s Metropolitan Police produces as many as 49 false matches for every hit.

As of yet, there’s no silver bullet to eliminate bias. But papers such as these — in addition to automated bias detection tools from Microsoft, IBM, Accenture, Facebook, and others — are encouraging signs of progress.