Google debuts AI in Google Translate that addresses gender bias

Google today announced the release of English-to-Spanish and Finnish-, Hungarian-, and Persian-to-English gender-specific translations in Google Translate that leverage a new paradigm to address gender bias by rewriting or post-editing initial translations. The tech giant claims the approach is more scalable than an earlier technique underpinning Google Translate's gender-specific Turkish-to-English translations, chiefly because it doesn't rely on a data-intensive gender-neutrality detector.

"We've made significant progress since our initial launch by increasing the quality of gender-specific translations and also expanding it to 4 more language-pairs," Google Research senior software engineer Melvin Johnson wrote. "We are committed to further addressing gender bias in Google Translate and plan to extend this work to document-level translation, as well."

As Johnson explains, the earlier approach used for Turkish-to-English gender-specific translations relied on a classifier that was laborious to adapt to new languages, and it produced masculine and feminine translations independently using a neural machine translation (NMT) system. As a result, it couldn't show gender-specific translations for up to 40% of eligible queries, because the two independently generated translations often weren't exactly equivalent except for gender-related phenomena.

By contrast, the new rewriting-based method first generates a translation and then reviews it to identify instances where a gender-neutral source phrase yielded a gender-specific translation. In those cases, a sentence-level rewriter produces an alternative gendered translation, and both the initial and rewritten translations are reviewed to ensure gender is the only difference between them.
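
The flow described above can be sketched roughly as follows. This is a minimal illustration, not Google's implementation: the `translate`, `source_is_neutral`, and `rewrite` callables are hypothetical stand-ins for components the article does not detail, and the `GENDERED` word list is an assumption made for the example.

```python
import re

# Assumed, deliberately tiny list of English gendered words (illustration only).
GENDERED = {"he", "she", "him", "her", "his", "hers"}


def _tokens(text):
    """Lowercased word tokens, ignoring punctuation."""
    return re.findall(r"\w+", text.lower())


def is_gendered(translation):
    """True if the translation contains at least one gendered word."""
    return any(tok in GENDERED for tok in _tokens(translation))


def differ_only_in_gender(first, second):
    """True if the two sentences match token by token except for gendered words."""
    a, b = _tokens(first), _tokens(second)
    if len(a) != len(b):
        return False
    return all(x == y or (x in GENDERED and y in GENDERED) for x, y in zip(a, b))


def gender_specific_translations(source, translate, source_is_neutral, rewrite):
    """Translate first, then rewrite and verify (all three callables are hypothetical)."""
    first = translate(source)
    # Only consider a rewrite when a neutral source produced a gendered translation.
    if not (source_is_neutral(source) and is_gendered(first)):
        return [first]
    alternative = rewrite(first)
    # Show both options only if gender is the sole difference between them.
    if alternative != first and differ_only_in_gender(first, alternative):
        return [first, alternative]
    return [first]
```

With toy stand-ins that map the Turkish sentence "O bir doktor" to "He is a doctor" and swap "He" for "She", the function returns both the masculine and the feminine sentence; if the rewrite changes anything other than a gendered word, only the initial translation is returned.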

According to Google, building the rewriter involved generating millions of training examples composed of pairs of phrases, each of which included both masculine and feminine translations. Because the data wasn't readily available, the Google Translate team had to come up with candidate rewrites by swapping gendered pronouns from masculine to feminine (or the other way around), starting with a large monolingual data set. To this corpus of rewrites, engineers applied an in-house language model trained on millions of English sentences to select the best candidates, which netted training data that went from a masculine input to a feminine output and vice versa.
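
To make the data-generation step concrete, here is a small sketch of how candidate rewrites might be produced and ranked. The swap table, the `candidate_rewrites` and `pick_best` helpers, and the toy bigram scorer are all assumptions for illustration; the article says only that pronouns were swapped and that an in-house language model selected the best candidates. The example also shows why a ranking step is needed: "her" can correspond to either "him" or "his".

```python
import itertools
import re

# Hypothetical swap table. "her" is ambiguous (object "him" vs. possessive "his"),
# so it expands into two candidates.
SWAPS = {
    "he": ["she"],
    "him": ["her"],
    "his": ["her"],
    "she": ["he"],
    "her": ["him", "his"],
}


def candidate_rewrites(sentence):
    """Return every sentence obtained by swapping all gendered pronouns.

    Output is lowercased and space-tokenized to keep the sketch short.
    """
    tokens = re.findall(r"\w+|[^\w\s]", sentence.lower())
    if not any(tok in SWAPS for tok in tokens):
        return []  # nothing gendered to swap
    options = [SWAPS.get(tok, [tok]) for tok in tokens]
    return [" ".join(choice) for choice in itertools.product(*options)]


# Toy stand-in for a language model: counts bigrams from a tiny made-up list.
TOY_BIGRAMS = {("his", "book"), ("saw", "him")}


def toy_score(sentence):
    """Higher is better; a real system would use language-model probabilities."""
    toks = sentence.split()
    return sum((a, b) in TOY_BIGRAMS for a, b in zip(toks, toks[1:]))


def pick_best(candidates, score=toy_score):
    """Select the candidate the scorer considers most fluent."""
    return max(candidates, key=score)
```

For the input "She lost her book." the sketch yields two candidates, "he lost him book ." and "he lost his book .", and the toy scorer prefers the second. Pairing each original sentence with its selected rewrite gives one training example in each direction.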

After merging the training data from both directions, the team used it to train a one-layer Transformer-based sequence-to-sequence model. They then introduced punctuation and casing variants into the training data to increase the model's robustness, such that the final model reliably produces the requested masculine or feminine rewrites 99% of the time.
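
The punctuation and casing augmentation might look something like the sketch below. The specific transforms (lowercasing, uppercasing, dropping trailing punctuation) and the `casing_punctuation_variants` helper are assumptions; the article does not say which variants Google generated.

```python
import string


def casing_punctuation_variants(source, target):
    """Return (source, target) pairs with casing and trailing punctuation varied together."""
    transforms = [
        lambda s: s,                             # original pair
        str.lower,                               # all lowercase
        str.upper,                               # all uppercase
        lambda s: s.rstrip(string.punctuation),  # no trailing punctuation
    ]
    variants = []
    for transform in transforms:
        pair = (transform(source), transform(target))
        if pair not in variants:  # skip duplicates, e.g. already-lowercase input
            variants.append(pair)
    return variants
```

Applying the same transform to both sides keeps each pair aligned, so the rewriter sees that "he is a doctor" should map to "she is a doctor" just as the capitalized, punctuated form does.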

Evaluated on a Google-developed metric called bias reduction, which measures the relative reduction of bias between the new translation system and the existing system (where "bias" is defined as making a gender choice in translation that's unspecified in the source), Johnson says the new approach results in a bias reduction of ≥90% for translations from Hungarian, Finnish, and Persian to English. The bias reduction of the existing Turkish-to-English system improved from 60% to 95%, and the system triggers gender-specific translations with an average precision of 97% -- i.e., when it decides to show gender-specific translations, it's right 97% of the time.
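
As a rough illustration of how the two figures could be computed, the sketch below assumes bias reduction is the relative drop in biased translations on a shared test set and that precision is the share of triggered gender-specific translations that were correct. The function names and the formula details are assumptions based on the description above, and the numbers in the usage note are invented purely for the arithmetic.

```python
def bias_reduction(baseline_biased, new_biased):
    """Relative reduction in biased translations versus the existing system.

    Both arguments are counts of biased outputs on the same test set.
    """
    if baseline_biased <= 0:
        raise ValueError("baseline must contain at least one biased translation")
    return (baseline_biased - new_biased) / baseline_biased


def trigger_precision(correct_triggers, incorrect_triggers):
    """Share of shown gender-specific translations that were warranted."""
    total = correct_triggers + incorrect_triggers
    if total <= 0:
        raise ValueError("no gender-specific translations were triggered")
    return correct_triggers / total
```

With made-up counts, a baseline that makes 200 unwarranted gender choices and a new system that makes 20 on the same queries would score a bias reduction of 90%, and a system that triggers correctly 97 times out of 100 would have a precision of 97%.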

The improved Google Translate system's rollout comes months after Google removed the ability to label people in images as "man" or "woman" with its Cloud Vision API. Separately, in January 2018, Google blocked Smart Compose, a Gmail feature that automatically suggests sentences for users as they type, from suggesting gender-based pronouns.

A gender-neutral approach to language translation and computer vision is a part of Google's larger effort to mitigate prejudice in AI systems. The Mountain View company uses tests developed by its AI ethics team to uncover bias and has banned expletives, racial slurs, and mentions of business rivals and tragic events from its predictive technologies.