Google recently unveiled its vision for the future of global communication — a pair of headphones that can translate over 40 languages in real time. But are we sure that we can rely on Google Translate to understand the nuances of spoken language and the complexities of culture?
The company has made enormous linguistic progress in the past 12 months. In late 2016, Google announced that it had made a breakthrough in translation and artificial intelligence. Google Translate had started using a neural network to translate some of its most popular languages.
Last week, the company unveiled its latest invention: Google Pixel Buds, headphones that can serve as a real-time personal translator. A live demonstration highlighted the simplicity of the new product as it translated a few sentences back and forth between Swedish and English.
“It’s like you’ve got your own personal translator with you everywhere you go,” said Adam Champy, the product manager behind Pixel Buds. “Say you’re in Little Italy, and you want to order your pasta like a pro. All you have to do is hold down on the right earbud and say, ‘Help me speak Italian.’ As you talk, your Pixel phone’s speaker will play the translation in Italian out loud. When the waiter responds in Italian, you’ll hear the translation through your Pixel Buds.”
The product has been heralded as the first coming of the Babel fish, which is bound to have fans of The Hitchhiker’s Guide to the Galaxy (basically anyone who writes about tech) in a frenzy of excitement.
Unfortunately for the $600 billion company, there is one core difference between Douglas Adams’ Babel fish and the Google Pixel Buds. The Babel fish was “mindbogglingly useful” because it had the capacity to not only translate vocabulary but to interpret the cultural nuances that came with it.
Likewise, multilingual humans attempt to make cultural sense of what they are translating. Bringing a computer into the equation has the potential to really mess with these subtleties.
In 2015, a Spanish town threatened to sue Google after Google Translate rendered the local word for the leafy green rapini as “clitoris.” The mistake left local websites, including the town’s official page, with a renamed annual event: The Clitoris Festival.
The syntactic and conjugational differences between Asian languages and European languages add another level of complexity. Countless examples on your local Chinese or Thai restaurant menus highlight this difficulty in translation and point to the level of human fluency that is required to make an accurate interpretation. You can safely assume supermarkets were misled when they boldly stated “F**k Vegetables,” instead of correctly labeling an aisle in their store.
And Google Translate is facing an uphill battle if even professional translators are struggling to interpret the sentence structure of Donald Trump’s ramblings.
“When the logic is not clear or a sentence is just left hanging in the air, then we have a problem,” said the Guardian’s Chikako Tsuruta, who regularly interprets broadcasts by U.S. networks such as CNN, ABC, and CBS. “We try to grasp the context and get at the core message, but in Trump’s case, it’s so incoherent. You’re interpreting, and then suddenly the sentence stops making sense, and we risk ending up sounding stupid.”
Herein lies the number one question that Google Translate product managers will be asking themselves: How do we get our technology to understand and interpret the subjective cultural aspects and highly variable grammatical complexities involved in spoken language?
Despite an unquestionably exciting year for Google’s deep learning and language department, it seems there is still a long way to go.
Gabe McCauley is a freelance journalist and growth marketer based out of Sydney, Australia.