Why AI gets the language of games but sucks at translating languages

As Google DeepMind's conference this week showed, machine learning has seeped into a number of industries in recent years.

Once mostly a topic of theoretical discussion, machine learning is now being applied in smart cars, video games, digital marketing, virtual personal assistants, chatbots, and other areas of daily life. As AI moves to disrupt and improve more sectors, there are still barriers to overcome before we need to fear for our jobs. In a recent translation competition, human beings beat AI, but it's only a matter of time before machines become digital Babel fish.

Game over

It's worth recapping where machine learning and AI have already surpassed human abilities. In 1996, IBM's Deep Blue computer first challenged reigning world chess champion Garry Kasparov. Kasparov won that first match, but Deep Blue won the rematch in 1997. Since then, computers have developed further and are now consistently better than us at chess.

Next on the list was Go, an ancient Chinese board game that seemed too complex for even the most advanced computer to win, since it is said to have more possible board positions than there are atoms in the observable universe. So when Google DeepMind's AlphaGo program beat Lee Sedol 4-1 in March 2016, it came as a shock.

This month at the Future of Go Summit, AlphaGo went on to beat the world number one, Ke Jie, who had initially claimed that he would never lose to a "cold machine." Afterward, Ke admitted that "the advancement of AI has far exceeded our imagination." At the event, AlphaGo not only challenged players but also played alongside them, proving that AI can help us as well as beat us.

What did you say?

Now the industry's focus is turning to translation. Language production and translation have long been among the toughest challenges for any machine to tackle. IBM was already exploring machine translation back in the 1950s, but it was not until the '90s, with the development of AltaVista's Babel Fish, that such tools became accessible to the public. However, early machine translation had its limitations: it largely translated word by word using dictionaries, offering literal translations without regard for the complexities of semantics, syntax, and morphology.
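
To make the limitation concrete, here is a minimal sketch of dictionary-based, word-by-word translation. The tiny German-to-English dictionary and the function name are hypothetical and exist only for illustration; this is not how Babel Fish or any particular product was implemented.

```python
# Hypothetical toy German-to-English dictionary, for illustration only.
WORD_DICTIONARY = {
    "ich": "I",
    "habe": "have",
    "hunger": "hunger",
}


def translate_word_by_word(sentence, dictionary):
    """Replace each word with its dictionary entry; unknown words pass through unchanged."""
    return " ".join(dictionary.get(word.lower(), word) for word in sentence.split())


literal = translate_word_by_word("Ich habe Hunger", WORD_DICTIONARY)
print(literal)  # I have hunger
```

Every word is translated correctly, yet the result is not what a fluent speaker would write ("I am hungry"), because the lookup ignores how the target language actually phrases the idea.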

Statistical machine translation (SMT) became the next phase in translation technology development. SMT systems use a model that compares words or phrases to their previous translations (especially professional translations, if available) and then picks the most frequently used wording.
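
The idea of picking the most frequently used wording can be sketched in a few lines. The corpus of earlier translations and the function names below are hypothetical, and production SMT systems combine probabilistic translation and language models rather than raw counts, so treat this as a simplified illustration only.

```python
from collections import Counter, defaultdict

# Hypothetical corpus of (source phrase, earlier translation) pairs.
PAST_TRANSLATIONS = [
    ("guten morgen", "good morning"),
    ("guten morgen", "good morning"),
    ("guten morgen", "good day"),
    ("vielen dank", "thank you very much"),
    ("vielen dank", "many thanks"),
    ("vielen dank", "thank you very much"),
]


def build_phrase_table(pairs):
    """Count how often each source phrase was rendered as each target phrase."""
    table = defaultdict(Counter)
    for source, target in pairs:
        table[source][target] += 1
    return table


def most_frequent_translation(phrase, table):
    """Return the most common earlier translation, or None if the phrase is unseen."""
    counts = table.get(phrase)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


phrase_table = build_phrase_table(PAST_TRANSLATIONS)
print(most_frequent_translation("guten morgen", phrase_table))  # good morning
```

A phrase that never appeared in the earlier translations returns `None` here, which hints at why such systems depend heavily on the size and quality of the existing translation corpus.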

Machine learning and AI were the logical next step in mastering the intricacies of language where standard translation technologies failed. Like a human mind, a machine needs to be able to learn in what context different phrases and sentences are used and evolve over time to produce comprehensible and relevant target language material.

Neural machine translation (NMT) is Google's response to the quest for more accurate translations. Rather than treating a sentence's components (words, phrases) in isolation, NMT technologies consider the whole sentence and combine those components in the way they are most naturally used. When AI technologies are applied to this process, NMT is also able to learn from other completed translations, analyzing their structure and how they change over time to pick up on subtleties and nuances.

Not there yet

Given how quickly the technology is evolving, it's no surprise that many linguists who make their living from translation are worried about NMT encroaching on their expertise. But equally, there are people and businesses all over the world who are excited at the prospect of language barriers coming down and AI being our new lingua franca.

So there was a mix of trepidation and excitement in February when human translators faced Google's new NMT system and Naver's Papago in a competition at Sejong University in Seoul, held in collaboration with the International Interpreters & Translators Association of Korea. For translators, it was arguably a bellwether of just how long their jobs would exist.

The competition took 50 minutes and required both parties to translate two randomly selected pieces of as-yet-untranslated text -- one literary and one non-literary piece. The humans beat both AI-based machine translation tools by a clear margin for each type of content and both language combinations (Korean into English, English into Korean).

Many argue that, with translations -- and unlike in mathematics or with games such as chess and Go -- there is no clear winner, since the translations are reviewed and judged by humans who may have a subjective view. Considering, however, that an independent judge reviewed the results and that the review focused on obvious, objective linguistic errors any native speaker would have spotted, I would argue that the judgments were fair and conclusive.

The reviewers stated that about 90 percent of the NMT-translated text was "grammatically awkward," or perhaps not obviously wrong but definitely never the kind of translation produced by any educated native speaker. Many linguists and translators will be relieved by the resounding success of the humans in this latest battle against the machines.

It seems likely that, as NMT develops further, technical content -- which follows strict content guidelines and terminology -- will soon be translated almost perfectly, requiring little human post-editing, if any.

However, literary and marketing translations -- which require the target text to be almost trans-created based on target market requirements -- will continue to represent a tough challenge for even the most advanced AI machine translation solutions. Translating this kind of content depends on context and research, and it requires a creative mind to produce text that resonates with the target audience. In the world of translation and linguistics, the robots have a long way to go until it's checkmate.

Hannes Ben is the Chief International Officer at Forward3D, a digital marketing agency.
