The same week Facebook open-sourced M2M-100, an AI model that can translate between over 100 languages, Microsoft detailed a model of its own, Turing Universal Language Representation (T-ULRv2), which can interpret 94 languages. The company claims T-ULRv2 achieves the top results on XTREME, a natural language processing benchmark created by Google, and says it will use the model to improve features like Semantic Search in Word and Suggested Replies in Outlook and Teams ahead of its availability in private preview via Azure.

T-ULRv2, a collaboration between Microsoft Research and the Microsoft Turing team, contains a total of 550 million parameters, or internal variables that the model leverages to make predictions. (By comparison, M2M-100 has around 15 billion parameters.) Microsoft researchers trained T-ULRv2 on a multilingual data corpus from the web spanning the aforementioned 94 languages. During training, the model learned to predict masked words in sentences from different languages, occasionally drawing on context clues from paired translations, such as English and French.
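
T-ULRv2 itself is only headed for private preview, but the masked-word objective can be illustrated with XLM-R, a comparable open multilingual model. A minimal sketch, assuming Hugging Face's transformers library and its public xlm-roberta-base checkpoint as a stand-in (none of this is Microsoft's setup):

```python
# A minimal sketch of multilingual masked-word prediction, using the
# public XLM-R model as a stand-in for T-ULRv2, which is not publicly
# downloadable. Requires the Hugging Face transformers library.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="xlm-roberta-base")

# The same objective applies across languages: the model predicts the
# token hidden behind <mask> from the surrounding context.
for sentence in [
    "The capital of France is <mask>.",       # English
    "La capitale de la France est <mask>.",   # French
]:
    predictions = fill_mask(sentence)
    print(sentence, "->", predictions[0]["token_str"])
```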

As Microsoft VP Saurabh Tiwary and assistant managing director Ming Zhou note in a blog post, the XTREME benchmark covers 40 languages spanning 12 families, with 9 tasks that require reasoning about different levels of syntax and semantics. The languages are selected to maximize diversity, coverage in existing tasks, and availability of training data, and the tasks cover a range of paradigms, including sentence classification, structured prediction, sentence retrieval, and cross-lingual question answering. For models to be successful on the XTREME benchmark, then, they must learn representations that generalize to many standard cross-lingual transfer settings.
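
For a concrete sense of the benchmark's shape, its tasks can be pulled locally. A minimal sketch, assuming Hugging Face's datasets library mirrors XTREME under the name "xtreme" with an "XNLI" config (an assumption about the hub, not something from Microsoft's or Google's posts):

```python
# A sketch of loading one XTREME task, assuming the Hugging Face
# datasets library hosts the benchmark as "xtreme" with an "XNLI"
# config (an assumption, not part of the source announcements).
from datasets import load_dataset

xnli = load_dataset("xtreme", "XNLI")
print(xnli)  # available splits, their sizes, and column names
```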

T-ULRv2 beat the previous best model from Alibaba (VECO) by 3.5 points on average. It also surpassed the performance of Microsoft’s FILTER, Google’s XLM-R, and New York University’s X-STILTs. “The Microsoft Turing team has long believed that language representation should be universal,” Tiwary and Zhou wrote. “We [recently] presented a method to train language-agnostic representation in an unsupervised fashion. This kind of approach would allow for the trained model to be fine-tuned in one language and applied to a different one in a zero-shot fashion. This would overcome the challenge of requiring labeled data to train the model in every language.”
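
The zero-shot pattern Tiwary and Zhou describe amounts to fine-tuning a multilingual encoder on labeled data in one language and running it unchanged on another. A minimal sketch, assuming transformers and a community XLM-R checkpoint fine-tuned on NLI data (joeddav/xlm-roberta-large-xnli); this illustrates the pattern, not Microsoft's model:

```python
# A sketch of zero-shot cross-lingual transfer, assuming the Hugging Face
# transformers library and a publicly shared XLM-R checkpoint fine-tuned
# on NLI data. T-ULRv2 itself is in private preview, so this only
# illustrates the fine-tune-once, apply-anywhere pattern.
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="joeddav/xlm-roberta-large-xnli")

# The candidate labels are English, yet the multilingual representation
# lets the model classify French input it was never fine-tuned on.
result = classifier(
    "Le nouveau modèle rédige automatiquement des réponses aux courriels.",
    candidate_labels=["technology", "sports", "politics"],
)
print(result["labels"][0], result["scores"][0])
```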

The jury is out on T-ULRv2’s potential for bias and its grasp of general knowledge. Some research suggests benchmarks such as XTREME don’t measure models’ knowledge well and that models like T-ULRv2 can exhibit toxicity and prejudice against demographic groups. But the model is in any case a step toward Microsoft’s grand “AI at scale” vision, which seeks to push AI capabilities by training algorithms with increasingly large amounts of data and compute. Already, the company has used its Turing family of models to bolster language understanding across Bing, Office, Dynamics, and its other productivity products.

It’s also unclear to what extent the size of models like T-ULRv2 will correspond to performance improvements in the future. A study published by researchers at MIT found that progress in deep learning has been strongly reliant on increases in compute, and that continued progress will require dramatically more computationally efficient deep learning methods, whether through changes to existing techniques or via new, as-yet-undiscovered ones. On the other hand, an OpenAI survey found that since 2012, the amount of compute needed to train an AI model to the same performance on classifying images in a popular benchmark (ImageNet) has been decreasing by a factor of two every 16 months.
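
Taken at face value, that efficiency trend compounds quickly. A back-of-the-envelope sketch (the seven-year window is an illustration, not a figure from the survey):

```python
# OpenAI's finding: compute needed for a fixed ImageNet result halves
# every 16 months. Over seven years (an illustrative window), that
# compounds as follows.
months = 7 * 12                   # 84 months
halvings = months / 16            # ~5.25 halvings of required compute
efficiency_gain = 2 ** halvings   # ~38x less compute for the same result
print(f"~{efficiency_gain:.0f}x less compute after {months} months")
```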

In this regard, Microsoft is well positioned to keep throwing compute at the problem. For OpenAI’s research alone, it has devoted over 285,000 processor cores, 10,000 graphics cards, and 400 gigabits per second of network connectivity per graphics card server. (Microsoft and OpenAI have an ongoing commercial partnership to jointly develop new technologies for Microsoft’s Azure cloud platform.)

“Microsoft Office and Microsoft Bing are available in over 100 languages across 200 regions. We have customers in every corner of the planet, and they use our products in their native languages,” Tiwary and Zhou continued. “To truly democratize our product experience to empower all users and efficiently scale globally, we are pushing the boundaries of multilingual models.”

T-ULRv2 will power current and future language services available through Azure Cognitive Services, Microsoft says. It will also be available as part of a program for building custom applications that was announced at Microsoft Ignite 2020 earlier this year; developers can submit requests for access.
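
Microsoft hasn't published T-ULRv2's API surface yet, but existing Azure Cognitive Services language endpoints give a feel for how such services are consumed. A minimal sketch, assuming the azure-ai-textanalytics Python SDK and placeholder credentials:

```python
# A sketch of calling an Azure Cognitive Services language endpoint,
# assuming the azure-ai-textanalytics SDK. The endpoint and key are
# placeholders; T-ULRv2-backed services would ship separately.
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

client = TextAnalyticsClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>"),
)

documents = ["Dies ist ein Beispielsatz.", "Ceci est une phrase d'exemple."]
for doc in client.detect_language(documents=documents):
    print(doc.primary_language.name, doc.primary_language.confidence_score)
```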
