Google’s AI-driven Gboard keyboard now supports more than 500 language varieties — up from 100 languages in December 2016 and 300 in March 2017. That’s according to technical program manager Daan van Esch, who revealed the figure in a blog post this morning.

Newly supported language varieties include Nigerian Pidgin, Rangpuri, Balinese, and Pontic Greek. Gboard now also covers more than 40 writing systems, ranging from alphabets like Roman and Cyrillic to scripts such as Ol Chiki (used for Santali). The Google Play Store description lists the full range of languages.

“This means that more than 90 percent of the world can now type in their first language with Gboard, with keyboard layouts tailored to each language and typing smarts like autocorrect and predictive text,” Van Esch wrote. “Our goal with Gboard is to help you communicate in a way that’s comfortable and natural, regardless of the language you speak.”

Adding a new language to Gboard isn’t as easy as it sounds. It first requires designing an entirely new keyboard layout, Van Esch said, and then crafting a new machine learning language model to autocorrect typing and predict next words. Then there’s the matter of finding datasets, or corpora, to train those models — data that isn’t always readily available.
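
To make "predict next words" concrete, here is a minimal sketch of next-word prediction using simple bigram counts. This is an illustrative stand-in, not Gboard's actual model, which the article does not describe in detail. The function names `train_bigrams` and `predict_next` and the tiny corpus in the usage note are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count how often each word follows another in the training sentences.

    Assumption: whitespace tokenization, which is only adequate for
    languages that separate words with spaces.
    """
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            following[prev][nxt] += 1
    return following


def predict_next(following, word, k=3):
    """Return up to k of the most frequent followers of `word`."""
    candidates = following.get(word.lower(), Counter())
    return [w for w, _ in candidates.most_common(k)]
```

Trained on the three sentences "good morning", "good morning everyone", and "good night", `predict_next(model, "good")` ranks "morning" ahead of "night" because it follows "good" more often. The sketch also shows why corpora matter: with no training text for a language, there is nothing to count.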

“For languages like English, which has only about 30 characters and large amounts of written materials widely available, this is easy,” he wrote. “For many of the world’s languages, though, this process is much harder.”

In some cases, to produce new corpora, Google has shared writing prompts with native speakers, who have helped build the corpora from scratch. From those and other available data, Gboard’s engineers work out which characters to include in each layout and how frequently each one is used.
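
The character-counting step can be sketched roughly as follows. This is an assumption about the general approach, not Google's tooling, and the function name `char_frequencies` is hypothetical.

```python
from collections import Counter


def char_frequencies(texts):
    """Return (character, count) pairs for a corpus, most frequent first.

    Counts Unicode code points and skips whitespace. Note that in many
    scripts a user-perceived character can span several code points
    (for example, a base letter plus a combining mark); this sketch
    does not group those.
    """
    counts = Counter()
    for text in texts:
        counts.update(ch for ch in text if not ch.isspace())
    return counts.most_common()
```

A ranking like this could suggest which characters deserve a primary key and which can sit behind a long press, though the actual layout decisions are not described in the source.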

“Depending on the language, we may tailor aspects of the layout, like the set of digits: for example, while English uses ‘0123456789,’ Hindi and other Indian languages written in Devanagari use ‘०१२३४५६७८९,’” Van Esch said. “Once we’ve built support for a language, we always invite a group of native speakers to test and fill out a survey to understand their typing experience.”
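
The digit sets in the quote correspond one to one, so a per-script digit row can be illustrated with a simple translation table. This is a minimal sketch of the mapping only, and the helper name `to_devanagari_digits` is hypothetical.

```python
ASCII_DIGITS = "0123456789"
DEVANAGARI_DIGITS = "०१२३४५६७८९"  # U+0966 through U+096F

# Translation table from each ASCII digit to its Devanagari counterpart.
TO_DEVANAGARI = str.maketrans(ASCII_DIGITS, DEVANAGARI_DIGITS)


def to_devanagari_digits(text):
    """Replace ASCII digits in `text` with Devanagari digits."""
    return text.translate(TO_DEVANAGARI)
```

For instance, `to_devanagari_digits("2016")` yields "२०१६", while non-digit characters pass through unchanged.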

Those aren’t the only AI-infused components of Google’s software keyboard.

Back in June 2017, Google rolled out a feature that uses machine learning to match doodles with emoji, and it overhauled the AI models Gboard uses to improve typing predictions and reduce errors. More recently, in August, it launched an AI-powered tool that uses the likeness of a person to generate a sticker pack and recommend GIFs, emoji, and stickers relevant to conversations at hand.

VentureBeat