Google's DynamicEmbedding framework extends TensorFlow to 'colossal-scale' applications

In a preliminary whitepaper published this week on arXiv.org, Google researchers describe DynamicEmbedding, which extends Google's TensorFlow machine learning framework to "colossal-scale" applications with arbitrary numbers of features (e.g., search queries). According to Google, AI models developed on it have achieved significant accuracy gains over the course of two years, demonstrating that they can grow "incessantly" without having to be constantly retuned by engineers.

Currently, DynamicEmbedding models are suggesting keywords to advertisers in Google Smart Campaigns, annotating images informed by "enormous" search queries (with Inception), and translating sentences into ad descriptions across languages (with Neural Machine Translation). Google says that many of its engineering teams have migrated algorithms to DynamicEmbedding so that they can train and retrain them without much data preprocessing.

DynamicEmbedding could be useful in scenarios where focusing on the most frequently occurring data might cast aside too much valuable information. That's because the framework grows itself by learning from potentially unlimited novel input, enabling it to self-evolve through model training techniques like transfer learning (where a model trained on one task is repurposed on a related task) and multitask learning (where multiple learning tasks are solved at the same time).

Building DynamicEmbedding into TensorFlow required adding a new set of operations to the Python API that take symbolic strings as input and "intercept" upstream and downstream signals when running a model. These operations interface with a server called DynamicEmbedding Service (DES) to process the content part of a model. DES in turn talks to a DynamicEmbedding Master module that divvies up and distributes the work among agents called DynamicEmbedding Workers. Workers are principally responsible for allocating memory and computation and communicating with external cloud storage, as well as ensuring that all DynamicEmbedding models remain backward compatible.
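To make the division of labor concrete, here is a minimal Python sketch of the idea, not Google's implementation. The class names (ToyMaster, ToyWorker), the hash-based sharding, and the random initialization are all assumptions made for illustration; the paper's actual operations and routing logic are not reproduced here. The sketch shows string keys being routed to workers, with each worker allocating an embedding row the first time it sees a key.

```python
import hashlib
import random


class ToyWorker:
    """Hypothetical worker: holds one shard and allocates rows on first use."""

    def __init__(self, dim, seed):
        self.dim = dim
        self.rng = random.Random(seed)
        self.table = {}  # string key -> embedding vector

    def lookup(self, key):
        # A key never seen before gets a freshly initialized vector,
        # so the table grows with the input instead of being pre-sized.
        if key not in self.table:
            self.table[key] = [self.rng.uniform(-0.1, 0.1) for _ in range(self.dim)]
        return self.table[key]


class ToyMaster:
    """Hypothetical master: routes each string key to one worker."""

    def __init__(self, num_workers, dim):
        self.workers = [ToyWorker(dim, seed=i) for i in range(num_workers)]

    def shard_for(self, key):
        # Assumed routing rule: a stable hash of the key modulo the worker count.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.workers)

    def lookup(self, keys):
        return [self.workers[self.shard_for(k)].lookup(k) for k in keys]

    def total_rows(self):
        return sum(len(w.table) for w in self.workers)


master = ToyMaster(num_workers=4, dim=8)
vectors = master.lookup(["cheap flights", "pizza near me", "cheap flights"])
```

In this toy version, the repeated query maps to the same stored vector, and only two rows exist across all workers after the three lookups.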

Courtesy of a component called EmbeddingStore, DynamicEmbedding integrates with external storage systems like Spanner and Bigtable. Data can be stored in local cache and remote, mutable databases. This allows fast recovery from worker failure, as DynamicEmbedding doesn't need to wait until all the previous data is loaded before accepting new requests.
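The cache-plus-remote arrangement can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the EmbeddingStore interface: the class name ToyEmbeddingStore, its get and put methods, and the use of a plain dictionary as a stand-in for a remote database such as Spanner or Bigtable are all hypothetical. The point it shows is that a restarted worker can begin with an empty cache and fetch entries lazily, instead of reloading everything before serving requests.

```python
class ToyEmbeddingStore:
    """Hypothetical store: a local cache in front of a durable remote store."""

    def __init__(self, remote, dim):
        self.remote = remote  # plain dict standing in for a remote database
        self.cache = {}       # local, lost if the worker fails
        self.dim = dim

    def get(self, key):
        # Serve from the local cache when possible.
        if key in self.cache:
            return self.cache[key]
        # Otherwise fall back to the remote store, creating a zero row
        # for keys that have never been written (an assumed default).
        value = self.remote.get(key)
        if value is None:
            value = [0.0] * self.dim
            self.remote[key] = list(value)
        self.cache[key] = list(value)
        return self.cache[key]

    def put(self, key, value):
        # Write through: update both the cache and the remote store.
        self.cache[key] = list(value)
        self.remote[key] = list(value)


remote_db = {}
store = ToyEmbeddingStore(remote_db, dim=2)
store.put("cheap flights", [1.0, 2.0])

# Simulate a worker failure: a new store starts with an empty cache
# but can answer requests immediately by reading from the remote store.
restarted = ToyEmbeddingStore(remote_db, dim=2)
cache_before = dict(restarted.cache)
recovered = restarted.get("cheap flights")
```

The write-through policy is a design choice made for simplicity here; the article does not say how or when DynamicEmbedding flushes cached data to remote storage.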

Google says that in experiments DynamicEmbedding was shown to substantially reduce memory usage in training a model architecture known as Seq2Seq, which turns one sequence into another sequence. With 100 TensorFlow workers and a vocabulary size of 297,781, it needed between 123GB and 152GB of RAM, compared with TensorFlow's 242GB of RAM to achieve the same level of accuracy.
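For a sense of scale, the reported figures work out to a memory reduction of roughly 37% to 49% relative to the TensorFlow baseline. The short Python calculation below derives that range from the numbers quoted above; the function name is only illustrative.

```python
def memory_reduction_percent(baseline_gb, new_gb):
    """Percentage of memory saved relative to the baseline."""
    return (1 - new_gb / baseline_gb) * 100


baseline_gb = 242  # TensorFlow figure quoted in the article
best_case = round(memory_reduction_percent(baseline_gb, 123), 1)
worst_case = round(memory_reduction_percent(baseline_gb, 152), 1)
print(f"Reduction: {worst_case}% to {best_case}%")
```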

In a separate experiment, the Smart Campaign model on DynamicEmbedding -- which has been deployed in production for more than a year -- outperformed non-DynamicEmbedding models in metrics such as click-through rate across 20 languages. In fact, the DynamicEmbedding-powered models won 49 out of a total of 72 evaluation metrics Google used for the dozens of different countries it evaluated.

"Our [Smart Campaign] model has been fed with new training data every month, and its size ... has been automatically growing from a few gigabytes to hundreds of gigabytes in less than six months," wrote the paper's coauthors. They noted that as of February 2020 the Google Smart Campaign model contained over 124 billion parameters (the configuration variables estimated from data and required by the model when making predictions). "We hope that [DynamicEmbedding] can be used in a wide variety of machine learning applications [that] face challenges around ever-growing scale in data inputs."
