Nvidia CEO Jensen Huang: Merlin will power ‘the most important AI model in the world’

Nvidia today released the GTC Digital conference keynote address, which CEO Jensen Huang filmed in his kitchen. In the speech, Huang rolled out the new Ampere GPU architecture, AI-driven health care solutions for smart hospitals, and the A100 GPU, which promises 20 times faster training and inference than Volta. Huang also introduced Merlin, an application framework for recommendation systems, which he considers "the most important AI model in the world today," one that "drives the vast majority of the economic engine of the internet."

Recommendation systems can decide which items shoppers see in an ecommerce store or personalize the results shown on a service like Netflix or Microsoft's Xbox. These systems can easily balloon in size because of the amount of data they collect.

Huang predicts that AI models powered by the A100 are about to get much bigger, driven by data-intensive recommendation models and by multimodal AI that takes input from multiple forms of media, such as text, vision, or sound.

"It's a foregone conclusion that we're going to see some really gigantic AI models because of the creation of Ampere and this [GPU] generation," Huang said. "In the future, it's going to have contextual information, continuous information, [and] sequence information because of the way you make the pattern by which you're using a website or engaging a store. These models are going to be gigantic. We're going to do that for robotics, when you enter a whole lot of different domains."

Manipulative recommendation systems have drawn attention from members of the machine learning community, including Celeste Kidd, who touched on it in a keynote address at NeurIPS last year. In a blistering 2018 critique of Facebook, François Chollet, creator of deep learning library Keras, urged the AI community to create recommendation engines that are "transparent, configurable, and constructive," not slot machines for maximizing profit or political gain.

Nvidia also announced limited availability of the Jarvis multimodal conversational AI application framework today. Jarvis will combine graphics and conversational AI to make what Huang calls interactive 3D chatbots.

Since the reemergence of deep learning, the amount of compute necessary to train top AI models has been steadily increasing. For example, Nvidia director of product management Paresh Kharya pointed out that training ResNet-50 in 2016 took 3,000 times less compute than training Megatron, a BERT-based language model with billions of parameters.

Nvidia shared a number of other newsworthy updates today, including the introduction of GPU acceleration for data scientists using Apache Spark or public clouds like Microsoft's Azure or AWS. The company also debuted the EGX A100 for edge AI, due out in late 2020, and announced the early access launch of the Omniverse graphics and simulation platform.

The GTC Digital conference was originally scheduled to take place in March near Nvidia headquarters in San Jose, California.
