Nvidia enables broader usage of AI with LLM cloud services

In recent years, large language models (LLMs) have become a foundational type of artificial intelligence (AI) model. The challenge, however, is that creating and training new LLMs is far from a trivial exercise.

At the Nvidia GTC 2022 conference today, the company made a long list of announcements spanning the full spectrum of AI operations across multiple industries. One of the key announcements covers a series of new LLM capabilities, including a pair of cloud services that aim to enable more organizations and individuals to create, train and benefit from LLMs.


The new cloud offerings include the Nvidia NeMo LLM Service and the Nvidia BioNeMo LLM Service.

"We are announcing NeMo LLM Service to enable customization and inference of giant AI models," Paresh Kharya, senior director of accelerated computing products at Nvidia, told VentureBeat. "Just like how LLMs can understand the human language, they've also been trained to understand the language of biology and chemistry."

Why LLMs matter

LLMs are based on AI transformer architecture and are widely used to support a growing number of use cases.

Kharya explained that with a transformer, the AI model can understand which parts of a sentence, an image or even very disparate data points are relevant to each other. Unlike convolutional neural networks (CNNs), which typically look at only the immediate neighboring relationships, transformers are designed to train on more distant relationships as well, which Kharya said is very important for use cases like natural language processing (NLP).
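To make the idea of relevance between distant positions concrete, here is a minimal sketch of the attention weighting at the core of a transformer. It is an illustration only, not Nvidia's code: the function names and the toy embeddings are made up for this example, and the learned query, key and value projections that real transformers use are left out for brevity.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(vectors):
    """Scaled dot-product relevance of every position to every other position."""
    dim = len(vectors[0])
    weights = []
    for query in vectors:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights.append(softmax(scores))
    return weights

# Toy 2-d embeddings (made-up numbers): positions 0 and 4 are similar
# even though they sit at opposite ends of the sequence.
tokens = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
weights = attention_weights(tokens)
```

In this toy sequence, position 0 gives the far-away position 4 more weight than its immediate neighbor at position 1, because the score depends on how similar two vectors are rather than on how close they sit.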

"Transformers also enable us to train on unlabeled datasets, and that greatly expands the volume of data," he said. "We are really seeing an explosion of research, applying transformer models to all kinds of use cases this year. We are expected to have 11,000 papers on transformers, actually seven times more than five years ago."

The GPT-3 LLM has helped to increase awareness and adoption of LLMs for a variety of use cases, including summarization and text generation. An LLM is also at the foundation of the DALL-E text-to-image generation technology.

"Today, we are seeing LLMs being applied to predict protein structures from sequences of amino acids or for understanding and generating art by learning the relationship between pixels," Kharya said.

Prompt learning and the need for context with LLMs

As with any type of AI model, context matters. What might make sense for one audience or use case will not be appropriate for another. Training entirely new LLMs for every type of use case is a time-consuming process.

Kharya said that an emerging approach to providing context to LLMs for specific use cases is a technique known as prompt learning. He explained that with prompt learning, a companion model is trained to provide the context to the pretrained large language model, using what's called a prompt token.

The companion model can learn the context from as few as 100 examples of queries paired with the right responses. At the end of the prompt learning training, a token is generated that can then be used together with the query to give the LLM the context it needs.
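The mechanics can be sketched with a toy example. Everything below is hypothetical and is not the NeMo implementation: `frozen_model` stands in for a pretrained LLM whose weights never change, the "prompt token" is a small vector learned by gradient descent, and the four training examples are made up. The point is only that the prompt is trained while the base model stays untouched.

```python
import math

# Stand-in for a pretrained model: its weights are fixed and never updated.
FROZEN_WEIGHTS = (4.0, 0.0)

def frozen_model(prompt, query):
    """Score the two-item sequence [prompt, query] with the frozen weights."""
    pooled = [(p + q) / 2 for p, q in zip(prompt, query)]
    logit = sum(w * v for w, v in zip(FROZEN_WEIGHTS, pooled))
    return 1 / (1 + math.exp(-logit))

def train_prompt(examples, steps=200, lr=0.5):
    """Learn only the prompt vector by gradient descent on log loss."""
    prompt = [0.0, 0.0]
    for _ in range(steps):
        grad = [0.0, 0.0]
        for query, label in examples:
            error = frozen_model(prompt, query) - label
            for i, w in enumerate(FROZEN_WEIGHTS):
                # Gradient of the log loss with respect to prompt[i].
                grad[i] += error * w / 2 / len(examples)
        prompt = [p - lr * g for p, g in zip(prompt, grad)]
    return prompt

# Made-up task: the right answer is 1 only when the first feature exceeds 0.5.
examples = [([0.1, 0.0], 0), ([0.3, 0.0], 0), ([0.7, 0.0], 1), ([0.9, 0.0], 1)]
learned_prompt = train_prompt(examples)
predictions = [round(frozen_model(learned_prompt, q)) for q, _ in examples]
```

With an all-zero prompt, the frozen model answers 1 for every query in this toy task. Once the learned prompt is supplied alongside each query, the same unchanged model gets all four examples right.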

What the Nvidia NeMo LLM Service enables

The new NeMo LLM Service aims to make it easier to customize giant AI models and run inference on them.

The giant AI models that the service will support include a 5 billion- and a 20 billion-parameter GPT-based model, as well as one based on the 530 billion-parameter Megatron LLM. As part of the service, Nvidia is also supporting prompt learning-based tuning to rapidly enable context-specific use cases. Kharya said that the NeMo LLM Service will also include the option to use both ready-made models and custom models through a cloud-based API experience.

Going a step further, Nvidia is also launching a specific LLM capability for life sciences with the BioNeMo Service.

"Just like how an LLM can understand the human language, they've also been trained to understand the language of biology and chemistry," Kharya said.

Kharya said that, for example, DNA is essentially a language written in the alphabet of nucleic acids, while the language of protein structures is written in the alphabet of amino acids.
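One way to picture the alphabet analogy is to treat each letter of a biological sequence as a token, much as a language model treats pieces of text. The sketch below is an assumption made for illustration: the article does not say how the BioNeMo service tokenizes its inputs, and the `encode` helper and the sample sequences are invented. The two alphabets are the standard one-letter codes for the four DNA bases and the 20 common amino acids.

```python
DNA_ALPHABET = "ACGT"                          # the four DNA bases
AMINO_ACID_ALPHABET = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard amino acids

def encode(sequence, alphabet):
    """Map each symbol of a sequence to its integer ID in the alphabet."""
    index = {symbol: i for i, symbol in enumerate(alphabet)}
    return [index[symbol] for symbol in sequence]

# Hypothetical example sequences, chosen only for illustration.
dna_ids = encode("GATTACA", DNA_ALPHABET)
protein_ids = encode("MKV", AMINO_ACID_ALPHABET)
```

Once sequences are turned into token IDs like this, they can be fed to a model in the same way as tokenized text.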

Overall, the goal of the new LLM services is to further expand the use of AI.

"The promises and possibilities are really immense and it's the access to large language models and the ability to customize them easily that was not there before," Kharya said. "So what the NeMo Large Language Model Service does is it removes that barrier and it now enables everyone to access and experiment with [LLMs] for their use cases."
