Nvidia unleashes H100, its fastest AI GPU yet, across clouds and vendors

Nvidia’s H100 Hopper GPUs, which promise to revolutionize artificial intelligence (AI) with unprecedented speed and power, are now widely available to customers across various platforms, the company announced on Tuesday at its annual developer conference.

The H100 is the successor to Nvidia’s A100 GPUs, which have been at the foundation of modern large language model development efforts. According to Nvidia, the H100 is up to nine times faster for AI training and 30 times faster for inference than the A100.

The new GPU includes a built-in Transformer Engine, which accelerates the transformer models behind generative AI systems such as GPT-3. It also features dynamic programming instructions (DPX), which speed up dynamic programming algorithms.
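For readers unfamiliar with the term, dynamic programming solves a problem by reusing the answers to smaller subproblems. Below is a minimal plain-Python sketch of one such algorithm, Floyd-Warshall all-pairs shortest paths. It does not use DPX or a GPU; the function name and the sample graph are illustrative assumptions, not taken from Nvidia's materials. The inner step, an add followed by a min, is the kind of operation this class of algorithm repeats many times.

```python
INF = float("inf")  # marks "no direct edge" between two nodes


def floyd_warshall(dist):
    """Return all-pairs shortest path lengths for a square distance matrix."""
    n = len(dist)
    d = [row[:] for row in dist]  # copy so the input is not modified
    for k in range(n):            # allow node k as an intermediate stop
        for i in range(n):
            for j in range(n):
                # Core dynamic programming step: add, then take the minimum.
                d[i][j] = min(d[i][j], d[i][k] + d[k][j])
    return d


# Hypothetical 4-node directed graph used only for illustration.
graph = [
    [0,   4,   INF, 10],
    [INF, 0,   3,   INF],
    [INF, INF, 0,   2],
    [INF, INF, INF, 0],
]

shortest = floyd_warshall(graph)
print(shortest[0][3])  # route 0 -> 1 -> 2 -> 3 costs 4 + 3 + 2
```

In this sample graph the direct edge from node 0 to node 3 costs 10, but the three-hop route costs 9, so the algorithm replaces it.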

"All the major OEMs have H100 server solutions to accelerate large language model training, and all the major cloud providers have been busy announcing their H100 instances," said Ian Buck, VP of Hyperscale and HPC at Nvidia, during a press conference. "We are super excited that H100 systems are now in full production and now available to customers."

The Hopper lineup of OEMs, clouds

Hyperscalers and cloud providers have all made announcements in support of H100.

Buck noted that just last week, Microsoft announced a private preview of its Nvidia H100 instances. The new instances will help power both the next-generation OpenAI models and Nvidia's own models to enable a new class of large-scale AI solutions. Back in November 2022, Microsoft and Nvidia expanded their partnership to build out an AI supercomputer in the cloud, which in the future will make extensive use of the H100.

Buck also noted that Amazon will be announcing the Amazon Web Services (AWS) EC2 UltraClusters of p5 instances, which are based on H100. Buck said that the p5 instances can scale up to 20,000 GPUs using AWS's Elastic Fabric Adapter (EFA) technology.

Additionally, Buck said that the tech giant Meta is now starting to deploy its "Grand Teton" H100 systems into its data centers to build Meta's next AI supercomputer.

A slide Buck displayed during his press conference listed the many partners now going live with the H100. Among the vendors listed were Alibaba Cloud, Baidu Cloud, Cisco, Dell, Fujitsu, Gigabyte, Hewlett Packard Enterprise, Lenovo, Supermicro and Vultr.

What comes after the H100? More inference

GPUs can be deployed for training new models as well as for inference.
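The split between the two phases can be shown with a toy model. The following is a minimal sketch in plain Python, assuming a one-variable linear model fitted by gradient descent; the function names, data and hyperparameters are hypothetical, and real workloads on GPUs use far larger models and dedicated frameworks. Training adjusts the model's parameters against known examples, while inference applies the finished parameters to a new input.

```python
def train(samples, epochs=1000, lr=0.05):
    """Training: fit weight w and bias b to (x, y) pairs by gradient descent."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in samples) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def infer(model, x):
    """Inference: apply the already-trained parameters to a new input."""
    w, b = model
    return w * x + b


# Illustrative data generated from y = 2x + 1.
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

model = train(samples)            # expensive step, done once
prediction = infer(model, 10.0)   # cheap step, repeated for every request
print(round(prediction, 3))
```

Training loops over the data many times, whereas each inference call is a single pass, which is why the two phases are often served by different hardware.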

"Training is the first step — teaching a neural network model how to perform a task, answer a question or generate a picture," Buck said. "Inference is the deployment of those models in production in real-life use cases."

To support wider deployment of inference, Nvidia announced its new L4 GPU. Buck described the L4 as a universal accelerator for efficient video, AI and graphics. Nvidia already has an early adopter for the L4: Google Cloud. Google will integrate the L4 into its Vertex AI platform and will give users direct access through the new G2 virtual compute instances.

Nvidia L4 GPU. Image credit: Nvidia.

"It's a simple single slot, Low Profile GPU that could fit in any server, turning any server or any data center into an AI data center," Buck said. "This GPU is 120 times faster than a traditional CPU server and uses 99% less energy."