Nvidia's HGX-2 uses 16 GPUs for fast AI training

Nvidia today unveiled HGX-2, a cloud server platform equipped with 16 Tesla V100 graphics processing unit (GPU) chips that collectively provide half a terabyte of GPU memory and two petaflops of compute power. The GPUs work together through NVSwitch interconnects. The HGX-2 is designed to handle both AI model training and high-performance computing.
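To put the aggregate figures in per-chip terms, the short sketch below divides the stated totals by the 16 GPUs. It assumes that "half a terabyte" means 512 GB and that memory and compute are split evenly across the GPUs; the variable and function names are illustrative only.

```python
# Back-of-the-envelope split of the HGX-2 totals across its GPUs.
# Assumptions (not from Nvidia): 0.5 TB is taken as 512 GB, and
# resources are divided evenly among the 16 GPUs.

NUM_GPUS = 16
TOTAL_MEMORY_GB = 512        # half a terabyte, assumed binary (512 GB)
TOTAL_PETAFLOPS = 2.0        # aggregate compute stated in the announcement


def per_gpu_share(total, num_gpus=NUM_GPUS):
    """Return an even per-GPU share of an aggregate figure."""
    return total / num_gpus


memory_per_gpu_gb = per_gpu_share(TOTAL_MEMORY_GB)
teraflops_per_gpu = per_gpu_share(TOTAL_PETAFLOPS * 1000)  # 1 PFLOPS = 1000 TFLOPS

print(f"Memory per GPU:  {memory_per_gpu_gb:.0f} GB")
print(f"Compute per GPU: {teraflops_per_gpu:.0f} TFLOPS")
```

Under those assumptions, each GPU accounts for 32 GB of memory and 125 teraflops of the total.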

The HGX-2 has achieved what Nvidia believes are record AI training speeds. According to a company statement, the GPU server can process 15,500 images per second on the ResNet-50 training benchmark and is able to replace up to 300 CPU-only servers.
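For a rough sense of what 15,500 images per second means in practice, the sketch below estimates how long one pass over a training set would take at that rate. It assumes the ImageNet-1k training set of 1,281,167 images, which is commonly used for ResNet-50 benchmarks, and a 90-epoch schedule; neither figure comes from Nvidia's statement, and sustained real-world throughput may differ.

```python
# Rough training-time estimate at the quoted ResNet-50 throughput.
# Assumptions (not from the announcement): the dataset is ImageNet-1k
# (1,281,167 training images), the schedule is 90 epochs, and the
# quoted rate is sustained for the whole run.

IMAGES_PER_SECOND = 15_500          # figure quoted by Nvidia
TRAIN_IMAGES = 1_281_167            # assumed dataset size (ImageNet-1k)
EPOCHS = 90                         # assumed training schedule


def seconds_per_epoch(num_images, images_per_second):
    """Time for one full pass over the dataset at a fixed throughput."""
    return num_images / images_per_second


epoch_seconds = seconds_per_epoch(TRAIN_IMAGES, IMAGES_PER_SECOND)
total_hours = epoch_seconds * EPOCHS / 3600

print(f"One epoch: about {epoch_seconds:.0f} seconds")
print(f"{EPOCHS} epochs: about {total_hours:.1f} hours")
```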

The announcement was made onstage today at Nvidia's GTC Taiwan event. Server makers like Lenovo and manufacturers like Foxconn plan to bring HGX-2-based systems to market later this year.

The first system built using HGX-2 was the DGX-2, which made its debut at GTC in March. At the time, the company reported that hardware and software improvements to its deep learning computing platform boosted performance on deep learning workloads by 10 times in the span of six months.

As businesses, researchers, and others increase their use of AI, the amount of computing power used to train notable neural networks is also rising. A recent OpenAI study found that the compute power used to train well-known systems has doubled roughly every 3.5 months since 2012. GPUs like those Nvidia makes are increasingly used to train and deploy AI models that work with unlabeled data, such as photos and videos.
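A 3.5-month doubling time compounds quickly. The sketch below converts it into a growth factor over a longer period, assuming smooth exponential growth at exactly that rate; the function name is illustrative.

```python
# Convert a doubling time into a growth factor over a given period.
# Assumption: compute grows as a smooth exponential with a constant
# 3.5-month doubling time, which simplifies the trend OpenAI reported.

DOUBLING_MONTHS = 3.5


def growth_factor(months, doubling_months=DOUBLING_MONTHS):
    """Return how many times larger compute becomes after `months`."""
    return 2 ** (months / doubling_months)


one_year = growth_factor(12)
two_years = growth_factor(24)

print(f"Growth over one year:  about {one_year:.1f}x")
print(f"Growth over two years: about {two_years:.0f}x")
```

At that pace, training compute grows roughly elevenfold each year.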

"CPU scaling has slowed at a time when computing demand is skyrocketing," CEO Jensen Huang said in a statement provided to VentureBeat. "Nvidia's HGX-2 with Tensor Core GPUs gives the industry a powerful, versatile computing platform that fuses HPC and AI to solve the world's grand challenges."

Also at the event, Nvidia introduced server classes that match GPU-accelerated servers to specific workloads, including HGX-T for AI training, HGX-I for inference, and SCX for supercomputing. Each class uses a different ratio of GPUs to CPUs to optimize performance for its target tasks.

The introduction of the HGX-2 follows last year's release of the HGX-1, which is powered by eight GPUs. The HGX-1 reference architecture has been used in GPU servers like Facebook's Big Basin and Microsoft's Project Olympus.
