NVIDIA announced a new class of large-memory AI supercomputer, an NVIDIA DGX™ supercomputer powered by NVIDIA® GH200 Grace Hopper Superchips and the NVIDIA NVLink® Switch System, created to enable the development of giant, next-generation models for generative AI language applications, recommender systems and data analytics workloads.
The NVIDIA DGX GH200’s massive shared memory space uses NVLink interconnect technology with the NVLink Switch System to combine 256 GH200 superchips, allowing them to perform as a single GPU. This provides 1 exaflop of performance and 144 terabytes of shared memory, nearly 500x more memory than the previous-generation NVIDIA DGX A100, which was introduced in 2020.
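The memory figures above can be sanity-checked with a little arithmetic. The sketch below is illustrative only. It takes the 256 superchips and 144 terabytes from this release and assumes two things the release does not state: that the 2020 DGX A100 had 320 GB of GPU memory (8 GPUs at 40 GB each), and that a terabyte is counted as 1,024 GB. The function names are hypothetical.

```python
# Figures quoted in the release.
SUPERCHIPS = 256
SHARED_MEMORY_TB = 144

# Assumption (not stated in the release): the 2020 DGX A100 configuration
# with 8 GPUs x 40 GB = 320 GB of GPU memory.
DGX_A100_GPU_MEMORY_GB = 320

# Assumption: binary convention, 1 TB = 1,024 GB.
GB_PER_TB = 1024


def memory_per_superchip_gb(total_tb: float, chips: int) -> float:
    """Average share of the pooled memory contributed by one superchip."""
    return total_tb * GB_PER_TB / chips


def memory_ratio(total_tb: float, baseline_gb: float) -> float:
    """How many times larger the pooled memory is than a baseline system."""
    return total_tb * GB_PER_TB / baseline_gb


per_chip = memory_per_superchip_gb(SHARED_MEMORY_TB, SUPERCHIPS)
ratio = memory_ratio(SHARED_MEMORY_TB, DGX_A100_GPU_MEMORY_GB)

print(f"{per_chip:.0f} GB per superchip")
print(f"{ratio:.1f}x the assumed DGX A100 GPU memory")
```

Under these assumptions each superchip contributes 576 GB, and the pooled memory is roughly 460 times the assumed DGX A100 figure, the same order of magnitude as the "nearly 500x" quoted above. A different baseline or unit convention would shift the ratio.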
“Generative AI, large language models and recommender systems are the digital engines of the modern economy,” said Jensen Huang, founder and CEO of NVIDIA. “DGX GH200 AI supercomputers integrate NVIDIA’s most advanced accelerated computing and networking technologies to expand the frontier of AI.”
NVIDIA NVLink Technology Expands AI at Scale
GH200 superchips eliminate the need for a traditional CPU-to-GPU PCIe connection by combining an Arm-based NVIDIA Grace™ CPU with an NVIDIA H100 Tensor Core GPU in the same package, using NVIDIA NVLink-C2C chip interconnects. This increases the bandwidth between GPU and CPU by 7x compared with the latest PCIe technology, slashes interconnect power consumption by more than 5x, and provides a 600GB Hopper architecture GPU building block for DGX GH200 supercomputers.
DGX GH200 is the first supercomputer to pair Grace Hopper Superchips with the NVIDIA NVLink Switch System, a new interconnect that enables all GPUs in a DGX GH200 system to work together as one. The previous-generation system only provided for eight GPUs to be combined with NVLink as one GPU without compromising performance.
The DGX GH200 architecture provides 48x more NVLink bandwidth than the previous generation, delivering the power of a massive AI supercomputer with the simplicity of programming a single GPU.
A New Research Tool for AI Pioneers
Google Cloud, Meta and Microsoft are among the first expected to gain access to the DGX GH200 to explore its capabilities for generative AI workloads. NVIDIA also intends to provide the DGX GH200 design as a blueprint to cloud service providers and other hyperscalers so they can further customize it for their infrastructure.
New NVIDIA Helios Supercomputer to Advance Research and Development
NVIDIA is building its own DGX GH200-based AI supercomputer to power the work of its researchers and development teams.
Named NVIDIA Helios, the supercomputer will feature four DGX GH200 systems. Each will be interconnected with NVIDIA Quantum-2 InfiniBand networking to supercharge data throughput for training large AI models. Helios will include 1,024 Grace Hopper Superchips and is expected to come online by the end of the year.
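The Helios superchip count follows directly from the per-system figure given earlier in this release. A minimal check, with hypothetical names chosen for illustration:

```python
# Figures quoted in the release.
SUPERCHIPS_PER_DGX_GH200 = 256
HELIOS_SYSTEMS = 4


def total_superchips(systems: int, per_system: int) -> int:
    """Total superchips across several identical DGX GH200 systems."""
    return systems * per_system


helios_total = total_superchips(HELIOS_SYSTEMS, SUPERCHIPS_PER_DGX_GH200)
print(helios_total)
```

Four systems of 256 superchips each give the 1,024 superchips stated for Helios.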
Fully Integrated and Purpose-Built for Giant Models
DGX GH200 supercomputers include NVIDIA software to provide a turnkey, full-stack solution for the largest AI and data analytics workloads. NVIDIA Base Command™ software provides AI workflow management, enterprise-grade cluster management, libraries that accelerate compute, storage and network infrastructure, and system software optimized for running AI workloads.
Also included is NVIDIA AI Enterprise, the software layer of the NVIDIA AI platform. It provides over 100 frameworks, pre-trained models and development tools to streamline the development and deployment of production AI including generative AI, computer vision, speech AI and more.
Availability
NVIDIA DGX GH200 supercomputers are expected to be available by the end of the year.