NVIDIA’s GB200 NVL72 rack-scale AI system is reaching higher utilization thanks to enhanced topology-aware scheduling in Slurm, the open-source workload manager widely used for large-scale AI and high-performance computing (HPC) clusters.
Built on NVIDIA’s next-generation Blackwell architecture, the GB200 NVL72 is designed to handle the growing demands of massive AI models, including trillion-parameter large language models (LLMs). The system delivers up to 130 terabytes per second of GPU communication bandwidth while supporting both high-speed AI training and real-time inference workloads.
At the core of the platform are 72 NVIDIA Blackwell GPUs and 36 Grace CPUs integrated into a single rack and connected using NVIDIA NVLink technology. NVIDIA says the system can process more than 1.5 million tokens per second for OpenAI GPT-based models, making it one of the most advanced AI computing platforms currently available.
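The figures above can be cross-checked with some quick arithmetic. The sketch below is illustrative only: it assumes the rack is divided into 18 equal compute nodes (the maximum segment size cited later in this article), and the per-GPU bandwidth is simply the quoted aggregate divided evenly across GPUs.

```python
# Rack arithmetic for one GB200 NVL72, using figures quoted in the article.
GPUS_PER_RACK = 72
CPUS_PER_RACK = 36
NODES_PER_RACK = 18          # assumption: 18 equal compute nodes per rack
AGGREGATE_NVLINK_TBPS = 130  # "up to 130 terabytes per second"

gpus_per_node = GPUS_PER_RACK // NODES_PER_RACK
cpus_per_node = CPUS_PER_RACK // NODES_PER_RACK
# Even split of the aggregate figure; real per-link numbers may be rounded.
per_gpu_tbps = AGGREGATE_NVLINK_TBPS / GPUS_PER_RACK

print(f"GPUs per node: {gpus_per_node}")
print(f"Grace CPUs per node: {cpus_per_node}")
print(f"Approx. bandwidth per GPU: {per_gpu_tbps:.2f} TB/s")
```

Under these assumptions each node holds four GPUs and two Grace CPUs, which is why node counts, rather than GPU counts, are the natural unit for the scheduling discussion that follows.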
However, extracting maximum performance from these large-scale systems requires intelligent workload management, especially in shared AI clusters. To address this challenge, NVIDIA partnered with SchedMD to improve Slurm’s topology-aware scheduling capabilities.
Why Advanced Scheduling Matters
AI workloads often operate within massive shared computing environments where multiple jobs compete for limited resources. Without optimized scheduling, tasks may spread inefficiently across GPU communication domains, reducing performance and increasing resource fragmentation.
The new Slurm topology/block plugin addresses this by aligning job placement with the physical interconnect layout of the GB200 NVL72 system. Keeping each job’s GPUs within as few NVLink domains as possible preserves GPU locality, minimizes communication overhead, and improves overall cluster efficiency.
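To make the idea of block-aligned placement concrete, here is a minimal toy allocator. It is not Slurm's implementation: the function names, the node naming scheme, and the first-fit policy are all hypothetical, and each block is assumed to be one 18-node rack.

```python
from typing import Dict, List, Optional


def make_cluster(n_blocks: int, nodes_per_block: int = 18) -> Dict[str, List[str]]:
    """Build a toy cluster: each block stands in for one NVLink-connected rack."""
    return {
        f"block{b}": [f"block{b}-node{n:02d}" for n in range(nodes_per_block)]
        for b in range(n_blocks)
    }


def place_in_one_block(free: Dict[str, List[str]], nodes_needed: int) -> Optional[List[str]]:
    """Take nodes from the first block that can hold the whole job.

    Returns None rather than spreading the job across blocks, which is the
    locality property a block-aware scheduler tries to preserve.
    """
    for block, nodes in free.items():
        if len(nodes) >= nodes_needed:
            chosen = nodes[:nodes_needed]
            free[block] = nodes[nodes_needed:]
            return chosen
    return None


demo_cluster = make_cluster(n_blocks=2)
job_a = place_in_one_block(demo_cluster, 10)   # fits in block0
job_b = place_in_one_block(demo_cluster, 10)   # block0 has 8 left, so block1
job_c = place_in_one_block(demo_cluster, 9)    # no single block has 9 free
print(job_a[0], job_b[0], job_c)
```

The third job is refused even though 16 nodes are free in total, which illustrates the trade-off: strict locality protects communication performance but can leave fragments unused unless smaller jobs fill them.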
According to NVIDIA simulations conducted on a 5,000-node GB200 NVL72 cluster, the updated scheduling system achieved GPU utilization levels within 1% of theoretical maximum capacity while maintaining strong job performance.
The scheduler also intelligently allocates smaller workloads in ways that leave room for larger AI training tasks, helping operators balance cluster efficiency with high-performance computing demands.
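One simple way to get this "leave room for large jobs" behavior is a best-fit rule: put each small job in the block with the fewest free nodes that can still hold it. The sketch below illustrates that rule only. It is an assumption for illustration, not a description of Slurm's actual placement logic, and `best_fit_block` is a hypothetical name.

```python
from typing import Dict, Optional


def best_fit_block(free_nodes: Dict[str, int], nodes_needed: int) -> Optional[str]:
    """Pick the fullest block that still fits the job, and reserve the nodes.

    Packing small jobs into already-fragmented blocks keeps empty blocks
    intact for jobs that need a whole rack.
    """
    candidates = [
        (free, name) for name, free in free_nodes.items() if free >= nodes_needed
    ]
    if not candidates:
        return None
    _, name = min(candidates)  # fewest free nodes wins; ties break by name
    free_nodes[name] -= nodes_needed
    return name


# Two empty 18-node blocks and one partially used block (hypothetical state).
demo_free = {"block0": 18, "block1": 6, "block2": 18}
placements = [best_fit_block(demo_free, 2) for _ in range(3)]
print(placements)
print(demo_free)
```

In this example the three 2-node jobs all land in the partially used block, so both empty blocks remain available for full-rack training jobs.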
Flexible Segment Sizing Improves Efficiency
One of the standout scheduling capabilities enabled by the GB200 NVL72 platform is larger and more flexible segment sizing, where a segment is the group of nodes a job needs to have placed together within one NVLink domain.
Previous systems such as the NVIDIA HGX H100 were limited to single-node segments because their NVLink domain does not extend beyond one node, whereas the GB200 NVL72 can support segments of up to 18 nodes. This flexibility allows operators to match segment size to the communication pattern of each AI workload.
For example, NVIDIA recommends using 16-node segments for bandwidth-intensive tasks like mixture-of-experts (MoE) model training, while smaller jobs can run efficiently within single-node segments.
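The effect of segment size on a rack can be worked through with a short sketch. It assumes 18-node blocks and, for simplicity, that a job's node count is an exact multiple of its segment size. Both function names are hypothetical, and the code does not call Slurm.

```python
from typing import List


def split_into_segments(job_nodes: int, segment_size: int, block_size: int = 18) -> List[int]:
    """Split a job into equal segments, each small enough to fit in one block."""
    if segment_size < 1 or segment_size > block_size:
        raise ValueError("segment size must be between 1 and the block size")
    if job_nodes % segment_size != 0:
        # Simplifying assumption for this sketch, not a statement about Slurm.
        raise ValueError("job size must be a multiple of the segment size")
    return [segment_size] * (job_nodes // segment_size)


def leftover_per_block(segment_size: int, block_size: int = 18) -> int:
    """Nodes left in a block after packing as many whole segments as fit."""
    return block_size % segment_size


moe_segments = split_into_segments(job_nodes=64, segment_size=16)
print(len(moe_segments), "segments of", moe_segments[0], "nodes")
print("spare nodes per block with 16-node segments:", leftover_per_block(16))
print("spare nodes per block with 1-node segments:", leftover_per_block(1))
```

With 16-node segments, a 64-node job occupies four blocks and leaves two nodes free in each, which is capacity that small single-node jobs can use.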
These optimized configurations help prevent scheduler bottlenecks while maintaining high utilization rates even as workload patterns evolve over time.
Rising Demand For AI Infrastructure
Commercial deployments of the GB200 NVL72 began expanding rapidly throughout 2025. Initial pricing for the system ranged between $2.8 million and $3.4 million per rack, but reports suggest fully configured systems now cost as much as $8.8 million due to surging global demand for advanced AI infrastructure.
NVIDIA’s data center business continues to reflect this explosive growth. The company recently reported $39.1 billion in data center revenue during Q1 FY26, highlighting the growing importance of AI and high-performance computing systems like the GB200 NVL72.
Meanwhile, NVIDIA stock (NASDAQ: NVDA) is trading around $221.42, giving the company a market capitalization of approximately $5.40 trillion.
The Future Of Exascale AI Computing
The GB200 NVL72 represents a major leap forward in AI supercomputing, but NVIDIA emphasizes that software optimization is just as important as hardware performance.
Its collaboration with SchedMD demonstrates how intelligent scheduling systems can unlock the full potential of next-generation AI infrastructure by improving resource allocation, reducing inefficiencies, and maximizing compute performance.
As AI models continue growing in complexity and size, systems like the GB200 NVL72 are expected to become foundational to the future of large-scale AI training and inference.
With continued advancements in workload scheduling, GPU interconnect technology, and cluster management, the era of exascale AI computing is rapidly becoming a reality.

