NVIDIA Blackwell Sweeps MLPerf Training 6.0

NVIDIA announced that its Blackwell platform achieved the best results on all seven MLPerf Training 6.0 benchmarks, cementing its dominance in large-model training. GB300 NVL72 delivers up to 1.6× faster training than GB200 NVL72, and the largest submission used 8,192 Blackwell GPUs on the DeepSeek-V3 model with 671 billion parameters. CoreWeave trained DeepSeek-V3 671B in 2.02 minutes on 8,192 GPUs, while Microsoft Azure completed Llama 3.1 405B in 7.07 minutes.

What did NVIDIA achieve in MLPerf Training 6.0?

NVIDIA is the only platform with submissions on all seven benchmarks, including two new MoE (Mixture of Experts) pre-training tasks. MoE is an architecture in which only a fraction of the parameters is activated for each token. The new-generation GB300 NVL72 delivers up to 1.6× faster training than the previous-generation GB200 NVL72. MLPerf Training is an industry-standard benchmark suite that measures the time required to train a model to a target accuracy.
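
As a rough illustration of the MoE idea described above, the following Python sketch routes a single token to its highest-scoring experts. The expert count, the `top_k` value, the router scores, and the function names are all hypothetical choices made for illustration; they are not taken from DeepSeek-V3 or from any MLPerf submission.

```python
import math


def softmax(scores):
    """Convert raw router scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, top_k):
    """Pick the top_k experts for one token.

    Returns (expert_index, weight) pairs, highest weight first.
    Weights are renormalized over the chosen experts so they sum to 1.
    """
    probs = softmax(router_logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def active_fraction(num_experts, top_k):
    """Share of expert blocks used per token.

    Simplification: ignores parameters that every token uses,
    such as attention layers or shared experts.
    """
    return top_k / num_experts


if __name__ == "__main__":
    # Hypothetical router scores for one token across 4 experts.
    logits = [0.1, 2.0, -1.0, 1.5]
    for expert, weight in route_token(logits, top_k=2):
        print(f"expert {expert}: weight {weight:.3f}")
    # Hypothetical layer with 8 experts, 2 active per token.
    print(f"active expert fraction: {active_fraction(8, 2):.2f}")
```

Only the selected experts run for a given token, which is why an MoE model's compute per token can be far smaller than its total parameter count suggests.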

What were the results at the largest scale?

The largest submission used 8,192 Blackwell GPUs on the DeepSeek-V3 model with 671 billion parameters. At that scale, CoreWeave trained DeepSeek-V3 671B in 2.02 minutes, while Microsoft Azure completed Llama 3.1 405B in 7.07 minutes. The results demonstrate how dramatically training times for frontier models have shrunk on massive GPU clusters.

Why do these results matter?

MLPerf results serve as a neutral reference for comparing AI hardware, so the announcement influences procurement decisions in data centers. A sweep of all seven tests, including the new MoE tasks, signals that NVIDIA is maintaining its edge precisely on the architectures powering the latest frontier models.

Frequently Asked Questions

What did NVIDIA achieve in MLPerf Training 6.0?

Its Blackwell platform posted the best results on all seven benchmarks, and GB300 NVL72 delivers up to 1.6× faster training than GB200 NVL72.

How many GPUs did the largest submission use?

The largest submission used 8,192 Blackwell GPUs on the DeepSeek-V3 model with 671 billion parameters.

What is MLPerf Training?

An industry-standard benchmark suite that measures the time required to train AI models to a target accuracy across a set of standardized tasks.
