TL;DR
- Quantum-3 is NVIDIA's XDR-generation InfiniBand platform, doubling per-port bandwidth to 800 Gb/s and including a Quantum-3 InfiniBand Director for very large fabrics.
- Adds SHARPv4 in-network reduction with deeper trees and BF16/FP8 support, plus higher radix to flatten large fat-tree topologies into fewer tiers.
- Reference fabric for GB200 NVL72 and GB300 NVL72 reference architectures, paired with ConnectX-8 SuperNICs.
- Provides ~115 Tb/s of aggregate switching capacity; first appliances shipped in 2024-2025.
Overview
Quantum-3 is the InfiniBand switch generation that arrived with Blackwell. It doubles per-port bandwidth from 400 to 800 Gb/s, extends in-network reduction with SHARPv4, and adds a high-radix Director variant aimed specifically at fabrics that need to span tens of thousands of GPUs in two or three tiers rather than four.
The platform was announced at GTC 2024 and is the reference fabric for the GB200 NVL72 and GB300 NVL72 designs. It does not replace Quantum-2 in the field, where most NDR fabrics remain in production, but it is the default for new XDR-era builds.
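The link between switch radix and tier count can be sketched with the standard non-blocking fat-tree sizing rule: with switches of radix k, a t-tier tree reaches at most 2·(k/2)^t endpoints. The sketch below is illustrative only. The radix values are hypothetical inputs chosen to show the effect, not published Quantum-3 or Director port counts, and the helper names are invented for this example.

```python
def max_endpoints(radix: int, tiers: int) -> int:
    """Upper bound on endpoints in a non-blocking fat tree.

    Assumes every switch has the same radix, each non-top tier splits
    its ports half down and half up, and the top tier points all ports down.
    """
    if radix < 2 or radix % 2:
        raise ValueError("radix must be an even number >= 2")
    if tiers < 1:
        raise ValueError("tiers must be >= 1")
    return 2 * (radix // 2) ** tiers


def tiers_needed(radix: int, endpoints: int) -> int:
    """Smallest tier count whose fat tree can hold `endpoints`."""
    tiers = 1
    while max_endpoints(radix, tiers) < endpoints:
        tiers += 1
    return tiers


def min_radix(endpoints: int, tiers: int) -> int:
    """Smallest even radix that fits `endpoints` in `tiers` tiers."""
    radix = 2
    while max_endpoints(radix, tiers) < endpoints:
        radix += 2
    return radix


if __name__ == "__main__":
    target = 16_000  # the "~16k endpoints" scale; radices below are hypothetical
    for radix in (64, 180):
        print(f"radix {radix}: {tiers_needed(radix, target)} tiers "
              f"for {target} endpoints")
    print("min radix for two tiers:", min_radix(target, 2))
```

Under these assumptions a radix-64 switch needs three tiers for 16,000 endpoints, while any radix of 180 or more fits them in two. That is the reason a higher-radix chassis removes a tier. Real designs also have to account for oversubscription ratios and rail layouts, which this bound ignores.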
Specifications
| Property | Value |
|---|---|
| Generation | XDR (eXtreme Data Rate) |
| Per-port line rate | 800 Gb/s |
| In-network reduction | SHARPv4 |
| Connector | OSFP-XD |
| Switch form factors | Fixed 1U, modular Director |
| Host adapter | ConnectX-8 SuperNIC |
| Adaptive routing | Yes, with SHIELD self-healing |
| First shipments | 2024-2025 |
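As a quick sanity check, the ~115 Tb/s aggregate figure from the TL;DR can be related to the 800 Gb/s per-port line rate in the table. The sketch below assumes the aggregate counts each port once at line rate; that convention and the function names are assumptions made for illustration, not taken from NVIDIA's datasheets.

```python
def aggregate_tbps(ports: int, port_gbps: float) -> float:
    """Aggregate capacity in Tb/s, counting each port once at line rate."""
    return ports * port_gbps / 1000.0


def implied_ports(aggregate_tbps_value: float, port_gbps: float) -> int:
    """Port count implied by an aggregate figure, rounded to the nearest port."""
    return round(aggregate_tbps_value * 1000.0 / port_gbps)


if __name__ == "__main__":
    ports = implied_ports(115, 800)  # ~115 Tb/s and 800 Gb/s from this page
    print("implied ports:", ports)
    print("aggregate at that port count (Tb/s):", aggregate_tbps(ports, 800))
```

Under that assumption the two numbers are consistent with a switch of roughly 144 ports at 800 Gb/s (144 × 0.8 Tb/s = 115.2 Tb/s).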
Operational Notes
- A Director chassis can collapse a three-tier fat tree into two tiers for clusters under ~16k endpoints, giving fewer cables, lower latency, and a simpler cabling plan.
- Linear pluggable optics (LPO) are commonly used to reduce power and cost on intra-rack runs.
- UFM 6.x is recommended for telemetry and topology management at XDR scale.
- Quantum-3 interoperates with NDR endpoints, but those links are capped at NDR rates; segregate NDR and XDR endpoints where possible.
Enable SHARPv4 explicitly in NCCL by setting `NCCL_COLLNET_ENABLE=1`, then verify with the `nccl-tests` AllReduce benchmark (`all_reduce_perf`). The speed-up versus host-side reduction is most visible at message sizes above 64 MB.
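A minimal sketch of that check follows. It prepares one environment with CollNet enabled and one baseline with it disabled, then builds an `all_reduce_perf` command that sweeps from 64 MB upward. The helper names are hypothetical, the binary is assumed to be on `PATH`, and the multi-node launcher (for example `mpirun` or a scheduler wrapper) is site-specific and left out. In-network reduction only comes into play across nodes, and it typically also needs the NCCL SHARP plugin installed, which is not shown here.

```python
import os
import shutil
import subprocess


def nccl_env(collnet: bool, base=None) -> dict:
    """Copy of the environment with CollNet toggled and NCCL logging on."""
    env = dict(os.environ if base is None else base)
    env["NCCL_COLLNET_ENABLE"] = "1" if collnet else "0"
    env["NCCL_DEBUG"] = "INFO"  # lets you inspect which algorithm NCCL selects
    return env


def allreduce_cmd(binary="all_reduce_perf", min_bytes="64M",
                  max_bytes="1G", gpus=1) -> list:
    """nccl-tests AllReduce sweep: -b/-e set the size range, -f the step factor."""
    return [binary, "-b", min_bytes, "-e", max_bytes, "-f", "2", "-g", str(gpus)]


if __name__ == "__main__":
    cmd = allreduce_cmd()
    if shutil.which(cmd[0]) is None:
        # Dry run: show what would be executed on a node with nccl-tests built.
        print("all_reduce_perf not found; would run:", " ".join(cmd))
    else:
        for collnet in (False, True):
            print("NCCL_COLLNET_ENABLE =", int(collnet))
            subprocess.run(cmd, env=nccl_env(collnet), check=True)
```

Comparing the bus bandwidth column of the two runs at the larger message sizes gives a rough view of the in-network reduction benefit. The `NCCL_DEBUG=INFO` output can be checked to confirm that a CollNet algorithm was actually selected, though the exact log wording varies by NCCL version.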
References
- NVIDIA Quantum-3 InfiniBand Platform · NVIDIA
- GB200 NVL72 Reference Architecture · NVIDIA
- InfiniBand Architecture Specification · InfiniBand Trade Association