TL;DR
- BlueField-4 is NVIDIA's fourth-generation DPU, doubling network bandwidth to 800 Gb/s and raising the Arm core count from 16 to roughly 64 relative to BlueField-3.
- Targets the Blackwell-era reference architectures (GB200/GB300) where the infrastructure plane must keep up with XDR InfiniBand and 800G Ethernet endpoints.
- Continues the DOCA software stack and extends accelerators for confidential compute, packet processing, and storage offload.
- Serves as the endpoint for Spectrum-X 800G Ethernet and Quantum-3 InfiniBand fabrics; broad availability is expected in 2025-2026.
Overview
BlueField-4 is the next step in NVIDIA's DPU roadmap, designed to match the bandwidth and core demands of the Blackwell generation. The headline numbers — 800 Gb/s of network bandwidth and approximately 64 ARM cores — reflect a recognition that infrastructure work scales with both fabric speed and tenant complexity.
Where BlueField-3 had to choose carefully which infrastructure services could fit on its 16 Arm cores, BlueField-4 can host substantially more concurrent work: storage initiators, observability sidecars, security inspectors, and tenant-isolation enforcement can all run simultaneously with far less contention for cycles.
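The core-budget trade-off described above can be sketched as a simple fit check. This is a minimal illustration, not a sizing tool: the function name `cores_short`, the service names, and every per-service core demand are hypothetical placeholders, and real demands depend on workload and DOCA configuration.

```python
def cores_short(demands: dict[str, int], total_cores: int, reserved: int = 0) -> int:
    """Return how many cores the requested services exceed the budget by.

    Returns 0 when everything fits. `reserved` covers cores held back for
    the OS and management agents.
    """
    needed = sum(demands.values()) + reserved
    return max(0, needed - total_cores)


if __name__ == "__main__":
    # Hypothetical per-service core demands, for illustration only.
    demands = {
        "storage_initiator": 8,
        "observability_sidecar": 4,
        "security_inspector": 8,
        "tenant_isolation": 4,
    }
    # Compare a smaller core budget with a 64-core budget.
    for total in (16, 64):
        short = cores_short(demands, total, reserved=2)
        status = "fits" if short == 0 else f"short by {short} cores"
        print(f"{total} cores: {status}")
```

With these placeholder demands, the smaller budget forces a choice about which services to drop, while the 64-core budget hosts all four at once.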
Generational Comparison
| Property | BlueField-3 | BlueField-4 |
|---|---|---|
| Network bandwidth | 400 Gb/s | 800 Gb/s |
| Arm cores | 16 × Cortex-A78 | ~64 Arm cores |
| Host PCIe | Gen5 x16 | Gen5/Gen6 (next-gen) |
| Memory | Up to 32 GB DDR5 | Larger DDR5 capacity |
| First volume | 2023 | 2025-2026 |
| Targeted GPU generation | Hopper / H200 | Blackwell / GB200/GB300 |
Architectural Direction
The headroom from doubling network bandwidth and roughly quadrupling the Arm core count is being spent on three classes of workload. First, line-rate cryptography at 800G (IPsec, TLS termination, MACsec) that earlier generations could not sustain without traffic shaping. Second, richer observability: in-line per-flow telemetry without sampling, sustained at line rate. Third, more capable confidential-compute boundaries: fully attested infrastructure pipelines that can be audited independently of host operating systems.
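To make "line rate at 800G" concrete, the following sketch computes the maximum Ethernet frame rate and the time available per frame at 400 Gb/s and 800 Gb/s. It uses only standard Ethernet framing arithmetic (20 bytes of preamble, start-of-frame delimiter, and inter-frame gap per frame); the function names are illustrative, and nothing here is taken from BlueField specifications.

```python
# Per-frame wire overhead on Ethernet: 7 B preamble + 1 B start-of-frame
# delimiter + 12 B inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def packets_per_second(link_gbps: float, frame_bytes: int) -> float:
    """Maximum frames per second at line rate for a given frame size.

    frame_bytes is the full Ethernet frame including headers and FCS
    (64 for minimum-size frames, 1518 for a 1500-byte MTU).
    """
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_on_wire


def time_budget_ns(link_gbps: float, frame_bytes: int) -> float:
    """Time available per frame, in nanoseconds, to keep up with line rate."""
    return 1e9 / packets_per_second(link_gbps, frame_bytes)


if __name__ == "__main__":
    for gbps in (400, 800):
        for size in (64, 1518):
            pps = packets_per_second(gbps, size)
            ns = time_budget_ns(gbps, size)
            print(f"{gbps}G, {size} B frames: {pps / 1e6:8.1f} Mpps, {ns:6.2f} ns/frame")
```

At 800 Gb/s, minimum-size frames arrive less than a nanosecond apart, which suggests why per-packet cryptography and telemetry at this rate depend on dedicated accelerators rather than general-purpose cores alone.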
Operational Notes
- DOCA APIs remain source-compatible across BlueField generations; new accelerators surface as opt-in libraries.
- Higher core count enables co-locating storage initiator and observability sidecars without dedicated CPU sets.
- Power: peak draw roughly doubles versus BlueField-3, so server-side power budgets need to be re-validated.
- Reference architectures are in flux during 2025-2026; consult the latest NVIDIA documentation for current configurations.
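The power note above amounts to a headroom check that can be rerun once real figures are known. The sketch below assumes a simple additive model; the `PowerBudget` class, its field names, and every wattage are hypothetical placeholders, not NVIDIA specifications.

```python
from dataclasses import dataclass


@dataclass
class PowerBudget:
    server_limit_w: float   # provisioned power limit for the server
    non_dpu_load_w: float   # peak draw of everything except the DPUs
    dpu_peak_w: float       # current per-DPU peak draw
    dpu_count: int = 1

    def headroom_w(self, dpu_scale: float = 1.0) -> float:
        """Remaining watts after scaling per-DPU peak draw by dpu_scale."""
        dpu_total = self.dpu_peak_w * dpu_scale * self.dpu_count
        return self.server_limit_w - self.non_dpu_load_w - dpu_total


if __name__ == "__main__":
    # All figures are hypothetical placeholders, not NVIDIA specifications.
    budget = PowerBudget(server_limit_w=10_000, non_dpu_load_w=9_600,
                         dpu_peak_w=150, dpu_count=2)
    print(budget.headroom_w())               # headroom with current DPUs
    print(budget.headroom_w(dpu_scale=2.0))  # headroom if DPU peak draw doubles
```

A negative result under the doubled-draw assumption signals that the chassis budget needs to be revisited before the upgrade.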
Specific numbers for BlueField-4 (exact core count, DDR speed, accelerator inventory) continue to evolve. Treat any figure here as approximate and verify against current NVIDIA documentation before sizing a deployment.
References
- NVIDIA BlueField Roadmap (GTC) · NVIDIA
- DOCA SDK Documentation · NVIDIA
- NVIDIA Networking Whitepapers · NVIDIA