TL;DR
- 800 Gb/s Ethernet was standardised by IEEE 802.3df (ratified 2024), which defines 800GBASE-R PHYs built on eight lanes of 100 Gb/s PAM4; four-lane 200 Gb/s PAM4 variants fall under the follow-on IEEE P802.3dj project.
- Implemented by Broadcom Tomahawk 5, NVIDIA Spectrum-4 (the ASIC in the Spectrum-X SN5600 switch), Cisco Silicon One G200, and Marvell Teralynx 10.
- Pairs with RoCEv2 and PFC/ECN/DCQCN to provide near-lossless behaviour suitable for GPU-cluster fabrics.
- Connectors: OSFP and QSFP-DD800; transceivers include 800G-DR8, 800G-FR4, 800G-2xFR4, and linear pluggable optics (LPO).
Overview
800G Ethernet is standardised in IEEE 802.3df, an amendment to IEEE 802.3-2022 ratified in 2024. The amendment defines several PHYs under the 800GBASE-R umbrella, all built on 8 × 100 Gb/s PAM4 lanes, which matches current 100G SerDes generations. The 4 × 200 Gb/s PAM4 variants, the path forward as 200G SerDes ships in volume, are being specified in the follow-on IEEE P802.3dj project.
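The lane arithmetic behind these PHYs can be sketched in a few lines. This is an illustrative calculation, not text from the standard: it assumes the usual 256b/257b transcoding plus RS(544,514) FEC overhead, and the function name `lane_signaling` is made up for this example.

```python
from fractions import Fraction

# Assumed encoding overheads for 100G-per-lane 800GBASE-R PHYs:
# 256b/257b transcoding followed by RS(544,514) FEC.
TRANSCODE = Fraction(257, 256)
FEC = Fraction(544, 514)
PAM4_BITS_PER_SYMBOL = 2  # four amplitude levels carry two bits per symbol


def lane_signaling(mac_rate_gbps: int, lanes: int) -> tuple[Fraction, Fraction]:
    """Return (per-lane bit rate in Gb/s, per-lane symbol rate in GBd)."""
    gross = mac_rate_gbps * TRANSCODE * FEC  # aggregate rate on the wire
    per_lane = gross / lanes
    return per_lane, per_lane / PAM4_BITS_PER_SYMBOL


bit_rate, baud = lane_signaling(800, lanes=8)
print(float(bit_rate), float(baud))  # 106.25 53.125
```

Calling `lane_signaling(800, lanes=4)` under the same overhead assumption gives 212.5 Gb/s per lane; PHYs that add further coding on top would signal faster than this sketch predicts.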
From an AI-fabric perspective, 800G Ethernet matters because it is the per-port bandwidth at which lossless Ethernet credibly competes with InfiniBand for training fabrics. Below 400G, Ethernet was the access fabric and InfiniBand was the training fabric; at 800G, the choice is genuinely a trade-off.
Specifications
| Property | Value |
|---|---|
| Standard | IEEE 802.3df (2024) |
| Per-port line rate | 800 Gb/s |
| PHY options | 800GBASE-R: 8×100G PAM4 (802.3df); 4×200G PAM4 (P802.3dj) |
| FEC | RS(544,514) — KP4 |
| Connectors | OSFP, QSFP-DD800 |
| Typical reach | DR8: 500 m; FR4: 2 km; LR4: 10 km |
| Switch ASICs | Tomahawk 5, Spectrum-4 (SN5600), Silicon One G200, Teralynx 10 |
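As a rough illustration of what the FEC row implies, the sketch below derives the basic properties of an RS(544,514) code. The 10-bit symbol size is an assumption based on how KP4 is commonly described, and `rs_properties` is a hypothetical helper name.

```python
# Parameters of the RS(n, k) code named in the table above.
N_SYMBOLS = 544
K_SYMBOLS = 514
SYMBOL_BITS = 10  # assumption: KP4 operates on 10-bit symbols


def rs_properties(n: int, k: int, symbol_bits: int) -> dict:
    """Derive basic Reed-Solomon code properties from (n, k)."""
    parity = n - k
    return {
        "parity_symbols": parity,
        # A Reed-Solomon code corrects up to floor((n - k) / 2) symbol errors.
        "correctable_symbols": parity // 2,
        "codeword_bits": n * symbol_bits,
        "payload_bits": k * symbol_bits,
        "overhead_pct": round(100 * parity / k, 2),
    }


print(rs_properties(N_SYMBOLS, K_SYMBOLS, SYMBOL_BITS))
```

Under these assumptions each codeword can correct up to 15 symbol errors, at a cost of just under 6% extra bits on the wire.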
Loss Versus Lossless
Ethernet is fundamentally a lossy transport: packets may be dropped when a switch buffer overflows. For RDMA over Ethernet (RoCEv2) to work at scale, the network must approximate losslessness. Two mechanisms provide this. Priority Flow Control (PFC) pauses traffic on a per-priority basis when buffers fill. Explicit Congestion Notification (ECN) lets switches mark packets as queues grow; under DCQCN, the receiver turns those marks into congestion notification packets (CNPs) and the sender reduces its rate, ideally before PFC has to trigger.
Tuning these mechanisms is the operational tax that Ethernet pays versus InfiniBand. The reward is a familiar tool chain, lower per-port cost, and broader vendor choice.
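To make the tuning surface concrete, here is a deliberately simplified sketch of the ECN/DCQCN loop: a RED-style marking curve on the switch and a sender that cuts its rate when a CNP arrives. The class and parameter names, the thresholds, and the gain `g` are illustrative assumptions; real NICs use separate timers and byte counters for recovery, which this sketch collapses into a single step.

```python
from dataclasses import dataclass


def ecn_mark_probability(queue_kb: float, kmin_kb: float,
                         kmax_kb: float, pmax: float) -> float:
    """Switch side: marking probability rises linearly between Kmin and Kmax."""
    if queue_kb <= kmin_kb:
        return 0.0
    if queue_kb > kmax_kb:
        return 1.0
    return pmax * (queue_kb - kmin_kb) / (kmax_kb - kmin_kb)


@dataclass
class DcqcnSender:
    rate_gbps: float      # current sending rate
    target_gbps: float    # rate to recover toward
    alpha: float = 1.0    # running estimate of congestion severity
    g: float = 1 / 256    # averaging gain (assumed value)

    def on_cnp(self) -> None:
        """A CNP arrived: remember the old rate, cut, and raise alpha."""
        self.target_gbps = self.rate_gbps
        self.rate_gbps *= 1 - self.alpha / 2
        self.alpha = (1 - self.g) * self.alpha + self.g

    def on_quiet_period(self) -> None:
        """No CNP for one period: decay alpha, recover halfway to target."""
        self.alpha *= 1 - self.g
        self.rate_gbps = (self.rate_gbps + self.target_gbps) / 2


sender = DcqcnSender(rate_gbps=800.0, target_gbps=800.0)
sender.on_cnp()
sender.on_quiet_period()
print(sender.rate_gbps)
```

The knobs an operator actually tunes map onto `kmin_kb`, `kmax_kb`, and `pmax` on the switch and the gain and timer settings on the NIC; setting the marking thresholds too high lets PFC fire first, while setting them too low sacrifices throughput.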
Operational Notes
- LPO (linear pluggable optics) cut transceiver power roughly in half versus DSP-based optics — but tighten link-budget tolerances and may not be available across all reaches.
- Buffer sizing matters: deep-buffer switches (Jericho-class) absorb more incast, while shallow-buffer switches (Tomahawk-class) require tight ECN tuning.
- PFC pause storms remain the most common Ethernet AI-fabric incident — instrument PFC pause counts per port and per priority in production.
- Layer-2-only AI pods avoid IP/BGP complexity; larger fabrics use EVPN-VXLAN with multi-stage fat trees.
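The PFC instrumentation advice above can be sketched as a small counter-delta check. Everything here is hypothetical: the port names, the priority value, the threshold, and the assumption that cumulative pause-frame counters have already been collected per (port, priority) by whatever telemetry the switch exposes.

```python
# (port, priority) -> cumulative count of PFC pause frames seen so far
PauseCounters = dict[tuple[str, int], int]


def pause_rates(prev: PauseCounters, curr: PauseCounters,
                interval_s: float) -> dict[tuple[str, int], float]:
    """Convert two counter snapshots into pause frames per second."""
    rates = {}
    for key, now in curr.items():
        delta = now - prev.get(key, now)  # unseen keys contribute zero
        if delta < 0:                     # counter was reset between polls
            delta = now
        rates[key] = delta / interval_s
    return rates


def suspected_storms(rates: dict[tuple[str, int], float],
                     threshold_fps: float) -> list[tuple[str, int]]:
    """Return (port, priority) pairs whose pause rate exceeds the threshold."""
    return sorted(key for key, rate in rates.items() if rate > threshold_fps)


prev = {("Ethernet1", 3): 100, ("Ethernet2", 3): 50}
curr = {("Ethernet1", 3): 5100, ("Ethernet2", 3): 60}
rates = pause_rates(prev, curr, interval_s=10.0)
print(suspected_storms(rates, threshold_fps=100.0))  # [('Ethernet1', 3)]
```

A sensible threshold depends on the fabric; the useful part is keeping the rate keyed by both port and priority so that a storm on the RDMA class is not averaged away by quiet classes.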
References
- IEEE 802.3df-2024 Standard · IEEE
- Broadcom Tomahawk 5 Product Brief · Broadcom
- NVIDIA Spectrum-X Platform · NVIDIA