Intel Data Center GPU Max Series (Ponte Vecchio)

TL;DR

Intel's Xe-HPC data centre GPU, launched in January 2023 and built from 47 chiplets across multiple process nodes.
Powered the Aurora supercomputer at Argonne National Laboratory.
128 GB HBM2e on the flagship Max 1550 model with ~3.2 TB/s bandwidth.
Commercial reach was limited; in 2024 Intel announced that the Max and Gaudi roadmaps would be unified under Falcon Shores.

Overview

Intel Data Center GPU Max, code-named Ponte Vecchio, was Intel's first data centre GPU aimed at HPC and AI training. Launched in January 2023, it pushed chiplet integration further than any contemporary accelerator: 47 tiles spanning the Intel 7, TSMC N7 and TSMC N5 processes, integrated with Foveros 3D stacking and EMIB bridges.

The headline deployment is Aurora, the exascale-class supercomputer at Argonne National Laboratory. Commercial reach beyond HPC was limited, and during 2024 Intel announced that the Max and Gaudi product lines would be unified under a future Falcon Shores design.

Specifications

Metric	Max 1550 (OAM)
Architecture	Xe-HPC (Ponte Vecchio)
Tiles / chiplets	47
FP64 (vector)	52 TFLOPS
FP32	52 TFLOPS
BF16 / FP16 (matrix)	839 TFLOPS
INT8 (matrix)	1,678 TOPS
Memory	128 GB HBM2e
Memory bandwidth	~3.2 TB/s
TDP	600 W
Form factor	OAM
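
As a rough illustration of what these figures imply, the ratio of peak compute to memory bandwidth gives the roofline ridge point: the arithmetic intensity a kernel needs before it stops being bandwidth-bound. The sketch below uses only peak values from the table and assumes "~3.2 TB/s" means 3.2e12 bytes per second. The function and variable names are illustrative, and sustained rates on real hardware will sit below these peaks.

```python
# Peak figures copied from the specification table (Max 1550, OAM).
PEAK_FLOPS = {
    "fp64": 52e12,
    "fp32": 52e12,
    "bf16_matrix": 839e12,
}
PEAK_BANDWIDTH = 3.2e12  # bytes/s; assumes "~3.2 TB/s" means 3.2e12


def ridge_point(peak_flops, bandwidth=PEAK_BANDWIDTH):
    """FLOPs per byte at which the compute and memory limits meet."""
    return peak_flops / bandwidth


def attainable_flops(intensity, peak_flops, bandwidth=PEAK_BANDWIDTH):
    """Roofline bound for a kernel with the given FLOPs-per-byte intensity."""
    return min(peak_flops, intensity * bandwidth)


for name, peak in PEAK_FLOPS.items():
    print(f"{name}: ridge at {ridge_point(peak):.2f} FLOP/byte")
```

Under this model a double-precision AXPY, at 2 FLOPs per 24 bytes moved, is bounded at roughly 0.27 TFLOPS regardless of the 52 TFLOPS peak, which is why bandwidth rather than compute decides performance for many traditional simulation kernels.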

Architecture Notes

Ponte Vecchio's chiplet integration was its main engineering achievement: Compute Tiles, Rambo Cache tiles, Base Tiles, HBM stacks and Xe Link tiles were all assembled into a single package. That made the part technically impressive but expensive to build, and the volume Intel ultimately sold did not justify integration costs at this scale.

Programming targets Intel's oneAPI / SYCL stack. OpenMP target offload and OpenCL paths also work but receive less optimisation attention than the SYCL path.

When Max Mattered

HPC workloads — particularly traditional FP64 simulation — at Aurora and similar exascale-class deployments.
Workloads already invested in oneAPI / SYCL portability across Intel CPUs and GPUs.
By 2026, most production AI workloads are better served by NVIDIA, AMD or Gaudi hardware than by Max.

Pitfalls

Limited commercial roadmap — Intel folded Max and Gaudi under Falcon Shores; long-term support trajectory is unclear.
Software ecosystem reach is narrow; most AI frameworks treat Max as a tertiary target.
Chiplet design is genuinely complex to schedule for — naive code maps poorly to tile boundaries.
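
One common way to respect tile boundaries is to give each process a single stack rather than letting one process span the whole package. The sketch below computes such a rank-to-stack mapping. It assumes two stacks per card (the Max 1550 layout as generally documented, not stated in this entry) and the `device.subdevice` form of Level Zero's `ZE_AFFINITY_MASK` variable. The helper names are hypothetical, and the mask syntax should be checked against the driver's device-hierarchy mode before use.

```python
STACKS_PER_CARD = 2  # assumption: two stacks per Max 1550 card


def stack_for_rank(local_rank, stacks_per_card=STACKS_PER_CARD):
    """Map a node-local rank to a (card, stack) pair, filling each card first."""
    if local_rank < 0:
        raise ValueError("local_rank must be non-negative")
    return divmod(local_rank, stacks_per_card)


def affinity_mask(local_rank, stacks_per_card=STACKS_PER_CARD):
    """Build a 'card.stack' string in the device.subdevice mask form."""
    card, stack = stack_for_rank(local_rank, stacks_per_card)
    return f"{card}.{stack}"


def env_for_rank(local_rank):
    """Environment overrides a launcher could set before starting this rank."""
    return {"ZE_AFFINITY_MASK": affinity_mask(local_rank)}


# Example: four ranks on a node with two cards, one rank per stack.
for rank in range(4):
    print(rank, env_for_rank(rank))
```

Pinning one rank per stack keeps each process's memory traffic on its local HBM, at the cost of handling cross-stack communication explicitly (for example through MPI) rather than leaving it to the runtime.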

Software Notes

oneAPI Base Toolkit, Intel oneMKL, Intel Extension for PyTorch and SYCL provide the primary developer paths. Intel maintained Hugging Face Optimum integration for Max through 2024.
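
For the PyTorch path, a script typically has to choose between the GPU and a CPU fallback. The sketch below assumes a PyTorch build that exposes the `xpu` device, either natively or after importing Intel Extension for PyTorch, and falls back to `cpu` otherwise. `pick_device` is a hypothetical helper, not part of either library.

```python
def pick_device():
    """Return "xpu" when an Intel GPU is visible to PyTorch, else "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed, so there is nothing to probe

    try:
        # Some stacks register the xpu backend only after this import.
        import intel_extension_for_pytorch  # noqa: F401
    except Exception:
        pass  # assume a native-xpu PyTorch build, or a CPU-only machine

    xpu = getattr(torch, "xpu", None)
    if xpu is not None and xpu.is_available():
        return "xpu"
    return "cpu"


device = pick_device()
print(f"running on {device}")
```

The returned string can be passed to `tensor.to(device)` or `model.to(device)`, so the same script runs unchanged on machines without a Max card.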

References

Intel Data Center GPU Max Series · Intel
Aurora at Argonne · Argonne National Laboratory

