AMD’s new AI accelerator: the Instinct MI325X draws 1000 W and has more memory than Nvidia’s Blackwell

A week ago, AMD released the new Epyc 9005 server processors with the Zen 5 architecture; the launch also included models with up to 192 cores based on the Zen 5c cores, which this time are part of the same “Turin” processor family. During the same data-center-focused event, the company also introduced a new artificial-intelligence accelerator: the GPU Instinct MI325X. It was announced in the summer and now officially goes on sale, but with different parameters than originally planned.

AMD Instinct MI325X

The Instinct MI325X is essentially a refresh of the Instinct MI300X accelerator released a year ago, which finally managed to capture a part of the AI market (it has already brought AMD a few billion dollars). It is a compute GPU with the CDNA 3 architecture and an advanced chiplet construction: a 6nm base die with 5nm compute chiplets placed on top of it.

The Instinct MI325X model contains 304 CUs (19,456 stream processors) with 256 MB of Infinity Cache. AMD states a frequency of 2100 MHz for it; since this is the “peak engine clock”, it is probably best understood as the boost frequency. The GPU achieves 81.7 TFLOPS in double-precision (FP64) calculations for scientific computing. In AI acceleration, it reaches up to 1.305 PFLOPS in FP16 calculations and 2.61 PFLOPS in 8-bit precision (FP8, INT8). With the sparsity function, performance doubles (that is, up to 5.22 PFLOPS in INT8/FP8).
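As a quick sanity check, the quoted throughput figures follow from the shader count and clock. This is a minimal sketch; the 2 FLOPs per clock per stream processor for FP64 and the 16× FP16 ratio are assumptions inferred from the numbers above, not from AMD documentation:

```python
# Sanity-check of the quoted MI325X throughput numbers.
STREAM_PROCESSORS = 304 * 64   # 304 CUs x 64 SPs = 19,456
BOOST_CLOCK_HZ = 2.1e9         # 2100 MHz "peak engine clock"

# Assumed: 2 FLOPs/clock/SP for FP64 (fused multiply-add).
fp64_tflops = STREAM_PROCESSORS * BOOST_CLOCK_HZ * 2 / 1e12
fp16_pflops = fp64_tflops * 16 / 1000  # FP16 quoted at 16x the FP64 rate
fp8_pflops = fp16_pflops * 2           # FP8/INT8 doubles the FP16 rate

print(f"FP64: {fp64_tflops:.1f} TFLOPS")  # ~81.7
print(f"FP16: {fp16_pflops:.3f} PFLOPS")  # ~1.307
print(f"FP8/INT8: {fp8_pflops:.2f} PFLOPS (doubled again with sparsity)")
```

The small rounding differences against AMD’s published 1.305/2.61 PFLOPS come from AMD rounding the base clock figures.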

These specifications are the same as the MI300X’s. The new MI325X differs in memory. While the MI300X has 192 GB of HBM3 memory clocked at 5.2 GHz (effective), the Instinct MI325X is the first AMD product to use HBM3E memory, which runs at 6 GHz (effective). Since the accelerator has an 8192-bit bus, this raises memory throughput from 5.3 TB/s to 6 TB/s.
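The bandwidth figures follow directly from bus width times effective data rate. A minimal sketch, assuming the 8192-bit bus is composed of eight HBM stacks with 1024-bit interfaces each (the usual arrangement):

```python
# Memory bandwidth from bus width x effective data rate.
BUS_WIDTH_BITS = 8 * 1024  # eight 1024-bit HBM stacks = 8192-bit bus

def bandwidth_tb_s(effective_ghz: float) -> float:
    # bits/s -> bytes/s -> TB/s (1 TB = 1e12 bytes, as vendors quote it)
    return BUS_WIDTH_BITS * effective_ghz * 1e9 / 8 / 1e12

print(bandwidth_tb_s(5.2))  # MI300X with HBM3:  ~5.3 TB/s
print(bandwidth_tb_s(6.0))  # MI325X with HBM3E: ~6.1 TB/s (quoted as 6 TB/s)
```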

But the important thing is that with the HBM3E, memory capacity also increases. AMD originally announced that it would use HBM3E with a capacity of 36 GB per package (these should be 12-layer “stacks” using 24Gb DRAM dies), which would give the accelerator 288 GB of memory. Allegedly there was a problem with the availability of this memory, so it is ultimately not used in the Instinct MI325X. Instead, HBM3E with a capacity of 32 GB per package is used – perhaps 16-layer stacks with 16Gb dies.
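Both capacity variants can be checked with simple stack arithmetic (a sketch assuming eight HBM stacks per GPU, as the 8192-bit bus implies):

```python
# Per-stack capacity: layers x die density (Gbit) -> GB; eight stacks per GPU.
def stack_gb(layers: int, die_gbit: int) -> int:
    return layers * die_gbit // 8  # 8 bits per byte

planned = 8 * stack_gb(12, 24)  # 12-high stacks of 24Gb dies -> 288 GB total
shipped = 8 * stack_gb(16, 16)  # 16-high stacks of 16Gb dies -> 256 GB total
print(planned, shipped)  # 288 256
```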

Thanks to this, the Instinct MI325X ultimately provides 256 GB of HBM3E memory. It therefore has far more memory than the latest version of Nvidia’s Hopper-generation H200 accelerator (the refresh with HBM3E has a capacity of 141 GB). Interestingly, it even exceeds the 192 GB of HBM3E that Nvidia lists for the B200 accelerator of the new Blackwell generation. In some AI applications, capacity can be the biggest limitation, because it determines how large a model can be worked with, and progress in AI, especially in so-called large language models, is tied precisely to the number of parameters, and therefore the data “size” of the model. At least in theory, this could keep the MI325X competitive even after Blackwell’s release, although Blackwell will probably have higher raw performance. However, Nvidia will likely release another Blackwell refresh with increased memory capacity.

The entire accelerator is manufactured in the Open Compute Accelerator Module (OAM) mezzanine design, although it uses a PCIe 5.0 ×16 connection. With a TDP of up to 1000 W, the module draws too much power for a PCI Express card. At the same time, it provides coherent Infinity Fabric Link lines to connect with other GPUs and CPUs in the system. The GPU has eight of these links, each of which is supposed to offer a throughput of 128 GB/s.
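Taken together, the quoted per-link figure implies the following aggregate fabric bandwidth (a trivial sketch under the assumption that all eight links can be used simultaneously):

```python
# Aggregate coherent Infinity Fabric bandwidth per MI325X GPU,
# assuming all eight links run at the quoted 128 GB/s each.
NUM_LINKS = 8
LINK_BW_GB_S = 128

aggregate_gb_s = NUM_LINKS * LINK_BW_GB_S
print(aggregate_gb_s)  # 1024 GB/s of GPU-to-GPU and GPU-to-CPU connectivity
```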

8-GPU instances with AMD Instinct MI325X accelerators in the OAM design

Source: AMD

For the original version of the MI300X, AMD quoted a peak consumption of 750 W, and it seems unlikely that the difference up to the 1000 W of the MI325X is consumed entirely by the HBM3E memory. It is therefore possible that even with the same specified maximum boost frequency, raw performance will also increase slightly, because the GPU will be able to sustain higher clocks under full load. The website ServeTheHome also mentions better frequencies, so this hidden upgrade apparently really happened. Together with the higher memory bandwidth, this promises better performance than the MI300X.

AMD says that the Instinct MI325X will start shipping commercially in the current quarter (i.e., by the end of the year) and should be widely available in Q1 2025. It is expected to be offered by server manufacturers HP, Supermicro, Gigabyte, Dell, Lenovo and Eviden (and reportedly more). Along with these accelerators, AMD should also start selling the “AI” smartNIC Pollara 400 and the DPU Salina, products of the Pensando division – originally a separate company that AMD bought in 2022.

Instinct MI355X

Source: AMD, via VideoCardz

Instinct MI355X next year with CDNA 4

AMD also said that the next generation of its AI accelerators should arrive in about a year, in the second half of 2025: the Instinct MI355X (in the summer, AMD originally listed the MI350X designation, so there has been a change here as well). The Instinct MI355X will get the 36 GB HBM3E stacks that the MI325X was originally supposed to have, increasing memory capacity to 288 GB. The MI355X will be a genuinely new generation, not just a refresh, because it is supposed to use a new architecture labeled CDNA 4.

Roadmap of AMD Instinct AI accelerators and CDNA architectures

Source: AMD, via VideoCardz

The next step should be a completely new generation, the MI400X, which will also bring a new architecture. So far, AMD calls it CDNA Next, but it may eventually be labeled UDNA if the architectures are unified with the Radeon gaming GPU line, as the company recently announced. This product should be on the market, or at least officially revealed, in 2026.

Sources: AMD (1, 2), ServeTheHome, VideoCardz

Source: www.cnews.cz