AMD Tech Day 2024: Ryzen AI, Zen 5, XDNA 2 and RDNA 3.5

According to AMD, the Zen 5 architecture offers on average 16% better IPC than Zen 4, while XDNA 2 offers up to five times the performance of the previous generation.

AMD has held its Tech Day 2024 event and shared more information about its upcoming Zen 5 processors. We have already covered the Ryzen 9000 series rumors earlier; in this article we focus on the innovations of the Zen 5, XDNA 2 and RDNA 3.5 architectures and take a look at what artificial intelligence brings to the Ryzen AI series in practice.

According to AMD, the Zen 5 architecture offers on average 16% better IPC than the Zen 4 architecture. The keys to the added performance include improved branch prediction and dual instruction decode units. Instruction cache latency has also been reduced and its bandwidth increased. On the execution side, the dispatch stage that sends instructions for execution is now eight wide (previously six), and there are six ALUs and three multipliers. The L1 data cache has grown from 32 KB to 48 KB, and the bandwidth between the FPU and the L1 data cache has been up to doubled. Perhaps the most significant change is on the FPU side, where AVX-512 is now supported natively with a full 512-bit floating-point datapath, whereas Zen 4 still used 256-bit FPUs. Although the average IPC gain is 16%, improvements of up to 32% can be expected in single-core machine learning tasks and up to 35% in single-core AES-XTS calculations.
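To put the wider FPU in concrete terms, below is a minimal C sketch (not from AMD's materials) of an AVX-512 fused multiply-add loop that handles 16 single-precision floats per instruction. On Zen 4 such 512-bit operations are executed by splitting them internally across the 256-bit units, while Zen 5's full-width datapath can run them natively.

```c
/* Minimal AVX-512 sketch: one fused multiply-add over 16 floats per
 * instruction. Illustrative only; build with e.g. gcc -O2 -mavx512f.
 */
#include <immintrin.h>
#include <stddef.h>
#include <stdio.h>

void fma_arrays(const float *a, const float *b, float *c, size_t n)
{
    /* Process 16 floats (512 bits) per iteration; n is assumed to be
     * a multiple of 16 to keep the example short. */
    for (size_t i = 0; i < n; i += 16) {
        __m512 va = _mm512_loadu_ps(a + i);
        __m512 vb = _mm512_loadu_ps(b + i);
        __m512 vc = _mm512_loadu_ps(c + i);
        vc = _mm512_fmadd_ps(va, vb, vc);   /* c = a * b + c */
        _mm512_storeu_ps(c + i, vc);
    }
}

int main(void)
{
    float a[16], b[16], c[16];
    for (int i = 0; i < 16; i++) { a[i] = (float)i; b[i] = 2.0f; c[i] = 1.0f; }
    fma_arrays(a, b, c, 16);
    printf("c[5] = %.1f\n", c[5]);  /* 5 * 2 + 1 = 11.0 */
    return 0;
}
```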

The XDNA 2 architecture's new NPU AI accelerator is integrated into the Strix Point laptop processors. XDNA is based on technologies obtained from Xilinx, which have been developed further after the acquisition. The second-generation XDNA architecture is naturally even faster and also expands the supported data formats with the new Block FP16, which promises close to INT8/FP8-level performance with FP16-level accuracy. Intel's upcoming Lunar Lake will also support the same precision. The architecture's compute units, the AI Engines, are now equipped with two MAC units instead of one and have 60% more local memory than before. XDNA 2's 25 AIE cores are interconnected, and the traffic between them is controlled by programmable gateways, which is promised to reduce the need for memory bandwidth and allow certain resources to be dedicated to specific AIE cores. The local memory of the AIE cores can be managed programmatically, and they are promised to be free of cache misses. The architecture updates allow the NPU to be configured for different needs while guaranteeing a certain level of real-time performance. Overall, XDNA 2 offers five times the performance and twice the energy efficiency of the first-generation XDNA NPU accelerator.
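AMD did not detail the exact Block FP16 encoding, but the general idea of block floating point can be illustrated with a small, purely hypothetical C sketch: all values in a block share a single exponent and each element keeps only a short integer mantissa, which lets the MAC hardware run at close to integer cost while preserving much of the dynamic range. The block size and mantissa width below are arbitrary choices for illustration, not AMD's format.

```c
/* Generic block floating-point sketch (illustrative only; not AMD's
 * actual Block FP16 encoding). All values in a block share a single
 * exponent, and each element stores only a small signed mantissa.
 * Build with e.g. gcc -O2 blockfp.c -lm
 */
#include <math.h>
#include <stdint.h>
#include <stdio.h>

#define BLOCK 8          /* hypothetical block size */
#define MANT_BITS 8      /* hypothetical per-element mantissa width */

typedef struct {
    int8_t mant[BLOCK];  /* per-element signed mantissas */
    int    exp;          /* one shared exponent for the whole block */
} bfp_block_t;

/* Quantize a block of floats to the shared-exponent format. */
static bfp_block_t bfp_encode(const float *x)
{
    bfp_block_t b;
    float max_abs = 0.0f;
    for (int i = 0; i < BLOCK; i++)
        if (fabsf(x[i]) > max_abs) max_abs = fabsf(x[i]);

    /* Pick the exponent so the largest value fits in MANT_BITS-1 magnitude bits. */
    int e = 0;
    frexpf(max_abs, &e);              /* max_abs = m * 2^e, 0.5 <= m < 1 */
    b.exp = e - (MANT_BITS - 1);
    for (int i = 0; i < BLOCK; i++) {
        long q = lrintf(ldexpf(x[i], -b.exp));
        if (q > 127)  q = 127;        /* clamp to the int8 range */
        if (q < -128) q = -128;
        b.mant[i] = (int8_t)q;
    }
    return b;
}

/* Decode element i back to float: mantissa * 2^shared_exponent. */
static float bfp_decode(const bfp_block_t *b, int i)
{
    return ldexpf((float)b->mant[i], b->exp);
}

int main(void)
{
    const float x[BLOCK] = { 0.11f, -0.52f, 3.75f, 0.006f, 1.2f, -2.4f, 0.3f, 0.9f };
    bfp_block_t b = bfp_encode(x);
    for (int i = 0; i < BLOCK; i++)
        printf("%8.4f -> %8.4f\n", x[i], bfp_decode(&b, i));
    return 0;
}
```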

The integrated GPU has also been updated and is now based on the new RDNA 3.5 architecture. The Strix Point iGPU used in the Ryzen AI series has grown by 33% to 16 Compute Units (up from 12), but performance has also been gained elsewhere. At the architecture level, the new RDNA version focuses especially on improving energy efficiency. This has been achieved by optimizing the most common operations: the texture units can run certain frequently used operations at double rate, interpolation and comparison operations have been doubled in speed, and memory management has been improved. Memory accesses should now occur less often, and better compression methods are used together with optimizations made specifically for LPDDR5 memory. In practical tests, according to AMD's own figures, Strix Point offers 32% better performance in 3DMark Time Spy and 19% better in Night Raid than Hawk Point, i.e. the Ryzen 8040 series, at the same 15-watt TDP.

AMD’s Ryzen AI laptop processors meet and exceed the 40 TOPS limit Microsoft has set for NPU accelerator performance. In addition to the built-in artificial intelligence functions of Windows, AMD is working closely with device manufacturers to bring users more artificial intelligence experiences. With Acer LiveArt, the user can take a picture of a certain pose and generate new images in which the character repeats the same pose. Asus’ StoryCube, in turn, is said to help the user manage files, while HP’s AI Companion promises to optimize the device’s features in the name of better productivity. AMUSE 2.0, for its part, offers Stable Diffusion-based image generation and accepts both text and a drawn sketch of the desired image as input.
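Returning to the 40 TOPS figure mentioned above, the sketch below shows the usual back-of-the-envelope way such ratings are derived: MAC units per cycle times two operations per MAC (multiply and add) times clock frequency. The MAC count and clock here are hypothetical round numbers chosen to land exactly on Microsoft's threshold, not AMD's published specifications.

```c
/* Back-of-the-envelope TOPS estimate for an NPU (illustrative only;
 * the MAC count and clock below are hypothetical, not AMD's figures).
 */
#include <stdio.h>

int main(void)
{
    double macs_per_cycle = 10000.0;   /* hypothetical total MAC units */
    double ops_per_mac    = 2.0;       /* multiply + accumulate */
    double clock_hz       = 2.0e9;     /* hypothetical 2.0 GHz NPU clock */

    double tops = macs_per_cycle * ops_per_mac * clock_hz / 1e12;
    printf("Estimated throughput: %.1f TOPS\n", tops);  /* 40.0 TOPS */
    return 0;
}
```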

Source: AMD press materials

Source: www.io-tech.fi