According to reports, Nvidia's new Blackwell processors are overheating in servers. As a result, the company has to redesign the racks and adjust the cooling, which will delay server deployments for customers.
Nvidia is addressing overheating issues with its new-generation Blackwell GPUs that are affecting their deployment in servers. According to sources, the processors overheat when running in high-density servers that consume up to 120 kW per rack. This has forced the company to rework the rack design, which has significantly delayed deliveries and raised concerns among technology companies such as Google, Meta, and Microsoft.
The problem occurs when Blackwell GPUs are installed in server racks holding up to 72 processors. Overheating limits the chips' performance and can damage components over the long term. Nvidia is working with suppliers to improve cooling, but these modifications have delayed the planned deployment of the processors.
The delay follows other manufacturing complications at Nvidia. A mismatch in the thermal properties of the materials used caused deformation and forced design changes to the Blackwell chips. The final version of the chips did not go into production until October, and the first deliveries of processors are expected at the end of January. Nvidia's customers rely on Blackwell GPUs to train large language models and run other AI applications, so any delay affects their projects and timelines.
Source: pctuning.cz