Blackwell runs too hot, Nvidia points to the racks

In certain configurations, Nvidia’s brand-new AI accelerator processors, codenamed Blackwell, are prone to overheating, The Information writes. According to the publication, the problem has caused serious concern among customers who want to put servers with the new chips into operation as soon as possible; commissioning is already delayed because the processors arrived late.

According to the partners’ experience so far, the Blackwells overheat in racks that hold server configurations containing 72 chips, presumably because the ventilation struggles to cope with the high heat output. Nvidia has repeatedly asked its suppliers to change the design of the racks; according to the company, similar consultation and joint engineering work with large cloud providers is practically an everyday occurrence.

The Blackwell series chips announced in March, the B200 and GB200, offer, at least on paper, between seven and thirty times the computing power of their immediate predecessor, the H100/GH100, depending on the type of operation, while also operating more efficiently.

The Blackwell chips, named after the mathematician David Harold Blackwell, theoretically deliver 20 petaflops of FP4 computing capacity, five times the performance of the H100. The B200, which is actually “glued together” from two chips, consists of a total of 208 billion transistors, compared with the H100’s 80 billion.

In August, news broke that mass delivery of the Blackwell series processors could be delayed by a few months due to a design flaw, after the chip or part of it had to be redesigned. It is not clear whether those problems had anything to do with excessive heat output during operation, but Nvidia has repeatedly indicated that mass production and delivery of the first manufactured chips will not be significantly delayed.

Source: www.hwsw.hu