The 5.6 PF you quote for 18 A100s would be in BF16. Not comparable.
The A100 can only do 9.746 TFLOPS in FP64.
So you would need 548 A100s to match the FP64 performance of Cheyenne.
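A quick sanity check of that arithmetic, as a rough sketch. The 9.746 TFLOPS figure comes from the comment above; the 5.34 PF peak for Cheyenne and the 312 TFLOPS dense BF16 rate for the A100 are assumptions based on published spec numbers, and `gpus_needed` is just a hypothetical helper name.

```python
import math

# Peak numbers only; real workloads rarely reach these.
CHEYENNE_PEAK_PFLOPS = 5.34   # assumption: Cheyenne's published FP64 peak
A100_FP64_TFLOPS = 9.746      # from the thread (non-tensor-core FP64)
A100_BF16_TFLOPS = 312.0      # assumption: A100 dense BF16 tensor-core peak


def gpus_needed(target_pflops: float, per_gpu_tflops: float) -> int:
    """Smallest whole number of GPUs whose combined peak meets the target."""
    return math.ceil(target_pflops * 1000.0 / per_gpu_tflops)


# FP64: how many A100s to match Cheyenne's peak?
a100_fp64_count = gpus_needed(CHEYENNE_PEAK_PFLOPS, A100_FP64_TFLOPS)

# BF16: what do 18 A100s add up to, in PFLOPS?
bf16_total_pflops = 18 * A100_BF16_TFLOPS / 1000.0

print(a100_fp64_count)              # A100s needed at FP64
print(round(bf16_total_pflops, 1))  # PFLOPS for 18 A100s at BF16
```

The gap between the two results is the whole point: the same card looks about 32x faster if you quote BF16 instead of FP64.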
Double those for raw FP64.
33 of them, which would also have 6,336 GB of memory.
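For the memory total, here is a rough check. It assumes the 33 devices are MI300X cards with 192 GB of HBM each; the thread implies that but does not state it at this point, so treat both the device and the capacity as assumptions.

```python
DEVICES = 33
HBM_GB_PER_DEVICE = 192  # assumption: per-device HBM capacity (MI300X spec sheet)

total_gb = DEVICES * HBM_GB_PER_DEVICE
total_tb = total_gb / 1000  # decimal terabytes

print(total_gb)  # aggregate HBM in GB
print(total_tb)  # same figure in TB
```

Under that assumption the aggregate works out to 6,336 GB, which is roughly 6.3 TB.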
I'll have way more than that in my next purchase order.
It is really fun to build a supercomputer.
But hitting the roofline on those AMD GPGPUs? I'd probably get nowhere fucking close.
That is the thing that Cheyenne was built for: people doing CFD research with x86 code that was already nicely parallelized via OpenMPI or what have you.
I put dual EPYC 9754s into my first box of MI300X.
That's 256 cores + 8x MI300X, in a single box.
Agreed, it is a great solution for CFD, which is definitely one workload I'd love to host.
It doesn't take a lot of employees. We did the above with essentially two technical people, and those same two are working on this business.
Finding workloads/jobs is definitely going to be an interesting adventure. That said, the need for compute isn't going away. By offering hard-to-get hardware at reasonable rates and contract lengths, I believe we are in a good position on that front, but time will tell.
We are only buying the best of the best that we can get today. The plan is to continuously cycle out older hardware and to avoid picking one side over another. This should help us keep pace with other systems.