Also, stuff like this makes it hard to take the results seriously:
* To make an accurate comparison between the systems with different settings of tensor parallelism, we extrapolate throughput for the MI300X by 2.
* All inference frameworks are configured to use FP16 compute paths. Enabling FP8 compute is left for future work.
They did everything they could to make sure AMD comes out faster.