                 MTr
------------------------
H100 SXM5         80,000
MI300X           153,000
H100 NVL         160,000
The H100 SXM5 has 52% of the transistors the MI300X has and half the RAM, yet the MI300X achieves *ONLY* 33% higher throughput than the H100. The MI300X launched 6 months ago, the H100 20 months ago. AMD has work to do.
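To put numbers on the efficiency gap, here's a quick back-of-the-envelope sketch using the transistor counts from the table above and the ~33% throughput advantage claimed for the MI300X (both figures are from this thread, not official benchmarks):

```python
# Transistor counts in millions (MTr), from the table above.
h100_mtr = 80_000
mi300x_mtr = 153_000

# H100's transistor budget as a fraction of MI300X's.
transistor_ratio = h100_mtr / mi300x_mtr  # ~0.52

# MI300X throughput relative to H100 (the ~33% advantage cited above).
throughput_ratio = 1.33

# Throughput delivered per transistor, MI300X relative to H100:
# it spends ~1.91x the transistors to get ~1.33x the throughput.
perf_per_transistor = throughput_ratio / (mi300x_mtr / h100_mtr)

print(f"H100 transistors vs MI300X: {transistor_ratio:.0%}")        # ~52%
print(f"MI300X perf per transistor vs H100: {perf_per_transistor:.0%}")  # ~70%
```

In other words, by these (rough) numbers the MI300X extracts only about 70% as much throughput per transistor as the H100, which is the efficiency gap the comment is pointing at.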
At best, Apple has the Metal API, which iOS video games use. I guess there's a level of SIMD-compute expertise there, but it'd take a lot of investment to turn that into a full-scale GPU that tangos with supercomputers. Software is a big piece of the puzzle for sure, but Metal isn't ready for prime time.
I'd say Apple is ahead of Intel (Intel keeps wasting its time and collapsing its own progress: Xeon Phi, Battlemage, etc. Intel never keeps investing in its own stuff long enough to reach critical mass). Intel does have OneAPI, but given how many times Intel collapses everything and starts over, I'm not sure how long OneAPI will last.
But Apple vs AMD? AMD 100% understands SIMD compute and has decades' worth of investment in it. The only problem with AMD is that they don't have the raw cash to build out their expertise to cover software, so AMD has to rely upon Microsoft (DirectX), Vulkan, or whatever. ROCm may have its warts, but it also represents over a decade of software development (especially when we consider that ROCm was "Boltzmann", which had several years of use before it came out as ROCm).
-------
AMD ain't perfect. They had a little diversion with C++ AMP with Microsoft (and this served as the API for Boltzmann / early ROCm). But the overall path AMD is taking at least makes sense, if a bit suboptimal compared to NVidia's huge investment in CUDA.
I'd definitely rate AMD's efforts above Apple's Metal.
The M3 Max's GPU is significantly more efficient in perf/watt than RDNA3, already has better ray-tracing performance, and is even faster than a 7900XT desktop GPU in Blender.[0]
[0]https://opendata.blender.org/benchmarks/query/?compute_type=...