
bick_nyers 1y ago

Would be interesting to see the performance on a dual-socket EPYC system with DDR5 running at maximum speed.

Assuming NUMA doesn't give you headaches (which it will), you would be looking at nearly 1 TB/s of aggregate memory bandwidth.
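
For a rough sense of where a figure like that comes from, here is a back-of-the-envelope sketch. It assumes 12 DDR5 channels per socket at 4800 MT/s with 8 bytes per transfer (my assumption; no specific EPYC model is named above), and it ignores NUMA effects and real-world efficiency entirely.

```python
def peak_bandwidth_gbs(sockets, channels_per_socket, mega_transfers_per_s,
                       bytes_per_transfer=8):
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes)."""
    transfers_per_s = mega_transfers_per_s * 1e6
    total_channels = sockets * channels_per_socket
    return total_channels * transfers_per_s * bytes_per_transfer / 1e9


# Assumed configuration: 2 sockets, 12 channels each, DDR5-4800.
print(peak_bandwidth_gbs(2, 12, 4800))
```

Faster DIMMs raise the ceiling proportionally, but this is only the theoretical peak; sustained cross-socket throughput will be lower.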


tpm 1y ago

But you need the CPUs with the highest chiplet count, because the interconnect between the memory controller and the chiplets is the limiting factor for memory bandwidth there. Those are, of course, the most expensive ones. And it's still much slower than GPUs for LLM inference, but at least you have enough memory.
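
As a toy model of that argument: usable bandwidth is the smaller of what the DRAM can deliver and what the chiplets can collectively pull over their links, and token generation for a memory-bound LLM is capped by how fast the weights can be streamed once per token. The per-chiplet link figure and the model size below are placeholders for illustration, not measured or vendor numbers.

```python
def effective_bandwidth_gbs(dram_gbs, chiplets, per_chiplet_link_gbs):
    """Bandwidth the cores can actually consume: capped by DRAM or by the links."""
    return min(dram_gbs, chiplets * per_chiplet_link_gbs)


def max_tokens_per_s(bandwidth_gbs, model_size_gb):
    """Upper bound if every generated token reads all weights once."""
    return bandwidth_gbs / model_size_gb


# Hypothetical numbers: 460.8 GB/s DRAM per socket, 50 GB/s per chiplet link.
few = effective_bandwidth_gbs(460.8, chiplets=4, per_chiplet_link_gbs=50.0)
many = effective_bandwidth_gbs(460.8, chiplets=12, per_chiplet_link_gbs=50.0)
print(few, many)                     # link-limited vs. DRAM-limited
print(max_tokens_per_s(many, 70.0))  # hypothetical 70 GB of weights
```

With few chiplets the links are the bottleneck; with many, the DRAM channels are, which is why the high-chiplet-count parts matter here.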
