I use 24 sticks of DDR5-4800, which gets me up to 9 t/s on DeepSeek 2.5 at Q8. 48 threads was optimal in llama.cpp. I would like to move to EPYC 9005 chips and DDR5-6000, but that is cost-prohibitive with the CPUs still over $10k each on eBay.
I followed the guide at https://rentry.co/miqumaxx/
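
For a rough sense of what the DDR5-4800 to DDR5-6000 move could buy, here is a back-of-envelope sketch. It rests on assumptions that are not stated in the post: that the 24 sticks populate 24 memory channels (one DIMM per channel), that each channel moves 8 bytes per transfer, and that token generation is limited purely by memory bandwidth, so throughput scales linearly with it. The function name and the scaling model are illustrative only, and the result is a theoretical ceiling, not a benchmark.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers: int, bytes_per_transfer: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    Assumes one DIMM per channel and a 64-bit (8-byte) data path per
    channel; real sustained bandwidth is lower than this figure.
    """
    return channels * mega_transfers * bytes_per_transfer / 1000


CHANNELS = 24          # assumption: 24 sticks = 24 populated channels
MEASURED_TPS = 9.0     # the "up to 9 t/s" figure from the post

bw_4800 = peak_bandwidth_gbs(CHANNELS, 4800)
bw_6000 = peak_bandwidth_gbs(CHANNELS, 6000)

# Assumption: generation is purely bandwidth-bound, so t/s scales with bandwidth.
scale = bw_6000 / bw_4800
projected_tps = MEASURED_TPS * scale

print(f"DDR5-4800 peak: {bw_4800:.1f} GB/s")
print(f"DDR5-6000 peak: {bw_6000:.1f} GB/s")
print(f"Bandwidth ratio: {scale:.2f}x")
print(f"Projected ceiling: {projected_tps:.2f} t/s")
```

Under those assumptions the faster memory is worth about 25% more bandwidth, which would put the ceiling a bit over 11 t/s. Any real gain would also depend on the CPU, NUMA layout, and thread count.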