I’ve been running some LLMs on my 5600X and 5700G CPUs, and the performance is… ok but not great. Token generation is about “reading out loud” pace for the 7B and 13B models. I also encounter occasional system crashes that I haven’t diagnosed yet, possibly due to high RAM utilization, but possibly just power/thermal management issues.
A 50% speed boost would probably make the CPU option a lot more viable for a home chatbot, just because of how much easier it is to build a system with 128 GB of RAM than 128 GB of VRAM.
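For intuition on why a bandwidth bump translates almost directly into generation speed: CPU token generation is roughly memory-bandwidth-bound, since each token requires streaming every weight through the memory bus once. Here's a rough back-of-envelope sketch; the bandwidth and model-size figures are my assumptions (dual-channel DDR4 around 50 GB/s, a 7B model at 4-bit quantization around 4 GB), not measured numbers.

```python
# Back-of-envelope ceiling for CPU token generation, assuming it is
# memory-bandwidth-bound: one full pass over the weights per token.
# All concrete numbers below are assumptions, not benchmarks.

def tokens_per_sec(bandwidth_gbs: float, model_size_gb: float) -> float:
    """Rough upper bound on tokens/sec: bandwidth divided by weight size."""
    return bandwidth_gbs / model_size_gb

ddr4_bw = 50.0   # assumed: ~50 GB/s peak for dual-channel DDR4-3200
q4_7b = 4.0      # assumed: ~4 GB of weights for a 7B model at 4-bit

base = tokens_per_sec(ddr4_bw, q4_7b)            # baseline ceiling
boosted = tokens_per_sec(ddr4_bw * 1.5, q4_7b)   # hypothetical +50% bandwidth

print(f"baseline ceiling: {base:.1f} tok/s")
print(f"+50% bandwidth:   {boosted:.1f} tok/s")
```

Real throughput lands well below this ceiling (the estimate ignores compute, cache effects, and the KV cache), but it shows why a 50% bandwidth improvement scales token generation almost linearly.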
I personally plan to experiment with the 48 GB modules in the not-too-distant future.