
0 points · cpburns2009 · 2mo ago · 0 comments

In my experience using llama.cpp (which ollama uses internally) on a Strix Halo, whether ROCm or Vulkan performs better really depends on the model, and the difference is usually within 10%. I also have access to an RX 7900 XT that I should compare against, though.

3 comments · 1 top-level

metalliqaz · 2mo ago · 2 in thread

Perhaps I should just google it, but I'm under the impression that ollama uses llama.cpp internally, not the other way around.

Thanks for that data point. I should experiment with ROCm.

naasking · 2mo ago

From what I understand, ROCm is a lot buggier and has some performance regressions on a lot of GPUs in the 7.x series. Vulkan performance for LLMs is apparently not far behind ROCm and is far more stable and predictable at this time.

cpburns2009 · OP · 2mo ago

I meant ollama uses llama.cpp internally. Sorry for the confusion.
