0 points
cjbprime
2y ago
Wouldn't expect that to work at all.
hedgehog
2y ago
Ollama (which wraps llama.cpp) supports splitting a model across devices so you get some acceleration even on models too big to fit entirely in GPU memory.
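To make the splitting idea concrete, here is a rough sketch of the arithmetic, not how Ollama itself decides. It assumes every layer is the same size, which is only approximately true, and the helper names and the model name are made up for illustration. As far as I know Ollama normally picks the split on its own; `num_gpu` is the request option for setting the number of GPU-offloaded layers by hand (llama.cpp's equivalent is `--n-gpu-layers`).

```python
GiB = 2**30


def layers_that_fit(model_bytes, n_layers, free_vram_bytes, reserve_bytes=0):
    """Estimate how many layers fit in VRAM.

    Hypothetical helper: assumes all layers are equally sized and that
    reserve_bytes is set aside for the KV cache and scratch buffers.
    """
    if model_bytes <= 0 or n_layers <= 0:
        raise ValueError("model_bytes and n_layers must be positive")
    per_layer = model_bytes / n_layers
    usable = max(free_vram_bytes - reserve_bytes, 0)
    # Never offload more layers than the model actually has.
    return min(n_layers, int(usable // per_layer))


def build_generate_request(model, prompt, gpu_layers):
    """Build a JSON body for Ollama's /api/generate endpoint.

    The remaining layers stay on the CPU; num_gpu is the number of
    layers sent to the GPU.
    """
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_gpu": gpu_layers},
    }


if __name__ == "__main__":
    # Made-up numbers: a 40 GiB model with 80 layers on a 24 GiB card,
    # keeping 2 GiB back for cache and buffers.
    n = layers_that_fit(40 * GiB, 80, 24 * GiB, reserve_bytes=2 * GiB)
    print(build_generate_request("some-large-model", "Hello", n))
```

With those made-up numbers roughly half the layers land on the GPU and the rest run on the CPU, which is where the partial acceleration comes from.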