0 points
hughw
9d ago
Just this morning I tweaked my single 3090 setup too:
OLLAMA_FLASH_ATTENTION=1 OLLAMA_KV_CACHE_TYPE=q8_0 OLLAMA_CONTEXT_LENGTH=180000
and that fits in 23GB.
[edited for format]
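
For a rough sense of why the q8_0 cache setting matters at a 180000-token context, here is a back-of-the-envelope sketch. The model dimensions (32 layers, 4 KV heads, head dimension 128) and the function name are hypothetical placeholders, not the model used above. The estimate covers only the KV cache, not the weights or compute buffers, and it assumes plain full attention in every layer.

```python
# Rough KV-cache size estimate. The model dimensions used below are
# hypothetical placeholders, not the poster's actual model.

# Bytes per cached element for each cache type. q8_0 stores blocks of 32
# 8-bit values plus one 16-bit scale, i.e. 34 bytes per 32 elements.
BYTES_PER_ELEMENT = {
    "f16": 2.0,
    "q8_0": 34 / 32,
}


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_length, cache_type):
    """Estimate bytes needed to cache keys and values across all layers."""
    elements_per_token = 2 * n_layers * n_kv_heads * head_dim  # 2 = K and V
    return elements_per_token * context_length * BYTES_PER_ELEMENT[cache_type]


if __name__ == "__main__":
    # Hypothetical model: 32 layers, 4 KV heads, head dimension 128.
    for cache_type in ("f16", "q8_0"):
        size = kv_cache_bytes(32, 4, 128, 180_000, cache_type)
        print(f"{cache_type}: {size / 2**30:.1f} GiB")
```

Under these assumptions the q8_0 cache takes a little over half the memory of an f16 cache, which is the kind of saving that can make a long context fit on a single 24 GB card.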
2 comments · 2 top-level
hasteg
3d ago
Just a heads up (you may already know this): the local LLM community has a pretty strong disdain for Ollama specifically, for the numerous reasons laid out in this blog post:
https://sleepingrobots.com/dreams/stop-using-ollama/
MaKey
9d ago
Friends don't let friends use Ollama:
https://sleepingrobots.com/dreams/stop-using-ollama/