Q8 KV cache lets a 30B model fit 100K context on a 24 GB RTX 5090
(buraak.com) | 2 points | bozdemir | 4d ago | 0 comments
No comments yet.