lolinder
1y ago
Quantized to 4 bits, you'll only need ~200GB! 5 4090s should cover it.
angoragoats
1y ago
You'll probably need 9 or more. 4090s have 24GB each.
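For anyone checking the arithmetic, here is a quick sketch. It assumes the ~200GB figure from the parent comment covers the weights only, and the helper name `gpus_needed` is made up for illustration.

```python
import math


def gpus_needed(model_gb: float, vram_per_gpu_gb: float) -> int:
    """Smallest number of cards whose combined VRAM is at least model_gb."""
    return math.ceil(model_gb / vram_per_gpu_gb)


weights_gb = 200   # rough 4-bit footprint quoted in the parent comment
rtx_4090_gb = 24   # VRAM per RTX 4090

print(5 * rtx_4090_gb)                        # combined VRAM of five cards, in GB
print(gpus_needed(weights_gb, rtx_4090_gb))   # minimum card count for the weights alone
```

That count is a floor for the weights only. The KV cache and activations need memory too, hence "or more".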
lolinder
OP
1y ago
Oops, I read 48 somewhere but that's wrong. Thanks.
htrp
1y ago
A6000s do, however.
woodson
1y ago
I wonder if AutoAWQ works out of the box, given no architectural changes (?). That would be most straightforward together with vLLM for serving.
downvotetruth
1y ago
If an implementation supported NVIDIA's Heterogeneous Memory Management, then 192 GB of DDR5 RAM plus GPU VRAM would seem to be close.
pat2man
1y ago
Two 128GB Mac Studios networked via Thunderbolt 4?
Teknomancer
1y ago
This is actually a promising endeavor. I'd love to see someone try that.
angoragoats
1y ago
There's already at least one project that attempts this:
https://github.com/exo-explore/exo