0 points
manmal
1mo ago
That’s what, 14 GB/s? The GPU’s VRAM can do 100x that.
GeekyBear
1mo ago
A discrete consumer GPU card doesn't have enough fast RAM to run a very large model that hasn't been quantized to hell.
That's why all the projects streaming models into the GPU from an SSD popped up recently.
manmal
OP
1mo ago
Yes. There’s just no way to get above 1 t/s that way with a large model.
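For anyone who wants to check the arithmetic behind that claim, here is a rough back-of-the-envelope sketch. It assumes a dense model at batch size 1 where every weight has to be read once per generated token and nothing is cached in VRAM, so throughput is capped at bandwidth divided by model size. The 14 GB/s figure and the 100x VRAM ratio come from the comments above; the 70B parameter count and 2 bytes per parameter (FP16) are illustrative assumptions, not numbers from this thread, and the function name is made up for the example.

```python
def max_tokens_per_second(bandwidth_gb_s: float,
                          params_billion: float,
                          bytes_per_param: float) -> float:
    """Upper bound on decode speed when all weights are streamed per token.

    Assumes a dense model, batch size 1, and no reuse of weights between
    tokens, so each token costs one full pass over the weights.
    """
    model_size_gb = params_billion * bytes_per_param  # 1e9 params * bytes = GB
    return bandwidth_gb_s / model_size_gb


# Hypothetical 70B-parameter model in FP16 (2 bytes per parameter) = 140 GB.
ssd_bound = max_tokens_per_second(14.0, 70.0, 2.0)          # SSD streaming
vram_bound = max_tokens_per_second(14.0 * 100, 70.0, 2.0)   # "100x that"

print(f"SSD streaming bound: {ssd_bound:.2f} t/s")
print(f"VRAM bound:          {vram_bound:.2f} t/s")
```

Under those assumptions the SSD path stays well below 1 t/s. Heavier quantization or a mixture-of-experts model that touches only part of its weights per token would shift the numbers, but the bound still scales linearly with bandwidth.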