Pool spare GPU capacity to run LLMs at larger scale
(github.com)
11 points
i386
3mo ago
3 comments
lostmsu
3mo ago
> MoE models via expert sharding with zero cross-node inference traffic
This makes the whole project questionable.
vagrantJin
3mo ago
This is very promising, definitely looks more user friendly than exo. Can't wait to try it out.
iwinux
3mo ago
You lost me on "spare GPU". I don't have any capable GPUs, let alone spare ones :)