0 points
version_five
3y ago
Try this; it's for running LLMs that won't fit in GPU memory:
https://github.com/FMInference/FlexGen
1 comment
gpm
3y ago
Currently, that looks like it only supports Facebook's OPT and Galactica models, though they do appear to plan to add support for more models.