
version_five · 3y ago · 0 points

Try this; it's for running LLMs that won't fit in GPU memory: https://github.com/FMInference/FlexGen


1 comment · 1 top-level

gpm · 3y ago

Currently that looks like it only supports Facebook's OPT and Galactica models, though they do appear to plan to add support for more models.
