
0 points · mitjam · 2y ago · 0 comments

Here is a queueing API server for self-hosted inference backends: https://github.com/aime-team/aime-api-server (from a friend of mine). It's very lightweight and easy to use. You can even serve models from Jupyter notebooks with it without worrying about overwhelming the server. It just gets slower the more load you send it.
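To make the "it just gets slower" point concrete, here is a rough sketch of the general queueing idea in plain Python asyncio. This is not aime-api-server's actual API or implementation; `fake_model`, `worker`, `submit`, and `run_demo` are hypothetical names invented for illustration. Requests wait in a queue and a single worker feeds the backend one job at a time, so extra load shows up as longer waits rather than an overloaded model.

```python
import asyncio


async def fake_model(prompt: str) -> str:
    # Hypothetical stand-in for a slow inference call.
    await asyncio.sleep(0.01)
    return prompt.upper()


async def worker(queue: asyncio.Queue) -> None:
    # A single worker drains the queue, so the backend sees one job at a time.
    while True:
        prompt, future = await queue.get()
        try:
            future.set_result(await fake_model(prompt))
        except Exception as exc:
            future.set_exception(exc)
        finally:
            queue.task_done()


async def submit(queue: asyncio.Queue, prompt: str) -> str:
    # Each caller enqueues its job and waits for its own result.
    future = asyncio.get_running_loop().create_future()
    await queue.put((prompt, future))
    return await future


async def run_demo(prompts: list[str]) -> list[str]:
    queue: asyncio.Queue = asyncio.Queue()
    task = asyncio.create_task(worker(queue))
    try:
        # All requests arrive at once; they are served in queue order.
        return await asyncio.gather(*(submit(queue, p) for p in prompts))
    finally:
        task.cancel()


results = asyncio.run(run_demo(["a", "b", "c"]))
```

With more concurrent callers, each one simply waits longer for its turn; nothing here reflects how the linked project handles batching, multiple workers, or authentication.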


1 comment · 1 top-level

_akhe · 2y ago

Really cool! I like that they have live demos to prove it out. Thanks for sharing.
