There is also (or at least used to be?) an OpenAI compatible API layer for Ollama so that may be an option as well, though my understanding is there are some downsides to using that.
Note: This comment and the link are just meant as references/conveniences, not intended as a request for free labor. Thanks for opening up the code!