I suppose my comment is reserved more for the documentation than the actual models in the wild?
I do worry that LLM service providers won't do any better than REST API providers at versioning their backends. Even if we specify the model in the API call, it feels like the model will be silently upgraded behind the scenes. There are so many parameters that could be adjusted to "improve" the experience for users even if the weights don't change.
I prefer to use open-weight models when possible. But so many agentic frameworks, like this one (to be fair, I would not expect OpenAI to offer a framework that works local-first), treat the local LLM experience as second-class, at best.