If everything you want an LLM to do is already captured as code or simple skills, you can switch to dumber models that know enough to select the appropriate skill for a given user input, and not much else. You would only have to tap into the more expensive heavy-duty LLMs when you are trying to do something that hasn’t been done before.
Naturally, AI companies with a vested interest in making sure you use as many tokens as possible will do everything they can to steer you away from this type of architecture. In effect, it’s a cache for LLM reasoning.
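
To make the routing idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: `Skill`, `cheap_router`, `expensive_llm`, and `handle` are names invented for illustration. The keyword matcher only stands in for a small model, and the fallback is a stub, not a call to any real API.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Skill:
    """A task that has already been captured as plain code (hypothetical type)."""
    name: str
    keywords: frozenset[str]
    run: Callable[[str], str]


def cheap_router(user_input: str, skills: list[Skill]) -> Optional[Skill]:
    """Stand-in for a small model: pick the skill sharing the most keywords."""
    words = set(user_input.lower().split())
    best = max(skills, key=lambda s: len(s.keywords & words), default=None)
    if best is None or not (best.keywords & words):
        return None  # nothing matched, so the caller must fall back
    return best


def expensive_llm(user_input: str) -> str:
    """Placeholder for a large-model call; no real API is used here."""
    return f"[large model] reasoning from scratch about: {user_input}"


def handle(user_input: str, skills: list[Skill]) -> str:
    skill = cheap_router(user_input, skills)
    if skill is not None:
        return skill.run(user_input)  # "cache hit": reuse captured work
    return expensive_llm(user_input)  # "cache miss": pay for heavy reasoning


SKILLS = [
    Skill("word_count", frozenset({"count", "words"}),
          lambda text: f"{len(text.split())} words"),
    Skill("shout", frozenset({"shout", "uppercase"}),
          lambda text: text.upper()),
]

print(handle("count the words here", SKILLS))
print(handle("prove a new theorem", SKILLS))
```

The first call is served by an existing skill without touching the large model. The second matches nothing and falls through to the expensive path, which is the only place heavy-duty tokens get spent.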