Could you elaborate? I hear some people say a big model should be driving a smaller model, and I hear some people say a small model should be driving a bigger model.
When I have an expensive task that is clearly defined, I will get Opus to write an LLM workflow for it, and then I will execute it with a smaller model. (Starting with the smallest one, and upgrading if the task fails.)
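The escalation loop I mean looks roughly like this sketch. Everything here is a hypothetical stand-in (`MODELS`, `run_workflow`, `passes_checks` are not a real API); the point is just "cheapest model first, upgrade only on failure":

```python
# Sketch of "start small, escalate on failure".
# All names below are illustrative stand-ins, not a real SDK.

MODELS = ["haiku", "sonnet", "opus"]  # cheapest first

def run_workflow(model: str, task: str) -> str:
    """Stand-in for executing the Opus-written workflow on `model`."""
    # Pretend only models above the smallest can handle this task.
    return "ok" if model != "haiku" else "fail"

def passes_checks(result: str) -> bool:
    """Stand-in for whatever validation the workflow defines."""
    return result == "ok"

def execute_with_escalation(task: str) -> tuple[str, str]:
    """Try the cheapest model first; upgrade only when checks fail."""
    for model in MODELS:
        result = run_workflow(model, task)
        if passes_checks(result):
            return model, result
    raise RuntimeError("task failed even on the largest model")

model, result = execute_with_escalation("summarize the logs")
print(model)  # the cheapest model that passed, here "sonnet"
```

The upgrade-on-failure check only works because the task is well defined enough to have a pass/fail signal, which is why this pattern fits one-off workflows better than open-ended agentic work.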
But this is a single well defined task, designed by me and Opus in concert. If I need ongoing agentic work, Opus would be too expensive. I'm not sure if Haiku is big enough to be the driver yet. And Sonnet is probably too big! Haha.
(Grok looks promising, optics aside... Grok 4 Fast was almost there but not quite. Great for interactive / realtime (steered) work though.)
But I'm thinking you need a smallish model that can delegate both up and down. I'm not exactly sure what that looks like, though, because the model needs to be big enough to know when it's struggling, instead of pattern-matching to something stupid and getting stuck in a loop trying to solve it the wrong way.
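A crude sketch of what that up-and-down delegation might look like. The names here (`estimate_difficulty`, `call_model`, the model tiers) are all hypothetical; the hard unsolved part is exactly the `estimate_difficulty` step, i.e. the driver knowing that it's struggling:

```python
# Sketch of a mid-size "driver" delegating up or down.
# All names are illustrative, not a real API.

def estimate_difficulty(task: str) -> float:
    """Stand-in: the driver's own guess at subtask difficulty (0..1).
    In practice this self-assessment is the hard part."""
    return 0.9 if "refactor" in task else 0.2

def call_model(model: str, task: str) -> str:
    """Stand-in for an actual model call."""
    return f"{model}: handled {task!r}"

def drive(task: str) -> str:
    """Delegate down for easy subtasks, up when the driver is out of its depth."""
    difficulty = estimate_difficulty(task)
    if difficulty > 0.7:
        return call_model("big-model", task)    # delegate up
    if difficulty < 0.3:
        return call_model("small-model", task)  # delegate down
    return call_model("mid-model", task)        # handle it ourselves

print(drive("refactor the auth module"))  # routed to big-model
print(drive("rename a variable"))         # routed to small-model
```

If the driver is too small, `estimate_difficulty` is where it fails: it confidently reports an easy task, pattern-matches to the wrong approach, and loops.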