Hard to square that with how good open-weights models are getting. I'm doing stuff with Qwen3.5-4b that required a frontier hosted model less than a year ago.
The problem is you're still a year behind with this approach, and it isn't at all clear locally hosted models can keep the gap from widening. We'd need more TurboQuant-like algorithmic boosts for that to happen.