I've noticed an effect with openai where my codex agents seem to perform worse in the week(s) leading up to a new release. I'm wondering if the vendors tweak effort params at all to free up hardware to host the new version. It's a double win as the new model will look night and day better to their regular users when the new model is released presumably with effort back at normal levels.
Is this a known phenomenon? Are any folks trying to measure any of this objectively?
Was hoping to understand if anyone has done any research on it. I could only really find one research paper on the topic [1], but it seems less focused on this specific issue.
more likely there's some pressure on GPU vs CPU compute resources while the AI company tries to scale out a new model, leading to some type of degradation.