On the other hand, this also feels like a signal that reasoning capability has probably already plateaued at the GPT-4 level, and OpenAI knew it, so they decided to focus on research that matters to shipping products rather than long-term research to unlock further general (super)intelligence.
I think reasoning ability is not the largest bottleneck for improvement in usefulness right now. Cost is a bigger one IMO.
Running these models as agents is hella expensive, and agents, or agent-like recurrent reasoning (the kind humans do), are the key to improved performance if you look at any type of human intelligence.
Single-shot performance only gets you so far.
For example: if it can write code 90% of the way and then debug in a loop, it'd be much more performant than any single-shot approach.
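Roughly the kind of loop I mean, as a minimal sketch (`llm_complete` is a hypothetical stand-in for whatever model call you'd use, not a real API; the loop structure is the point):

```python
# Minimal sketch of the "write code once, then debug in a loop" idea.
# `llm_complete` is a hypothetical placeholder, not any real API.
import subprocess
import tempfile
from typing import Callable

def run_candidate(code: str) -> tuple[bool, str]:
    """Run the candidate code once and return (succeeded, stderr)."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    result = subprocess.run(["python", path], capture_output=True, text=True)
    return result.returncode == 0, result.stderr

def write_and_debug(task: str, llm_complete: Callable[[str], str], max_iters: int = 5) -> str:
    # Single shot first...
    code = llm_complete(f"Write a Python script that does: {task}")
    for _ in range(max_iters):
        ok, errors = run_candidate(code)
        if ok:
            break
        # ...then feed the failure back in instead of stopping at one attempt.
        code = llm_complete(
            f"Task: {task}\n\nCurrent code:\n{code}\n\n"
            f"It fails with:\n{errors}\n\nReturn a fixed version."
        )
    return code
```

Each extra iteration costs another full model call, which is exactly why cost, not raw reasoning, feels like the bottleneck.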
And OpenAI probably has these huge models in their basement. But they might not be much more useful than GPT-4 when used single-shot. I mean, what could they do that we can't do today with GPT-4?
It’s agents and recurrent reasoning we need for more usefulness.
At least, that's my humble opinion as an amateur neuroscientist who plays around with these models.
Because they are dumb, you need to over-compute so many things to get anything useful. Smarter models would solve this problem. Making the current model cheaper is like trying to solve Go by scaling up Deep Blue: it doesn't work to just hardcode dumb pieces together; the model needs to get smarter.
OOC, would this make academics treat algorithms as more or less important in their curricula? That's a bad outcome for society if it's true.
He did remain silent on when it’s going to be launched.
They're probably predicting tone-of-voice tokens, feeding those into an audio transformer, along with some speculative decoding to keep latency low.
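For anyone unfamiliar, speculative decoding in its simplest greedy form looks something like this (a toy sketch; `draft_next` / `target_next` are hypothetical stand-ins, and the real latency win comes from the target model verifying all the draft tokens in one batched forward pass, which a toy loop doesn't capture):

```python
# Toy sketch of greedy speculative decoding: a small "draft" model proposes a
# few tokens cheaply, the large "target" model checks them, and only the first
# mismatch forces a correction. Output matches plain greedy target decoding.
from typing import Callable, Sequence

def speculative_decode(
    target_next: Callable[[Sequence[int]], int],  # big model: greedy next token
    draft_next: Callable[[Sequence[int]], int],   # small model: greedy next token
    prompt: list[int],
    n_tokens: int,
    k: int = 4,                                   # draft tokens per round
) -> list[int]:
    out = list(prompt)
    while len(out) - len(prompt) < n_tokens:
        # 1. The draft model guesses k tokens ahead.
        ctx = list(out)
        draft = []
        for _ in range(k):
            t = draft_next(ctx)
            draft.append(t)
            ctx.append(t)
        # 2. The target model verifies the guesses; keep the matching prefix,
        #    correct the first mismatch, then re-draft from there.
        for t in draft:
            expected = target_next(out)
            if expected == t:
                out.append(t)
            else:
                out.append(expected)
                break
    return out[: len(prompt) + n_tokens]
```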