It's the thing most people, even in this thread, don't seem to realize has emerged from research in the past year.
Give a Markov chain a lot of text about fishing and it will tell you about fish. Give GPT a lot of text about fishing and it turns out that it will probably learn how to fish.
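To make the contrast concrete, here's a minimal sketch of what a Markov chain actually learns: nothing but surface co-occurrence statistics. The corpus, function names, and parameters are illustrative, not from any real system.

```python
import random
from collections import defaultdict

# Toy corpus: a bigram Markov chain only learns which word follows which.
corpus = "the fish swim in the river and the fish eat flies".split()

# Transition table: word -> list of observed next words.
transitions = defaultdict(list)
for a, b in zip(corpus, corpus[1:]):
    transitions[a].append(b)

def generate(start, n, rng):
    """Sample up to n words by following observed transitions."""
    out = [start]
    for _ in range(n - 1):
        nxt = transitions.get(out[-1])
        if not nxt:
            break
        out.append(rng.choice(nxt))
    return " ".join(out)

print(generate("the", 8, random.Random(0)))
```

Every word it emits was seen directly after the previous word in training; there is no latent state that could represent anything about fishing itself. The published interpretability results on GPT-style models are interesting precisely because they find internal structure that goes beyond this kind of table.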
World-model representations are occurring in GPT. And people really need to start realizing there's already published research demonstrating that, as it goes a long way toward explaining why the multimodal parts work.
> we are very, very far

And this depresses me. What is the way forward? :( Maybe I should just do a startup.
and was a founding member of OpenAI just a few years later in 2015