Two more papers down the line, who knows what Dall-E 4 will be capable of? It is a step in the right direction that the image output is now "stable", which is what this is demonstrating.
But it can't read your mind, despite the eerie feeling you get; that is an illusion. Kismet in API form.
The next step is to open this black box up and actually make its internal pipeline tweakable so it can become a useful tool.
It may end up an amazing, super useful tool or a clipart plagiarizer/generator on steroids.
You can't even use it yet and you're already so eager to believe.
I'm simply certain that, whatever its capabilities, they are short of mind reading. You'd be equally impressed if you asked me to perform a Google image search.
That does not mean that Dall-E is unimpressive or the results are fake. What I'm saying is that the hype and mysticism around this is unwarranted.
Elsewhere in the thread, somebody else wrote that we are on the cusp of it producing convincing fake footage of the Kennedy assassination from a single text prompt.
The image output now being stable and pleasing to the eye is enough of a result even if it requires trial and error.
You wouldn't lose your mind over a wallpaper generator even though no machine learning is necessary to produce infinite variations of interesting patterns. This thing is spewing out "art" and people are ascribing magical capabilities to it as if it taped a banana to a canvas.
Anything is possible. Maybe Dall-E is capable of even more incredible things. Who knows where this all ends up. Sure. But not quite that much follows from what has been presented so far.