>It's really important to understand that ALL THE MODEL KNOWS is a mapping of [pixels, input] -> new pixels. It has zero knowledge of game state.
This is false. What happens inside the model is unknown. It takes pixel input and produces pixel output as if it actually understands game state. As with LLMs, we don't fully understand what's going on internally. You can't assume a model doesn't "understand" things just because the high-level training methodology only involves pixel input and output.
>The only "state" that is known is the last few frames of the game screen. Because of this, it's simply not possible for the game model to know if an enemy should be shown as dead or alive once it has been off-screen for longer than those few frames. It also means that if you keep turning away and towards an enemy, it could teleport around. Once it's off the screen for those few frames, the model will have forgotten about it.
This is true. But then one could say it knows game state for up to a few frames. That's very different from saying the model ONLY knows pixel input and pixel output.
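To make the "few frames of state" point concrete, here's a toy sketch. It is not how the actual model works: each "frame" is just a set of visible entity names, and `CONTEXT_FRAMES` and `remembered` are names I made up for illustration. The only point is that whatever is in the window counts as state, and whatever falls out of it is gone.

```python
from collections import deque

CONTEXT_FRAMES = 3  # assumed window size, purely illustrative


def remembered(context, entity):
    """True if the entity appears in any frame still inside the window."""
    return any(entity in frame for frame in context)


# A deque with maxlen drops the oldest frame once the window is full.
context = deque(maxlen=CONTEXT_FRAMES)
context.append({"enemy"})  # enemy on screen
context.append(set())      # player turns away
context.append(set())

# The frame showing the enemy is still inside the window.
still_known = remembered(context, "enemy")

context.append(set())      # one more frame pushes the enemy frame out

# No frame in the window shows the enemy any more.
forgotten = not remembered(context, "enemy")
```

So within the window there is state to work with; past it there isn't, unless something else carries it.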
There are other tricks for long-term memory as well. Think of a radar display. Radar shows the state of an enemy even when it is off-screen, so the model won't forget that an enemy is behind the player.
Game state can also be encoded into a few rows of pixels at the bottom of the frame. The model can pick up on these associations.
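A minimal sketch of what encoding state into the bottom row could look like, assuming a tiny grayscale frame stored as a list of rows. The frame size, the `encode_state`/`decode_state` helpers, and the state fields (health, ammo, enemy alive) are all hypothetical, not taken from any real system.

```python
WIDTH, HEIGHT = 16, 8  # assumed tiny grayscale frame, values 0-255


def blank_frame():
    return [[0] * WIDTH for _ in range(HEIGHT)]


def encode_state(frame, state_bytes):
    """Return a copy of the frame with state bytes written into the bottom row."""
    if len(state_bytes) > WIDTH:
        raise ValueError("state does not fit in one row")
    if any(not 0 <= b <= 255 for b in state_bytes):
        raise ValueError("each state value must fit in one pixel (0-255)")
    out = [row[:] for row in frame]
    for i, b in enumerate(state_bytes):
        out[-1][i] = b
    return out


def decode_state(frame, n):
    """Read the first n state bytes back out of the bottom row."""
    return frame[-1][:n]


state = [75, 12, 1]  # hypothetical: health, ammo, enemy_alive
frame = encode_state(blank_frame(), state)
recovered = decode_state(frame, len(state))
```

Since those pixels are part of every frame, they ride along in the same few-frame context as the rest of the picture. Whether a trained model would actually learn to read and update them reliably is an open question.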
edit: someone mentioned that the game state lasts past a few frames.
>If you're trying to make a new game, then you need new frames to train the model on.
Right, so for a generative model, instead of training it on one game you would train it on a multitude of games. The model would then output a new type of game based on a seed number.
Alternatively you could have a model generate a model.
All of what I'm saying is of course speculative. As I said, this model is a stepping stone for the future. It's just like the LLM: only trivially helpful now, but potentially a stepping stone toward replacing programmers altogether.