I think it's also a matter of "shape". GPT-4 solves one "shape" of problem: given some tokens, predict the next token. That's all it does; that's the only problem it has to solve.
A Civilization AI would have many problem "shapes". What do I research? Where do I build my city? What buildings do I build? How do I move my units? What units do I build? What improvements do I build? When do I declare war? What trade deals do I accept? And so on. Each of those is a fundamentally different kind of decision. You can maybe come up with a scheme to force them all into the same "shape", but then that ends up being harder to train. I would be interested to see a good solution to this problem.
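
To make the "same shape" idea a bit more concrete, here is a rough Python sketch of one possible scheme: give every kind of decision its own slice of a single flat list of action indices, and mask out the slices that don't apply right now. The decision names, option counts, and the 64x64 map are all made up for illustration and don't come from any real Civilization AI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionType:
    name: str
    num_options: int  # how many choices this kind of decision offers


# Hypothetical decision types; every name and size here is an assumption.
DECISIONS = [
    DecisionType("research", 80),         # assumed number of techs
    DecisionType("city_site", 64 * 64),   # one option per tile on an assumed 64x64 map
    DecisionType("unit_move", 6),         # assumed six movement directions
    DecisionType("declare_war", 2),       # no / yes
]


def build_offsets(decisions):
    """Assign each decision type a contiguous slice of one flat index range."""
    offsets, start = {}, 0
    for d in decisions:
        offsets[d.name] = (start, start + d.num_options)
        start += d.num_options
    return offsets, start


OFFSETS, FLAT_SIZE = build_offsets(DECISIONS)


def encode(decision_name, option):
    """Map (decision type, option) to a single flat action index."""
    lo, hi = OFFSETS[decision_name]
    if not 0 <= option < hi - lo:
        raise ValueError(f"option {option} out of range for {decision_name}")
    return lo + option


def decode(flat_index):
    """Map a flat action index back to (decision type, option)."""
    for name, (lo, hi) in OFFSETS.items():
        if lo <= flat_index < hi:
            return name, flat_index - lo
    raise ValueError(f"flat index {flat_index} out of range")


def legal_mask(decision_name):
    """1 for slots belonging to the decision being asked right now, 0 elsewhere."""
    lo, hi = OFFSETS[decision_name]
    return [1 if lo <= i < hi else 0 for i in range(FLAT_SIZE)]
```

Even in this toy version, almost the entire flat space is taken up by `city_site`, and for any single decision most of the outputs are masked off. Everything technically has one "shape" now, but a model would still have to learn several unrelated problems through the same output, which is probably at least part of why this kind of scheme is harder to train.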