he is saying representing the state is very hard, and you are saying: given a well-represented state, ML is very good at finding the important features, reducing the dimensionality, finding mathematical transformations, etc.
deep learning has been so successful with images because representing them is trivial - flattened pixel vector.
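to make "flattened pixel vector" concrete, here is a minimal sketch in plain python. the 2x3 grayscale image and its pixel values are made up for illustration, and `flatten` is just a hypothetical helper name:

```python
# Hypothetical 2x3 grayscale image; the pixel values are invented for illustration.
image = [
    [0, 128, 255],
    [64, 32, 16],
]


def flatten(img):
    """Concatenate the rows into one flat feature vector (row-major order)."""
    return [pixel for row in img for pixel in row]


vector = flatten(image)
print(vector)
```

every image of the same size maps to a vector of the same length, which is why the representation step needs no design work.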
your last paragraph raises some questions about what rules the AI is going to adhere to in starcraft.
in SC, you don't view the entire board. you view the minimap, hear noises and alerts, and decide where to focus your attention on the map. in battle, being able to click and accurately place attacks quickly is important.
Do you give the computer a full view of everything it would be able to see? does the computer get 10-million-clicks-per-second abilities, so that essentially every action is like hitting pause and then making the next one?
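one way to picture the two rules in question is as a pair of constraints on the agent: it only observes units inside its current screen, and it can only issue so many actions per stretch of game time. the sketch below is purely illustrative. `ActionLimiter`, `visible_units`, the cap values, and the coordinates are all hypothetical and not taken from any real starcraft API:

```python
from dataclasses import dataclass


@dataclass
class ActionLimiter:
    """Caps how many actions the agent may issue per window of game time."""
    max_actions: int  # hypothetical cap on actions per window
    window: float     # window length in seconds of game time

    def __post_init__(self):
        self._times = []  # timestamps of recently accepted actions

    def try_act(self, now: float) -> bool:
        # Forget actions that have fallen out of the sliding window.
        self._times = [t for t in self._times if now - t < self.window]
        if len(self._times) >= self.max_actions:
            return False  # over the cap: the action is rejected
        self._times.append(now)
        return True


def visible_units(units, camera, half_width, half_height):
    """Keep only the units inside the rectangle the camera is looking at."""
    cx, cy = camera
    return [
        (x, y) for (x, y) in units
        if abs(x - cx) <= half_width and abs(y - cy) <= half_height
    ]


limiter = ActionLimiter(max_actions=2, window=1.0)
accepted = [limiter.try_act(t) for t in (0.0, 0.1, 0.2, 1.05)]
on_screen = visible_units([(1, 1), (10, 10)], camera=(0, 0),
                          half_width=2, half_height=2)
print(accepted, on_screen)
```

with the cap removed and the camera rectangle covering the whole map, you get the "pause after every action, see everything" agent described above, so where those two knobs are set is really the rules question.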