You run into some fundamental signal-processing issues here: how quickly you respond to an increase in a given value (incoming requests in this case, but it's a general problem) vs. spinning up too much capacity and overcharging the customer. There's a limit to how reactive Amazon can be, even in theory. If your needs are that great, you may have to do some pre-sizing and choose between a possible over-provisioning hit and a possible under-provisioning hit. I think it's pretty obvious why Amazon would choose to bias toward under-provisioning in this case.
(There's some really good stuff in the signal-processing field for anyone responsible for high-scale systems. It's an underrated branch of math for computer programmers. Believe it or not, the "fundamental limits" I'm referring to come from the same time-frequency trade-off that underlies the Heisenberg Uncertainty Principle, when you get down into it.)
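
To make the responsiveness trade-off concrete, here's a rough sketch using an exponentially weighted moving average as a stand-in for whatever smoothing an autoscaler might apply to its request-rate signal. This is only an illustration, not how Amazon actually does it: the `ema` and `steps_to_reach` helpers, the alpha values, and the traffic numbers are all made up for the example.

```python
def ema(samples, alpha):
    """Exponentially weighted moving average; alpha in (0, 1].
    Higher alpha reacts faster but smooths less."""
    estimate = samples[0]
    smoothed = []
    for x in samples:
        estimate = alpha * x + (1 - alpha) * estimate
        smoothed.append(estimate)
    return smoothed


def steps_to_reach(series, target):
    """Index of the first value at or above target, or None if never reached."""
    for i, value in enumerate(series):
        if value >= target:
            return i
    return None


# Hypothetical traffic, in requests per second (numbers are assumptions).
# 1) A one-sample blip: load jumps to 500 for a single sample, then returns to 100.
spike = [100.0] * 10 + [500.0] + [100.0] * 9
# 2) A real, sustained increase: load jumps to 500 at sample 10 and stays there.
step = [100.0] * 10 + [500.0] * 40

fast, slow = 0.8, 0.1  # assumed smoothing factors, chosen only for contrast

# How many samples after the sustained jump until the estimate reaches 450 req/s?
fast_lag = steps_to_reach(ema(step, fast), 450.0) - 10
slow_lag = steps_to_reach(ema(step, slow), 450.0) - 10

# How far above the 100 req/s baseline does the one-sample blip push the estimate?
fast_overshoot = max(ema(spike, fast)) - 100.0
slow_overshoot = max(ema(spike, slow)) - 100.0

print(f"fast filter: lag={fast_lag} samples, blip overshoot={fast_overshoot:.0f} req/s")
print(f"slow filter: lag={slow_lag} samples, blip overshoot={slow_overshoot:.0f} req/s")
```

The fast filter catches the real increase almost immediately but also chases the blip (over-provisioning); the slow filter mostly ignores the blip but takes much longer to notice the real increase (under-provisioning). Tuning alpha just moves you along that curve; it doesn't get you off it.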