I agree with your points, but on this particular one that you mention ...
> A more sophisticated controller would be able to take into
> account the fact there are servers currently starting up,
> and only call for more if those already on the way won't be sufficient.
Can't that be modelled in the transfer function characterizing the system? Even linear dynamics are quite capable of modeling sluggish responses, so PIDs should be able to handle them fine. If there are strong nonlinearities and the system may start far from the desired set point (to the extent that local linearizations are far too inaccurate) or may venture far from it, then yes, PID may indeed have trouble.
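
To make that concrete, here is a rough sketch of what I mean, not a model of any real autoscaler. It assumes the "servers currently starting up" effect can be approximated as a first-order lag with time constant `tau`, and puts a plain PI loop (no derivative term) around it. The function name, gains, and time constant are all made up for illustration.

```python
def simulate_pi_on_lagged_plant(setpoint=100.0, steps=200,
                                kp=1.0, ki=0.2, tau=10.0, dt=1.0):
    """Simulate a PI controller driving a first-order-lag plant.

    Assumption: requested capacity only arrives gradually, so the plant is
    capacity' = (requested - capacity) / tau, discretized with forward Euler.
    All parameter values are hypothetical.
    """
    capacity = 0.0   # capacity actually online
    integral = 0.0   # accumulated error for the integral term
    history = []
    for _ in range(steps):
        error = setpoint - capacity
        integral += error * dt
        # PI control law: how much capacity the controller asks for
        requested = kp * error + ki * integral
        # Sluggish plant: online capacity closes only part of the gap per step
        capacity += (dt / tau) * (requested - capacity)
        history.append(capacity)
    return history


history = simulate_pi_on_lagged_plant()
print(round(history[-1], 3))
```

With these particular gains the loop settles on the set point even though the plant responds slowly, because the lag is part of the linear model the gains were picked against. Push the gains too high relative to `tau`, or replace the lag with something strongly nonlinear, and this stops being true.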