Spot instances exist just to turn over-provisioning into something other than a complete loss. You're at least making some money from your mistake.
edit: You should consider "spot instances" in general to be a failure as far as a cloud provider is concerned. It means you got your guesses wrong. You always want a buffer zone, but not that much of a buffer zone. The biggest single cost for cloud providers is the per-rack OpEx: the cost of power, cooling, etc.
They are constantly guessing at cloud capacity. Short-, medium-, and long-term models with forecasting galore, all under constant recalculation based on customer actions (they literally take live feeds of creation/termination actions), and yes, they also take into account hardware failure and repair rates. Consolidating racks of equipment is a pain in the neck and tends to be avoided, unless you can safely live migrate away all instances.
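A rough sketch of the "constant recalculation from live create/terminate feeds" idea, in Python. This is only an illustration, not how any provider actually does it: the `DemandTracker` class, its fields, and the exponentially weighted trend are all assumptions made up for this example, and a real model would be far more elaborate.

```python
from dataclasses import dataclass


@dataclass
class DemandTracker:
    """Toy demand model (hypothetical): updates on every feed interval."""
    running: int = 0      # instances currently running
    trend: float = 0.0    # smoothed net change per interval
    alpha: float = 0.3    # smoothing weight given to the newest interval

    def observe_interval(self, created: int, terminated: int) -> None:
        # Fold one interval of creation/termination events into the model.
        net = created - terminated
        self.running += net
        # Exponentially weighted moving average of the net change.
        self.trend = self.alpha * net + (1 - self.alpha) * self.trend

    def forecast(self, intervals_ahead: int) -> float:
        # Naive linear extrapolation of the smoothed trend.
        return self.running + self.trend * intervals_ahead


tracker = DemandTracker(running=100, alpha=0.5)
tracker.observe_interval(created=10, terminated=0)
tracker.observe_interval(created=10, terminated=0)
print(tracker.forecast(intervals_ahead=2))
```

The point is just that the forecast moves every time customers create or terminate something, which is why the guesses are never "done".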
They all build up various models, using all sorts of forecasting techniques. The longer-range forecasts feed into data center provisioning, along with other business analysis, market research, legal analysis, etc. that help define where future regions should be.
It's still a guess. They can't tell what the actual demand will be, and they can't tell what is going to happen with the supply chain (supply chain issues are the biggest nightmare for capacity planning teams). Sometimes they get it wrong.
The capacity management teams put a lot of time and expertise into keeping the company just sufficiently ahead of demand. It's a crucial part of keeping costs under control.
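As a back-of-the-envelope version of "just sufficiently ahead of demand", here is a small Python sketch. The function name `capacity_plan`, the parameters, and all the numbers are hypothetical and chosen purely for illustration. It compares usable capacity (installed capacity minus what is expected to be out for failure/repair) against forecast demand plus a buffer. Anything left over is the over-provisioned capacity that would be idle, which is the sort of capacity spot pricing tries to recover some money from.

```python
def capacity_plan(forecast_demand: float, installed: float,
                  failure_rate: float, buffer_fraction: float) -> dict:
    """Toy capacity check (hypothetical model, arbitrary units)."""
    # Capacity actually available once failed/in-repair hardware is excluded.
    usable = installed * (1.0 - failure_rate)
    # Demand forecast plus the safety buffer you deliberately keep.
    target = forecast_demand * (1.0 + buffer_fraction)
    # Positive: idle capacity beyond the buffer. Negative: not enough capacity.
    surplus = usable - target
    status = "over-provisioned" if surplus > 0 else "short"
    return {"usable": usable, "target": target,
            "surplus": surplus, "status": status}


plan = capacity_plan(forecast_demand=1000, installed=1500,
                     failure_rate=0.02, buffer_fraction=0.10)
print(plan)
```

With these made-up inputs the surplus is positive, so the guess overshot. With `installed=1000` it goes negative and you are short instead.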