If you don't hit resource constraints some of the time, you're overprovisioned. (Conversely, if you don't have idle capacity some of the time, however briefly, you're underprovisioned.) Overprovisioning can sometimes be the right choice if it isn't too expensive, but ML infrastructure tends to be on the expensive side.
Better to make a profit on a small but larger-than-expected volume than to take a loss on a large but smaller-than-expected one.
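
To make the asymmetry concrete, here is a toy calculation. Every number in it is invented for illustration, and the model is deliberately crude: you pay for all provisioned capacity whether or not it is used, you earn revenue only on requests you can actually serve, and turned-away requests cost nothing beyond the lost sale. The names `profit`, `PRICE_PER_REQUEST`, and `COST_PER_CAPACITY_UNIT` are hypothetical.

```python
# Toy model with made-up numbers: capacity is paid for in full,
# revenue is earned only on the requests that fit within capacity.

PRICE_PER_REQUEST = 10      # hypothetical revenue per served request
COST_PER_CAPACITY_UNIT = 6  # hypothetical cost per unit of provisioned capacity


def profit(capacity: int, demand: int) -> int:
    """Profit for one period given provisioned capacity and actual demand."""
    served = min(capacity, demand)  # excess demand is simply turned away
    revenue = served * PRICE_PER_REQUEST
    cost = capacity * COST_PER_CAPACITY_UNIT
    return revenue - cost


# Provisioned small, demand came in above expectations:
# capacity is saturated and some requests are turned away.
lean = profit(capacity=100, demand=150)

# Provisioned large, demand came in below expectations:
# most of the capacity sits idle but is still paid for.
heavy = profit(capacity=1000, demand=400)

print(f"lean:  {lean}")   # 100*10 - 100*6  = 400
print(f"heavy: {heavy}")  # 400*10 - 1000*6 = -2000
```

Under these assumptions the lean setup serves a quarter as many requests as the heavy one and still comes out ahead, because idle capacity is a pure cost. A real analysis would also have to price in what the sketch ignores, such as the customers you lose for good when you turn them away.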