The whole point of AWS is to use services on demand; it's like buying 133 conference tickets for your 100 person company.
Honestly this isn’t ultimately engineerings fault. This is a SaaS business Someone in their company is responsible for the COGS KPI. For that person to either not notice an increase in COGS, or to not be aggressively incentivizing engineering to reduce COGS, is giant red flag.
That's a decade-old misconception about how people actually use AWS.
Most servers I've seen in AWS are permanent.
In fact, it's an anti-pattern to wait until you need more capacity to scale up, since those servers may not be available, especially in newer instance families.
Even if the needed instances available, typically ASGs don't react the way you expect without a lot of experimentation (ie. outages.) An example is if traffic increases load, your health check may consider the servers to be unhealthy and start killing them, creating a death spiral.
That being said, it's true that an ALB doesn't offer throttling capability like a true reverse proxy such as HAProxy provides, were you can cap the number of concurrent requests and give a chance to your backend to avoid death by overload.
I wish there would be a way for ASGs to at least make the distinction between an unhealthy instance and an overloaded one.
I see two primary use cases of cloud:
1) You're a startup or just need something small, and want to focus on building your MVP, instead of messing around with colocated Linux servers. Cloud is much more expensive than those, but you don't care because you don't really need all that much, or maybe you are VC backed and have unlimited money.
2) You're a large company with broken internal processes. You can get server in company datacenter in three months after seven approvals (since it's capex), or you can spin up an EC2 instance. You don't care about cost since you have unlimited money.
Those are kind of medium scale "on demand" - not "I need 100 new servers right this minute" but "I need server in ten minutes intead of 'when I get to buy one' or 'in three months and 37 forms'".
In both cases, you're throwing money away, because time is more important for you than the extra money cloud costs.
Just stating that the waste was only 5% of the total savings.