I've worked on plenty of teams with relatively small apps, and the difference between:
1. Cloud: "open up the cloud console and start a VM"
2. Owned hardware: "price out a server, order it, find a suitable datacenter, sign a contract, get it racked, etc."
is quite large.
#1 is 15 minutes for a single team lead.
#2 requires the team to agree on hardware specs, get management approval, get finance approval, and have executives sign contracts. And through all this you don't have anything online yet for... weeks?
If your team or your app is large, this probably all averages out in favor of #2. But small teams often don't have the bandwidth or the budget.
Our AWS account is managed by an SRE team. It’s a 3 day turnaround process to get any resources provisioned, and if you don’t get the exact spec right (you forgot to specify the iops on the volume? Oops) 3 day turnaround. Already started work when you request an adjustment? Better hope as part of your initial request you specified backups correctly or you’re starting again.
The overhead is absolutely enormous, and I actually don’t even have billing access to the AWS account that I’m responsible for.
Now imagine having to deal with procurement to purchase hardware for your needs. 6 months later you have a server. Oh you need a SAN for object storage? There goes another 6 months.
That's an anti-pattern (we call it "the account") in the AWS architecture.
AWS internally just uses multiple accounts, so a team can get their own account with centrally-enforced guardrails. It also greatly simplifies billing.
How many things don’t end up happening because of this, when all they need is a sliver of resources at the start?
You buy a couple of beastly things with dozens of cores. You can buy twice as much capacity as you actually use and still be well under the cost of cloud VMs. Then it's still VMs and adding one is just as fast. When the load gets above 80% someone goes through the running VMs and decides if it's time to do some house cleaning or it's time to buy another host, but no one is ever waiting on approval because you can use the reserve capacity immediately while sorting it out.
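To make the "80% and then someone takes a look" rule concrete, here is a minimal sketch. It assumes "load" means allocated cores across the fleet (it could just as well be measured CPU or RAM), and the `Host` type, host names and function names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str             # hypothetical host name
    cores: int            # physical cores on the box
    cores_allocated: int  # cores already handed out to VMs


def fleet_utilization(hosts):
    """Fraction of all cores in the fleet that are allocated to VMs."""
    total = sum(h.cores for h in hosts)
    used = sum(h.cores_allocated for h in hosts)
    return used / total if total else 0.0


def capacity_action(hosts, threshold=0.80):
    """Return "review" once allocation passes the threshold, else "ok".

    "review" means a human decides: clean up old VMs or buy another host.
    """
    return "review" if fleet_utilization(hosts) > threshold else "ok"


def can_place(hosts, cores_needed):
    """True if some host still has enough free cores for a new VM."""
    return any(h.cores - h.cores_allocated >= cores_needed for h in hosts)


if __name__ == "__main__":
    fleet = [Host("big-01", 64, 60), Host("big-02", 64, 50)]
    print(capacity_action(fleet), can_place(fleet, 8))
```

The point of the sketch is that `can_place` and `capacity_action` are independent: a new VM can still land on reserve capacity while the "review" is being sorted out.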
One (the only?) indisputable benefit of cloud is the ability to scale up faster (elasticity), but most companies don’t really need that. And if you do end up needing it after all, then it’s a good problem to have, as they say.
If your load is very spiky, it might make more sense to use cloud. You pay more for the baseline, but if your spikes are big enough it can still be cheaper than provisioning your own hardware to handle the highest loads.
Of course there's also a possible hybrid approach: you run your own hardware for base load and augment with cloud for spikes. But that's more complicated.
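A rough way to check which side of that line you are on is to compare the three options with back-of-the-envelope arithmetic. The sketch below is only that: every price and load figure is a made-up placeholder, "unit" is an arbitrary slice of capacity, and it ignores staff time, bandwidth and hardware lead time.

```python
def monthly_costs(base_units, peak_units, peak_hours,
                  owned_unit_month, cloud_unit_hour, hours_in_month=730):
    """Compare three ways of serving a spiky load for one month.

    base_units / peak_units: capacity needed normally vs. during spikes
    peak_hours:              hours per month spent at peak
    owned_unit_month:        assumed all-in monthly cost of one owned unit
    cloud_unit_hour:         assumed hourly cloud price for one unit
    """
    # Own everything: you must provision for the highest load.
    own_for_peak = peak_units * owned_unit_month
    # All cloud: pay per hour, scaling between base and peak.
    all_cloud = cloud_unit_hour * (
        base_units * (hours_in_month - peak_hours) + peak_units * peak_hours
    )
    # Hybrid: own the base load, rent only the extra units during spikes.
    hybrid = (base_units * owned_unit_month
              + cloud_unit_hour * (peak_units - base_units) * peak_hours)
    return {"own_for_peak": own_for_peak, "all_cloud": all_cloud, "hybrid": hybrid}


if __name__ == "__main__":
    # Very spiky: 10x the base load for 20 hours a month (placeholder numbers).
    print(monthly_costs(10, 100, 20, owned_unit_month=50.0, cloud_unit_hour=0.20))
    # Nearly flat: peak is only 20% above base.
    print(monthly_costs(10, 12, 20, owned_unit_month=50.0, cloud_unit_hour=0.20))
```

With these placeholder numbers the spiky case favors cloud over owning for peak, the flat case favors owned hardware, and hybrid is cheapest on paper in both, which is exactly where the "more complicated" part has to be weighed in.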
#1: A cloud VM comes with an obligation for someone at the company to maintain it. The cloud does not excuse anyone from doing this.
#2: Sounds like a dysfunctional system. Sure, it may be common, but a medium sized org could easily have some datacenter space and allow any team to rent a server or an instance, or to buy a server and pay some nominal price for the IT team to keep it working. This isn’t actually rocket science.
Sure, keeping a fifteen-year-old server working safely is a chore, but so is maintaining a fifteen-year-old VM instance!
Having redirected of a VM provider or installing a hypervisor on equipment is another thing.
I worked at a supposedly properly staffed company that had raised hundreds of millions in investment, and it was the same thing. VMs running 5-year-old distros that hadn't been updated in years. 600-day uptimes, no kernel patches, ancient versions of Postgres, Python 2.7 code everywhere, etc. This wasn't 10 years ago. This was 2 years ago!
But your comparison isn't fair. The difference between running your own hardware and using the cloud (which is perhaps not even the relevant comparison but let's run with it) is the difference between:
1. Open up the cloud console, and
2. You already have the hardware so you just run "virsh" or, more likely, do nothing at all because you own the API so you have already included this in your Ansible or Salt or whatever you use for setting up a server.
Because ordering a new physical box isn't really comparable to starting a new VM, is it?
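For what "you just run virsh" can look like once it is scripted, here is a minimal sketch. It assumes a libvirt host where the VM is already defined, uses `qemu:///system` as the connection URI, and the domain name `web-01` and both function names are hypothetical. In practice this would more likely live in an Ansible or Salt role than in a standalone script.

```python
import shutil
import subprocess


def virsh_start_cmd(domain, uri="qemu:///system"):
    """Build the argv for starting an already-defined libvirt domain."""
    return ["virsh", "--connect", uri, "start", domain]


def start_vm(domain, uri="qemu:///system", dry_run=True):
    """Start the domain, or just return the command when dry_run is set.

    Also falls back to returning the command if virsh is not installed,
    so the sketch can be tried on a machine without libvirt.
    """
    cmd = virsh_start_cmd(domain, uri)
    if dry_run or shutil.which("virsh") is None:
        return cmd
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return cmd


if __name__ == "__main__":
    # "web-01" is a placeholder domain name.
    print(" ".join(start_vm("web-01")))
```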
"Cloud" has changed that by providing an API to do this, thus enabling IaC approach to building combined hardware and software architectures.
Open their management console, press order now, 15 mins later get your server's IP address.
I.e. most hosting providers give you the option for virtual or dedicated hardware. So does Amazon (metal instances).
Like, "cloud" was always an ill-defined term, but in the case of "how do I provision full servers" I think there's no qualitative difference between Amazon and other hosting providers. Quantitative, sure.
But you still get nickel & dimed and pay insane costs, including on bandwidth (which is free in most conventional hosting providers, and overages are 90x cheaper than AWS' costs).
I can't even imagine how much the US Federal Government is charging American taxpayers to pay AWS for hosting there, it has to be astronomical.
I supported an application with a team of about three people for a regional headquarters in the DoD. We had one stack of aging hardware that was racked, on a handshake agreement with another team, in a nearby facility under that other team's control. We had to periodically request physical access for maintenance tasks and the facility routinely lost power, suffered local network outages, etc. So we decided that we needed new hardware and more of it spread across the region to avoid the shaky single-point-of-failure.
That began a three year process of: waiting for budget to be available for the hardware / license / support purchases; pitching PowerPoints to senior management to argue for that budget (and getting updated quotes every time from the vendors); working out agreements with other teams at new facilities to rack the hardware; traveling to those sites to install stuff; and working through the cybersecurity compliance stuff for each site. I left before everything was finished, so I don't know how they ultimately dealt with needing, say, someone to physically reseat a cable in Japan (an international flight away).
You can start with using a cloud only for VMs and only run services on it using IaaS or PaaS. Very serviceable.