For example, Stack Overflow used to handle all of its traffic from 9 on-prem servers (not sure if this is still the case), serving millions of daily users. Power consumption and hardware cost were completely insignificant in that case.
LLM inference pricing, by contrast, is mostly driven by power consumption and hardware cost (and manufacturing that hardware itself takes a lot of power).
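A rough back-of-envelope sketch of why hardware amortization tends to dominate the per-hour cost of running an inference GPU. All numbers here are illustrative assumptions (GPU price, power draw, electricity rate, lifetime), not measured figures:

```python
# Assumed figures -- purely illustrative, not from the thread above.
gpu_price_usd = 30_000        # assumed datacenter GPU purchase price
gpu_power_kw = 0.7            # assumed sustained power draw (kW)
electricity_usd_per_kwh = 0.10  # assumed industrial electricity rate
lifetime_years = 3            # assumed amortization period

hours = lifetime_years * 365 * 24

# Per-hour cost components
power_cost_per_hour = gpu_power_kw * electricity_usd_per_kwh
hardware_cost_per_hour = gpu_price_usd / hours

print(f"power:    ${power_cost_per_hour:.2f}/h")
print(f"hardware: ${hardware_cost_per_hour:.2f}/h")
```

Under these assumptions the amortized hardware cost is an order of magnitude larger than the electricity bill, which is why GPU capex (and the supply chain behind it) shows up so prominently in inference pricing.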
They just finished their migration to the cloud and unracked their servers a few weeks ago: https://stackoverflow.blog/2025/07/16/the-great-unracking-sa...
For LLMs, the infrastructure and hardware costs are substantially higher than for typical internet apps and storage.