Amazon then charged me one hundred thousand dollars after the server was hit by bot spam. I got them to refund the bill (as in, how was I going to pay it?), but to this day I've hated Amazon with a passion, and if I ever had to use cloud computing I'd use anyone else for that very reason. The entire service, with its horrifically complicated click-through dashboard (but you can get a certification! It's so complicated they invented a fake degree for it!), seems designed to confuse the customer into losing money.
I still blame them for missing an opportunity to be good corporate citizens and fight bot spam by using credit cards for authentication. If I go to the grocery store I can use a credit card to swipe, insert, tap a chip, or palm-read (this is now in fact a thing) to buy a cookie. As opposed to using financial technology for anything actually useful.
Yes, Amazon, and I assume Azure and Google's cloud and others, "usually" refund the money.
But I don't want to be forced into bankruptcy because my five visitor a week demo project suddenly becomes the target of a DDOS for no reason at all and the hosting company decides this isn't a "usually" so please send the wire transfer.
We can't implement a basic cost limiter policy.
I think we all know why.
There's no need to imply that, it's not illegal to criticise AWS. They do not want anybody to be able to set a limit on spend as that would probably hurt the business model.
It's entirely possible to build cloud-first solutions that scale better and are cheaper than your standard reliable colo solutions. But you've got to understand the tradeoffs and when to limit scaling, otherwise things can run away from you. I still reach for "cloud-first" tools when building my own projects because I know how to run them extremely cheaply, without the risk of expenses blowing up because some random thing I've built lands on HN or the equivalent. Many hobby projects, and even small businesses, can leverage free tiers of cloud services almost indefinitely.

But you've got to architect your solutions differently to leverage the advantages and avoid the weaknesses of the cloud. Actually understand the strengths and limitations of the various cloud "functions as a service" offerings, where your needs could be met by those tools, and how to work within those cost constraints. Repeatedly I see people trying to use the cloud as if it's just another colo or datacenter: they build things the same way they did before, think only in terms of virtual machines, have a more difficult time adopting the cloud, and end up spending far more than the companies who can tear down and spin up entire environments through IaC and leverage incremental pricing to their benefit.
https://docs.aws.amazon.com/cost-management/latest/userguide...
What would be helpful, would be if when you set up your account there was a default limit – as in an actual limit, where all projects stop working once you go over it - of some sane amount like $5 or $50 or even $500.
I have a handful of toy projects on AWS and Google cloud. On both I have budgets set up at $1 and $10, with notifications at 10% 50% and 90%. It’s great, but it’s not a limit. I can still get screwed if somehow, my projects become targets, and I don’t see the emails immediately or aren’t able to act on them immediately.
It blows my mind there’s no way I can just say, “there’s no conceivable outcome where I would want to spend more than $10 or more than $100 or whatever so please just cut me off as soon as I get anywhere close to that.”
The only conclusion I can come to is that these services are simply not made for small experimental projects, yet I also don’t know any other way to learn the services except by setting up toy projects, and thus exposing yourself to ruinous liability.
> There can be a delay between when you incur a charge and when you receive a notification from AWS Budgets for the charge. This is due to a delay between when an AWS resource is used and when that resource usage is billed. You might incur additional costs or usage that exceed your budget notification threshold before AWS Budgets can notify you, and your actual costs or usage may continue to increase or decrease after you receive the notification.
This is one reason why I remain clueless about anything related to cloud infrastructure unless it's something I'm doing on the job, and why I'm not willing to build anything on these stacks.
And while I have fewer than 10 products built with these technologies, I am appalled by the overall reliability of the services.
Oh, and lastly, for Azure: in certain European regions you can't provision resources yourself; you have to go through your account representative, who requests authorization from the US. So much for not having to deal with infrastructure pain. It's just a joke.
And as others have also mentioned, the reports have a delay. In many cases it’s several hours. But worst case, your CURs (Cost usage reports) don’t really reflect reality for up to 24 hours after the fact.
Is this a perfect solution? No.
Is this still a solution? Yes.
You get a warning. There's no service cutoffs or hard limits on spending.
You can ring up tens of thousands+ overnight with AWS. The scale of potential damages is nowhere even close.
Unlike cloud services, your electrical service has a literal circuit breaker. Got a regular three-phase 230V 25A hookup? You are limited to 17.25kW, no way around that. If that shithead neighbor tries to draw 50kW, the breaker will trip.
If it were the cloud, the power company would conveniently come by to upgrade your service instead. A residential home needing a dedicated 175MW high-voltage substation hookup? Sure, why not!
Water leaks, on the other hand, tend to be very noticeable. If a pipe bursts in the attic you'll end up with water literally dripping from the ceiling. It is very rare to end up with a water leak large enough to be expensive, yet small enough to go unnoticed. On the other hand, the cloud will happily let your usage skyrocket - without even bothering to send you an email.
There are plenty of compute service providers working with a fixed cap, a pre-pay system, or usage alerts. The fact that the big cloud providers don't is a deliberate choice: the goal is to make the user pay more than they wanted to.
As always, it just doesn’t make an awful lot of sense to compare physical and virtual worlds. As in leaving your front door unlocked in rural areas vs not securing your remote shell access.
The broken water pipe should be covered by buildings insurance, but I can imagine it not being covered by some policies. Luckily a broken water pipe is likely not as expensive as not having e.g. third party liability protection if part of your roof falls off and hits someone.
For the cloud, I have the good will of the cloud provider and appealing to social media. Not the same thing.
I think one of the reasons I appreciate AWS so much is that any time there has been a snafu that led to a huge bill like this, they've made it pretty painless to get a refund - just like you experienced.
That’s an insane amount of both money and stress. You’re at Amazon’s mercy as to whether they will refund it. And while this is in process, you’re wondering if your entire financial future is ruined.
For a postpaid service with usage-based billing, there are no separate "free" and "paid" plans (= what you're clearly thinking of when you're saying "tiers" here.)
The "free tier" of these services, is a set of per-usage-SKU monthly usage credit bonuses, that are set up in such a way that if you are using reasonable "just testing" amounts of resources, your bill for the month will be credited down to $0.
And yes, this does mean that even when you're paying for some AWS services, you're still benefitting from the "free tier" for any service whose usage isn't exceeding those free-tier limits. That's why it's a [per-SKU usage] tier, rather than a "plan."
If you're familiar with electricity providers telling you that you're about to hit a "step-up rate" for your electricity usage for the month — that's exactly the same type of usage tier system. Except theirs goes [cheap usage] -> [expensive usage], whereas IaaS providers' tiers go [free usage] -> [costed usage].
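The per-SKU credit model described above can be sketched in a few lines. The SKU names, rates, and free allowances below are illustrative, not real AWS pricing:

```python
# Toy model of a postpaid bill with per-SKU free-tier credits, as described
# above. SKU names, rates, and free allowances are made up for illustration.

FREE_TIER = {          # free units credited per month, per SKU
    "lambda-requests": 1_000_000,
    "s3-get-requests": 20_000,
}
RATES = {              # price per unit beyond the free allowance
    "lambda-requests": 0.0000002,
    "s3-get-requests": 0.0000004,
    "data-transfer-gb": 0.09,    # no free allowance for this SKU
}

def monthly_bill(usage):
    """Credit each SKU down by its free allowance, then price the rest."""
    total = 0.0
    for sku, units in usage.items():
        billable = max(0, units - FREE_TIER.get(sku, 0))
        total += billable * RATES[sku]
    return round(total, 2)

# A "just testing" month: everything fits inside the free allowances...
print(monthly_bill({"lambda-requests": 50_000, "s3-get-requests": 5_000}))   # 0.0
# ...until one SKU (here, data transfer) has no free allowance at all.
print(monthly_bill({"lambda-requests": 50_000, "data-transfer-gb": 100}))    # 9.0
```

Note there's no "plan" object anywhere: the tier is just a per-SKU credit applied at billing time, which is exactly why paid usage of one service coexists with free-tier usage of another.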
> Amazon should halt the application when it exceeds quota.
There is no easy way to do this in a distributed system (which is why IaaS services don't even try; and why their billing dashboards are always these weird detached things that surface billing only in monthly statements and coarse-grained charts, with no visibility into the raw usage numbers.)
There's a lot of inherent complexity in converting "usage" into "billable usage." It involves not just muxing usage credit-spend together, but also classifying spend from each system into a SKU [where the appropriate bucket for the same usage can change over time], and then a lot of lookups into various control-plane systems to figure out whether any bounded or continuous discounts and credits should be applied to each SKU.
And that means that this conversion process can't happen in the services themselves. It needs to be a separate process pushed out to some specific billing system.
Usually, this means that the services that generate billable usage are just asynchronously pushing out "usage-credit spend events" into something like a log or message queue; and then a billing system is, asynchronously, sucking these up and crunching through them to emit/checkpoint "SKU billing events" against an invoice object tied to a billing account.
Due to all of the extra steps involved in this pipeline, the cumulative usage that an IaaS knows about for a given billing account (i.e. can fire a webhook when one of those billing events hits an MQ topic) might be something like 5 minutes out-of-date of the actual incoming usage-credit-spend.
Which means that, by the time any "trigger" to shut down your application because it exceeded a "quota" went through, your application would have already spent 5 minutes more of credits.
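The asynchronous pipeline described above can be reduced to a toy simulation (the queue, batch rate, and credit amounts are made up) to show why any quota check keyed off billed totals necessarily lags real spend:

```python
import collections

# Toy version of the pipeline described above: services emit usage events
# into a queue; a billing worker drains the queue on its own schedule, so
# the "billed" total always lags the real cumulative spend.

queue = collections.deque()
billed_total = 0.0   # what the billing system (and any quota check) can see
real_total = 0.0     # what the account has actually consumed

def emit_usage(credits):
    """A data-plane service records spend; billing doesn't see it yet."""
    global real_total
    real_total += credits
    queue.append(credits)

def billing_tick(max_events=3):
    """The async billing worker processes a batch of events per tick."""
    global billed_total
    for _ in range(min(max_events, len(queue))):
        billed_total += queue.popleft()

# Burn credits faster than billing reconciles them:
for _ in range(10):
    emit_usage(5.0)
billing_tick()  # the worker has only caught up on a fraction of the spend

print(real_total, billed_total)   # 50.0 15.0
# A quota trigger keyed off billed_total would fire far too late.
```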
And again, for a large, heavily-loaded application — the kind these services are designed around — that extra five minutes of usage could correspond to millions of dollars of extra spend.
Which is, obviously, unacceptable from a customer perspective. No customer would accept a "quota system" that says you're in a free plan, yet charges you, because you accrued an extra 5 minutes of usage beyond the free plan's limits before the quota could "kick in."
But nor would the IaaS itself just be willing to eat that bill for the actual underlying costs of serving that extra 5 minutes of traffic, because that traffic could very well have an underlying cost of "millions of dollars."
So instead they just say "no, we won't implement a data-plane billable-usage-quota feature; if you want it, you can either implement it yourself [since your L7 app can observe its usage 'live' much better than our infra can] or, more idiomatically to our infra, you can ensure that any development project is configured with appropriate sandboxing + other protections to never get into a situation where any resource could exceed its free-tier-credited usage in the first place."
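The "implement it yourself" option can be sketched as an application-level hard cap; the per-request cost estimate and the ceiling below are assumed numbers, not anything a provider gives you:

```python
# Sketch of the "do it yourself at L7" option: the app estimates its own
# per-request cost and hard-stops at a self-imposed cap, since it can see
# its usage live while the provider's billing pipeline lags behind.

COST_PER_REQUEST = 0.002   # assumed blended cost estimate, $/request
HARD_CAP = 10.00           # self-imposed monthly ceiling, $

spent = 0.0

def handle_request(payload):
    global spent
    if spent + COST_PER_REQUEST > HARD_CAP:
        return (503, "budget exhausted")   # fail closed instead of billing up
    spent += COST_PER_REQUEST
    return (200, f"processed {payload!r}")

# Under the cap, requests succeed; past it, the app refuses work.
for i in range(5001):
    status, _ = handle_request(i)
print(status, round(spent, 2))   # 503 10.0
```

The tradeoff is the one the thread keeps circling: failing closed means your own app serves the "denial of service," rather than the provider serving you the bill.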
If you woke up to find they'd auto-withdrawn $100k from your bank account and now you need to get it back, that's another story.
At minimum they should provide hard billing caps.
On serverless, I can enter numbers in a calculator and guess that running my little toy demo app on AWS will cost between $1 and $100. Getting hit with a huge $1000 bill and a refusal to refund the charges (and revocation of my Prime account and a lifetime ban from AWS and cancellation of any other services I might otherwise run there) would be totally possible, but I have zero control over that. Expecting to go on social media begging for a refund is not a plan, it's evidence of a broken system - kinda like those "heartwarming" posts about poor people starting a GoFundMe so their child can afford cancer treatment. No, that's awful, can we just be sensible instead?
If a server would have cost me $20 at a VPS provider to keep a machine online 24/7 that was at 1% utilization most of the time and was terribly laggy or crashed when it went viral, that's what $20 buys you.
But, you say, analysis of actual traffic says that serverless would only cost me $10 including scaling for the spike, in which case that's a fantastic deal. Half price! Or maybe it would be $100, 5x the price. I have no way of knowing in advance.
It's just not worth the risk.
Also a vital lesson from the big tech companies that sell a wide variety of services: don't get your cloud hosting from a company that you also use other services from.
I had to disable photo syncing because Google photos eats up my Gmail space. Having Amazon's cloud billing fuckup threaten your TV access is another level.
We clearly need to keep the option open to burn those bridges.
In any case, if I ever host anything, I'm going to host it from my home.
They do this, and make it easy to get a refund, because for every demo account that gets one, some bigger account accidentally gets billed $10K and has to pay it. Those accounts have skin in the game and cannot risk being down for any period of time.
As I asked before, if what is causing overages is not web requests but storage should they just delete everything?
It's the easiest thing in the world - they just don't want to, because they figured they could use their scale to screw over their customers. And now the same guys who screwed everyone over with cloud compute want you to pay for AI, using their monopoly position to charge you economic rents. Things like edge compute are easy now because everyone overspent on hard drives during crypto. And so you have jerks who just move on to the next thing and use their power to abuse the market rather than build credibility, because the market incentivizes building bubbles and bad behavior.
Smart evil people who tell others "no you're just too dumb to 'get it' (oh by the way give me more money before this market collapses)" are the absolute bane of the industry.
It's weird that you have people in here defending the practice as if it's a difficult thing to do. Taxi cabs somehow manage not to charge you thousands of dollars for places you don't drive to but you can't set up an if statement on a server? So you're saying Amazon is run by people that are dumber than a taxi cab company?
Ok, well you might have a point. And this is how Waymo was started. I may or may not be kidding.
They refunded you $100k with few questions asked, and you hate them for it?
I’ve made a few expensive mistakes on AWS that were entirely my fault, and AWS has always refunded me for them.
I imagine if Amazon did implement “shut everything down when I exceed my budget” there’d be a bunch of horror stories like “I got DDoSed and AWS shut down all my EC2s and destroyed the data I accidentally wrote to ephemeral storage.”
They exposed him to 100K of liability without any way to avoid it (other than to avoid AWS entirely), and then happened to blink, in this case, with no guarantee that it would happen again. If you don't happen to have a few hundred thousand liquid, suddenly getting a bill for 100K might well be a life-ruiningly stressful event.
I mean, S3 also incurs ongoing charges, so if you're going to stop accruing charges you'd also be deleting your data that wasn't on ephemeral storage...
And potentially deleting all of your DNS zones (and recreating them will likely give you different nameservers so you'll need to wait for the registrar to update them once you're back)...
And...
Didn't the bootcamp at least tell you to set up a budget alert?
I'm not trying to reduce AWS' responsibility here, but if a teaching program tells you to use AWS but doesn't teach you how to use it correctly, you should question both AWS and the program's methods.
I feel like this brand of sentiment is everywhere. Folks want things simple. We often figure out what we need to do to get by.
Over time we learn the reasons for a handful of the options we initially defaulted through, and find cause to use them. Some intrepid explorers have enough broader context and interest to figure much more out, but mostly we just set and forget, remembering only the sting of facing our own ignorance and begrudging the options.
This is why k8s and systemd have such a loud anti-following.
It does on the surface, but what doesn't make sense is to register with a credit card and not read the terms very carefully: both for the cloud service and for the bank service.
In this aspect cash is so much better because you have only one contract to worry about...
Is it just me or is this just a cheap excuse to grab a payment method from unsuspecting free-tier users?
With that said, AWS is notoriously opaque in terms of "how much will I pay for this service" because they bill so many variable facets of things, and I've never really trusted the free tier myself unless I made sure it wasn't serving the public.
Given the relative accessibility of stolen credit card info, isn't the CC-as-ID requirement easy for a criminal to bypass?
It's so easy to get billed a ridiculous amount of money.
I call bullshit
That would make you one of the most successful websites on the internet, or the target of a DDoS -- which was it? I assume you're not saying that "bots" would randomly hit a single, brand-new "hello world" site enough to generate that kind of bill.
AWS also provides training and education on how to use their services. If launching a "hello world" Elastic Beanstalk instance is so dangerous, why doesn't the tutorial require you to first provide proof that you are an AWS Certified Cloud Practitioner?
An idea I can stand behind. Or do you just let any "self-learner" take care of your banking Oracle or IBM DB2 database...?
> as in how am I going to pay it?
Really?
Amazon charged your card for $100,000 and your bank allowed it?
You're filthy rich by most people's standards, and you were able to pay it.
Amazon was operating in such good faith that they ate the computational cost you incurred. And you hate them for this to this day...
By that logic, any technology that you can get certified in is too complicated?
Most systems are now distributed and presenting a holistic view of how it was designed to work can be useful to prevent simple mistakes.
Traffic requires a certification (license) too. Must be a fake degree as well because they made it too complicated
That is a common view in UX, yes. It's a bit of an extreme view, but it's a useful gut reaction
> Traffic requires a certification (license) too. Must be a fake degree as well because they made it too complicated
In the US roads are designed so that you need as close to no knowledge as possible. You need to know some basic rules like the side of the road you drive on or that red means stop, but there is literal text on common road signs so people don't have to learn road signs. And the driving license is a bit of a joke, especially compared to other Western countries
There is something to be said about interfaces that are more useful for power users and achieve that by being less intuitive for the uninitiated. But especially in enterprise software the more prevalent effect is that spending less time and money on UX directly translates into generating more revenue from training, courses, paid support and certification programs
But in fact, these are intended side effects. Things like jaywalking or "no spitting" laws let police officers harass more people _at their whim_. They're fully designed that way, but left as "unintended" to deflect broader public scrutiny.
So learn that "logic" is not some magic thing you can sprinkle on everything to arrive at some higher moral or ethical reality. You have to actually integrate the impact through multiple levels of interaction to see the real problem with the "it's just logic, bro" response you got here.
It is a fake degree.
In IT, I am inclined to agree with that. In real engineering, it's sometimes necessary, especially for dangerous technology and technology that people trust with their lives.
Software runs on so many things we depend on IMO it also in many cases falls in the "dangerous technology" category.
Non-hobby OSes, non-hobby web browsers, device drivers, software that runs critical infrastructure, software that runs on network equipment, software that handles personal data: IMHO it would not be unreasonable to require formal qualifications for developers working on any of those.
They get a nice tax write-off.
It's couch-cushion change for them, but it adds up. They have whole armies of beancounters, dreaming this stuff up.
It's also the game behind those "coupons" you get, for outrageously-priced meds that aren't covered by insurance.
If they charge $1,000 for the medication but give you a "special discount" of 90%, they get to write off $900.
Businesses are only taxed on actual revenue earned.
What you decide to charge—whether $100, $50, or even giving it away for free—is purely a business decision, not a tax one.
—
This is different from a nonprofit donation scenario though. For example, if your service normally costs $X but you choose to provide it for free (or at a discount) as a donation to a non-profit, you can typically write off the difference.
I’ve heard stories like this, many times, from business people.
They certainly believe in the pattern.
I don't want to go too far down the rabbit hole of hn speculation, but if another entity owes you 100k, and they go bankrupt, there absolutely are tax implications.
Even if you rig up your own spending watchdog which polls the clouds billing APIs, you're still at the mercy of however long it takes for the cloud to reconcile your spending, which often takes hours or even days.
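A toy version of that watchdog, with the billing API stubbed out (real cost endpoints and their reconciliation lag vary by provider), shows the failure mode:

```python
import itertools

# Sketch of the DIY spending watchdog described above. The billing API is a
# stub: real cost endpoints reconcile hours or days behind actual usage,
# which is exactly the problem.

RECONCILE_LAG = 3   # billing API runs this many polling intervals behind

actual_spend = [1, 2, 4, 8, 16, 32, 64, 128]   # real spend per interval ($)

def billing_api(tick):
    """Stub: reports cumulative spend as of RECONCILE_LAG intervals ago."""
    visible = actual_spend[:max(0, tick - RECONCILE_LAG)]
    return sum(visible)

CAP = 20.0
for tick in itertools.count():
    if billing_api(tick) >= CAP:
        break   # the watchdog finally pulls the plug...
    if tick >= len(actual_spend):
        break

# ...but by the time the lagging API crossed the cap, the account had
# already run up a far larger real total.
print(billing_api(tick), sum(actual_spend[:tick]))   # 31 255
```

With exponentially growing spend, the watchdog trips at $31 against a $20 cap while the real bill is already $255: polling a delayed billing API bounds nothing during a runaway.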
It’s basic stuff.
You forgot to mention Stanley Tools paid for the hospital bill.
If a tool is designed for experts, but you as the manufacturer or distributor know the tool is used by general populace, you know it's being misused every now and then, you know it harms the user AND YOU KNOW YOU BENEFIT FROM THIS HARM, AND YOU COULD EASILY AVOID IT - that sounds like something you could go to jail for.
I think if Amazon was a Polish company, it would be forced by UOKiK (Office of Competition and Consumer Protection) to send money to every client harmed this way. I actually got ~$150 this way once. I know in USA the law is much less protective, it surprises me Americans aren't much more careful as a result when it comes to e.g. reading the terms of service.
https://medium.com/@maciej.pocwierz/how-an-empty-s3-bucket-c...
About how you make unauth’d API calls to an s3 bucket you don’t own to run up the costs. That was a new one for me.
Agreed about that. I was hired onto a team that inherited a large AWS Lambda backend and the opacity of the underlying platform (which is the value proposition of serverless!) has made it very painful when the going gets tough and you find bugs in your system down close to that layer (in our case, intermittent socket hangups trying to connect to the secrets extension). And since your local testing rig looks almost nothing like the deployed environment...
I have some toy stuff at home running on Google Cloud Functions and it works fine (and scale-to-zero is pretty handy for hiding in the free tier). But I struggle to imagine a scenario in a professional setting where I wouldn't prefer to just put an HTTP server/queue consumer in a container on ECS.
And does some of their suggested solutions actually work or not...
There are some workloads that are suitable for lambda but they are very rare compared to the # of people who just shove REST APIs on lambda "in case they need to scale."
If you can't run locally, productivity drops like a rock. Each "cloud deploy" wastes tons of time.
It’s still not perfect because the code is running locally but it allows “instant” updates after you make local changes and it’s the best I’ve found.
We then ended up deleting the S3 bucket entirely, as that appeared to be the only way to get rid of the charges, only for AWS to come back to us a few weeks later telling us there were charges for an S3 bucket we previously owned. After explaining to them (again) that this was our only option for getting rid of the charges, we never heard back.
It is amazing, isn't it? Something starts as an oversight but by the time it reaches down to customer support, it becomes an edict from above as it is "expected behavior".
> AWS was kind enough to cancel my S3 bill. However, they emphasized that this was done as an exception.
The stench of this bovine excrement is so strong that it transcends space time somehow.
That's the best part!
The devs probably never thought of it; the support people who received the complaints were probably either unable to reach the devs or too time-crunched to do so; and what project manager would want to say they told their devs to fix an issue that will lose the company money?
Anyone wanna guess which open source tool this was? I'm curious to know why they never detected this themselves. I'd like to avoid this software if possible as the developers seem very incompetent.
What are the odds?
(Not a rhetorical question. I don't know how the choice of names works.)
Customers demand frictionless tools for automatically spinning up a bunch of real-world hardware. If you put this in the hands of inexperienced people, they will mess up and end up with huge bills, and you take a reputational hit for demanding thousands of dollars from the little guy. If you decide to vet potential customers ahead of time to make sure they're not so incompetent, then you get a reputation as a gatekeeper with no respect for the little guy who's just trying to hustle and build.
I always enjoy playing at the boundaries in these thought experiments. If I run up a surprise $10k bill, how do we determine what I "really should owe" in some cosmic sense? Does it matter if I misconfigured something? What if my code was really bad, and I could have accomplished the same things with 10% of the spend?
Does it matter who the provider is, or should that not matter to the customer in terms of making things right? For example, do you get to demand payment on my $10k surprise bill because you are a small team selling me a PDF generation API, even if you would ask AWS to waive your own $10k mistake?
At AWS I’d consistently have customers who’d architected horrendously who wanted us to cover their 7/8 figure “losses” when something worked entirely as advertised.
Small businesses often don’t know what they want, other than not being responsible for their mistakes.
Having said that, within AWS there are the concepts of "budget" and "budget action" whereby you can modify an IAM role to deny costly actions. When I was doing AWS consulting, I had a customer who was concerned about Bedrock costs, and it was trivial to set this up with Terraform. The biggest PITA is that it takes like 48-72 hours for all the prerequisites to be available (cost data, cost allocation tags, and an actual budget each can take 24 hours)
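The budget-plus-budget-action setup described above looks roughly like this as boto3-style request payloads. The account ID, ARNs, and role names are placeholders, and the payloads are only built here, not sent:

```python
# The "budget action" setup described above, sketched as boto3-style request
# payloads. All ARNs/IDs below are placeholders; a real setup would pass
# these to a budgets client (client.create_budget(...) and
# client.create_budget_action(...)) and then, as noted, wait a day or more
# for the prerequisite cost data to materialize.

budget = {
    "BudgetName": "bedrock-cap",
    "BudgetType": "COST",
    "TimeUnit": "MONTHLY",
    "BudgetLimit": {"Amount": "100", "Unit": "USD"},
}

budget_action = {
    "BudgetName": "bedrock-cap",
    "NotificationType": "ACTUAL",
    "ActionType": "APPLY_IAM_POLICY",      # attach a deny policy to cut off spend
    "ActionThreshold": {"ActionThresholdValue": 90.0,
                        "ActionThresholdType": "PERCENTAGE"},
    "Definition": {"IamActionDefinition": {
        "PolicyArn": "arn:aws:iam::123456789012:policy/DenyBedrock",  # placeholder
        "Roles": ["app-role"],                                        # placeholder
    }},
    "ExecutionRoleArn": "arn:aws:iam::123456789012:role/BudgetsActionRole",  # placeholder
    "ApprovalModel": "AUTOMATIC",          # act without manual sign-off
}

print(budget_action["ActionType"], budget["BudgetLimit"]["Amount"])
```

Note this denies *future* costly API calls via IAM; it doesn't stop charges from resources that are already running, which is why it's a mitigation rather than a hard cap.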
Imagine the horror stories on Hacker News that would generate.
Set it up so that machines are deleted, but EBS volumes remain. S3 bucket is locked-out but data is safe.
One of those things is more important to different types of business. In some situations, any downtime at all is worth thousands per hour. In others, the service staying online is only worth hundreds of dollars a week.
So yes, the solution is as simple as giving the user hard spend caps that they can configure. I'd also set the default limits low for new accounts with a giant, obnoxious, flashing red popover that you cannot dismiss until you configure your limits.
However, this would generate less profit for Amazon et al. They have certainly run this calculation and decided they'd earn more money from careless businesses than they'd gain in goodwill. And we all know that goodwill has zero value to companies at FAANG scale. There's absolutely no chance that they haven't considered this. It's partially implemented and an incredibly obvious solution that everyone has been begging for since cloud computing became a thing. The only reason they haven't implemented this is purely greed and malice.
Maybe rather than completely stopping the service, it'd be better to rate limit the service when approaching/reaching the cap.
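That soft-throttle idea can be sketched as an admission function that ramps down as spend approaches the cap; the budget and threshold numbers are illustrative:

```python
# One way to sketch "rate limit as you approach the cap" instead of a hard
# cutoff: scale the fraction of admitted requests down as spend nears the
# budget. Numbers here are illustrative.

BUDGET = 100.0
SOFT_THRESHOLD = 80.0   # begin throttling here

def admit_fraction(spent):
    """1.0 below the soft threshold, ramping linearly to 0.0 at the cap."""
    if spent >= BUDGET:
        return 0.0
    if spent < SOFT_THRESHOLD:
        return 1.0
    return (BUDGET - spent) / (BUDGET - SOFT_THRESHOLD)

print(admit_fraction(50.0))   # 1.0  -> normal service
print(admit_fraction(90.0))   # 0.5  -> shed half the load
print(admit_fraction(100.0))  # 0.0  -> cap reached, stop
```

This degrades gracefully for legitimate spikes while still bounding the worst case, though it inherits the same caveat as any cap: it only works if the spend number feeding it is reasonably fresh.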
If your business suddenly starts generating Tbs of traffic (that is not a ddos), you'd be thrilled to pay overage fees because your business just took off.
You don't usually get $10k bandwidth fees because your misconfigured service consumes too much CPU.
And besides that, for most of these cases, a small business can host on-prem with zero bandwidth fees of any type, ever. If you can get by with a gigabit uplink, you have nothing to worry about. And if you're at the scale where AWS overages are a real problem, you almost certainly don't need more than you can get with a surplus server and a regular business grade fiber link.
This is very much not an all-or-nothing situation. There is a vast segment of industry that absolutely does not need anything more than a server in a closet wired to the internet connection your office already has. My last job paid $100/mo for an AWS instance to host a GitLab server for a team of 20. We could have gotten by with a junk laptop shoved in a corner and got the exact same performance and experience. It once borked itself after an update and railed the CPU for a week, which cost us a bunch of money. Would never have been an issue on-prem. Even if we got DDoSed or somehow stuck saturating the uplink, our added cost would be zero. Hell, the building was even solar powered, so we wouldn't have even paid for the extra 40W of power or the air conditioning.
I worked for a small venture-funded "cloud-first" company and our AWS bill was a sawtooth waveform. Every month the bill would creep up by a thousand bucks or so, until it hit $20k at which point the COO would notice and then it would be all hands on deck until we got the bill under $10k or so. Rinse and repeat but over a few years I'm sure we wasted more money than many of the examples on serverlesshorrors.com, just a few $k at a time instead of one lump.
Those mechanisms would lead to a large reduction in their "engineering" staff and the loss of potential future bragging rights in how modern and "cloud-native" their infrastructure is, so nobody wants to implement them.
Sure they're probably VMs but their cost isn't 0 either
And in a lot of cases it's hard to find out if a production application can be switched off. Since the cost is typically small for an unused application, I don't know if there are many people willing to risk being wrong
Pardon my ignorance, but isn’t that something that can happen to anyone? Uncached objects are not something as serious as leaving port 22 open with a weak password (or are they?). Also, aren’t S3 resources (like images) public, so that anyone can hit them as many times as they want?
I'm glad I use a Hetzner VPS. I pay about EUR 5 monthly, and never have to worry about unexpected bills.
The trade-off being that your site falls over with some amount of traffic. That's not a criticism, that may be what you want to happen – I'd rather my personal site on a £5 VPS fell over than charged me £££.
But that's not what many businesses will want, it would be very bad to lose traffic right at your peak. This was a driver for a migration to cloud hosting at my last company, we had a few instances of doing a marketing push and then having the site slow down because we couldn't scale up new machines quickly enough (1-12 month commitment depending on spec, 2 working day lead time). We could quantify the lost revenue and it was worth paying twice the price for cloud to have that quick scaling.
Buckets are used for backups, user uploads, and lots of things other than distributing files publicly.
The real question is whether they considered caching and configured it appropriately. If you don't, you're telling everyone you want every request to go to origin.
And it's getting harder and harder to make them public because of people misconfiguring them and then going public against AWS when they discover the bill.
It's not that hard to configure access controls, they're probably cutting corners on other areas as well. I wouldn't trust anything this person is responsible for.
You just shouldn't be using S3 to serve files directly. You can run most public and many private uses through CloudFront. Which gives you additional protections and reduces things like per object fetch costs.
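One common way to do what this comment describes is a bucket policy that only allows reads from a specific CloudFront distribution (the Origin Access Control pattern), so nobody can hammer the bucket directly. A minimal sketch, with placeholder bucket name, account ID, and distribution ARN:

```python
import json

# Placeholder identifiers -- substitute your own bucket, account, and
# distribution. The pattern blocks direct S3 reads so that only CloudFront
# (via Origin Access Control) can fetch objects from the bucket.
BUCKET = "example-assets-bucket"
DISTRIBUTION_ARN = "arn:aws:cloudfront::111122223333:distribution/EDFDVBD6EXAMPLE"

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "AllowCloudFrontReadOnly",
        "Effect": "Allow",
        "Principal": {"Service": "cloudfront.amazonaws.com"},
        "Action": "s3:GetObject",
        "Resource": f"arn:aws:s3:::{BUCKET}/*",
        # Only requests made on behalf of this one distribution are allowed.
        "Condition": {"StringEquals": {"AWS:SourceArn": DISTRIBUTION_ARN}},
    }],
}

print(json.dumps(policy, indent=2))
```

With this in place, every public fetch goes through CloudFront, where caching and the cheaper per-request pricing apply.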
> you hit natural rate limits
Seen by your customers or the public as a "denial of service." Which may actually be fine for the people who truly do want to limit their spending to less than $100/month.
with AWS, you wake up to a six-figure bill.
To me, "serverless" is when the end user downloads the software, and thereafter does not require an Internet connection to use it. Or at the very least, if the software uses an Internet connection, it's not to send data to a specific place, under the developer's control, for the purpose of making the software system function as advertised.
With "Serverless", your code is in a "function as a service" model where all you have to worry about is the business logic (your code). You don't have to set up the server, you don't have to install the server OS, or any basic server software that is needed to support the business logic code (http server, etc). You don't have to update the server or the underlying server software. You don't have to perform any maintenance to keep the server running smoothly. You never (typically) have to worry about your server going down. All you have to do is upload your business logic function "somewhere" and then your code runs when called. Essentially you do not have to deal with any of the hassle that comes with setting up and maintaining your own "server", all you have to do is write the code that is your business logic.
That's why it's called "Serverless" because you don't have to deal with any of the hassle that comes with running an actual "server".
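A minimal sketch of what "just the business logic" looks like in practice, assuming an AWS-Lambda-style Python runtime and an API-Gateway-like proxy event shape:

```python
# A Lambda-style handler: the platform invokes this function per request;
# everything else (OS, HTTP server, patching, scaling) is the provider's job.
def handler(event, context=None):
    # Assumed event shape: API-Gateway-like proxy event with query parameters.
    name = (event.get("queryStringParameters") or {}).get("name", "world")
    return {"statusCode": 200, "body": f"Hello, {name}!"}

# Locally it's just a function call -- no server process required.
print(handler({"queryStringParameters": {"name": "HN"}}))
```

Everything below that function signature, from the HTTP listener to the machine it runs on, is the part "serverless" makes someone else's problem.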
Also known as "shared hosting". It's been done since the 90's (your folder full of PHP files is an NFS mount on multiple Apache servers), just that the techbros managed to rebrand it and make it trendy.
The serverless function has higher-order features included as part of the package: you get an automatic runtime (just as with PHP but in this case it can be golang or dotnet), the function gets a unique endpoint URL, it can be triggered by events in other cloud services, you get execution logging (and basic alerting), multiple functions can be chained together (either with events or as a state machine), the function's compute can be automatically scaled up depending on the traffic, etc.
Think of it as: what do I have to do to scale up the compute behind this URL? For hardware it's a call to Dell to order parts; for VMs or containers it's a matter of scaling up that runtime or adding more instances, and neither process is simple to automate. One key characteristic of the function is that it will scale horizontally basically however much you want (not fully true; AWS has a limit of around 1,500 instances per second, IIRC, but that's pretty massive), and it will do so automatically, without the request sources ever noticing.
Functions are also dirt cheap for low/burst traffic, and deployment is almost as easy as in the PHP FTP example. Personally I also think they are easier to test than traditional apps, due to their stateless nature and limited logical size (one endpoint). The main downsides are cost for sustained load, and latency for cold starts.
With that said, they are not "endgame". Just a tool - a great one for the right job.
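The "dirt cheap for burst, expensive for sustained load" tradeoff is easy to put numbers on. A back-of-envelope sketch, using assumed per-invocation prices roughly in line with Lambda's published us-east-1 rates (free tier and egress ignored; check current pricing before relying on these figures):

```python
# Rough break-even: FaaS per-invocation pricing vs. a flat-rate VPS.
# Prices are assumptions for illustration, not quoted rates.
PRICE_PER_REQUEST = 0.20 / 1_000_000   # $ per invocation
PRICE_PER_GB_SECOND = 0.0000166667     # $ per GB-second of compute
MEM_GB = 0.125                         # 128 MB function
DURATION_S = 0.1                       # 100 ms per invocation

cost_per_call = PRICE_PER_REQUEST + MEM_GB * DURATION_S * PRICE_PER_GB_SECOND
vps_monthly = 5.00                     # flat-rate VPS for comparison

break_even_calls = vps_monthly / cost_per_call
print(f"cost per call: ${cost_per_call:.9f}")
print(f"break-even: ~{break_even_calls / 1e6:.1f}M requests/month "
      f"(~{break_even_calls / (30 * 24 * 3600):.1f} req/s sustained)")
```

Under these assumptions the function stays cheaper than a $5 VPS until somewhere north of ten million requests a month; past a few requests per second of sustained traffic, the flat-rate box wins.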
More generally, I don't like that a term ending with "-less" marks an increase in system complexity.
This is great if you are willing to completely change your client-server code to work efficiently in this environment. It is a strain over a standard design and you should only be using it when you truly need what "serverless" provides.
They don't understand what I mean by that. That's okay, they'll learn!
Anyway, this kind of thing comes up regularly on Hacker News, so let's just short-circuit some of the conversations:
"You can set a budget!" -- that's just a warning.
"You should watch the billing data more closely!" -- it is delayed up to 48 hours or even longer on most cloud services. It is especially slow on the ones that tend to be hit the hardest during a DDoS, like CDN services.
"You can set up a lambda/function/trigger to stop your services" -- sure, for each individual service, separately, because the "stop" APIs are different, if they exist at all. Did I mention the 48 hour delay?
"You can get a refund!" -- sometimes, with no hard and fast rules about when this applies except for out of the goodness of some anonymous support person's heart.
"Lots of business services can have unlimited bills" -- not like this, where buying what you thought was "an ice cream cone" can turn into a firehose of gelato costing $1,000 per minute because your kid cried and said he wanted more.
"It would be impossible for <cloud company> to put guardrails like that on their services!" -- they do exactly that, but only when it's their money at risk. When they could have unlimited expenses with no upside, then suddenly, magically, they find a way. E.g.: See the Azure Visual Studio Subscriber accounts, which have actual hard limits.
"Why would you want your cloud provider to stop your business? What if you suddenly go viral! That's the last thing you'd want!" -- who said anything about a business? What if it's just training? What if your website is just marketing, with no "profit per view" in any direct sense?
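The "budgets are just a warning" point is visible in the API itself. A hedged sketch using the real AWS Budgets `create_budget` call (account ID and email are placeholders; the call is left unexecuted since it needs credentials), showing that all you can attach to a threshold is a notification:

```python
import json

# Sketch: what an AWS Budget actually is -- a threshold plus a notification.
# Nothing here stops any service; it only tells someone after the fact.
budget = {
    "BudgetName": "monthly-cap",
    "BudgetLimit": {"Amount": "100", "Unit": "USD"},
    "TimeUnit": "MONTHLY",
    "BudgetType": "COST",
}
notification = {
    "Notification": {
        "NotificationType": "ACTUAL",
        "ComparisonOperator": "GREATER_THAN",
        "Threshold": 80.0,          # alert at 80% of the limit
        "ThresholdType": "PERCENTAGE",
    },
    "Subscribers": [{"SubscriptionType": "EMAIL", "Address": "ops@example.com"}],
}

if __name__ == "__main__":
    # Requires AWS credentials; sketched only.
    # import boto3
    # boto3.client("budgets").create_budget(
    #     AccountId="111122223333",
    #     Budget=budget,
    #     NotificationsWithSubscribers=[notification],
    # )
    print(json.dumps({"budget": budget, "alert_only": True}, indent=2))
```

Note that the subscriber is an email address, not a stop action: when the threshold trips, someone gets mail while the meter keeps running.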
Creating a new word for a more specific category is never Orwellian. The project in 1984 was to create a language which was less expressive. They were destroying words describing fine distinctions and replacing them with words that elided those distinctions. Creating a new word to highlight a distinction is the opposite.
There's definitely criticisms to be made of the term serverless and how it obscures the role of servers, but Orwellian is not the correct category. Maybe we could say such services run on servelets to describe how they're "lighter" in some sense but still servers.
Quote from the book:
“The Ministry of Peace concerns itself with war, the Ministry of Truth with lies, the Ministry of Love with torture and the Ministry of Plenty with starvation. These contradictions are not accidental, nor do they result from ordinary hypocrisy: they are deliberate exercises in doublethink.”
Serverless being in fact server-based seems like a pretty clear example of this, and so calling it an Orwellian term seems perfectly reasonable.
"Here's some code, make sure it runs once an hour, I don't care where."
There comes a point where being mad that the specific flavor of PaaS termed serverless actually has servers is just finding a thing to be mad at.
It doesn't just "have" servers; they aren't a hidden implementation detail. Connecting to a website is an instrumental part of using the software.
If you are in the niche of IT, servers, HTTP operations etc, I can see why the name would make sense, because in that domain, you are always working with servers, so the name describes an abstraction where their technical details are hidden.
I used 1TB of traffic on a micro instance and it cost me $150 (iirc). Doesn't have to be this way.
At least stick a rate-limiting product in front of it to control the bleed. (And check whether the rate-limiting product is itself pay-per-use... GCP, looking at you.)
That's just a problem waiting to happen while you are always running tests on production...
I have a makefile system which controls lambda deployments. One step of the deployment is to gather the security requirements and to build a custom IAM role for each individual lambda. Then I can just write my security requirements in a JSON file and they're automatically set and managed for me.
The real joy of AWS is that everything works through the same API system. So it's easy to programmatically create things like IAM roles like this.
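A sketch of that approach, with hypothetical requirements content and role naming (the commenter's actual makefile and JSON layout aren't shown, so this is one plausible shape): read a per-function JSON file of security requirements and turn it into a dedicated role plus an inline least-privilege policy.

```python
import json

# Hypothetical per-function security requirements, as might live in a JSON
# file next to the lambda's source.
requirements = {
    "function": "resize-images",
    "statements": [
        {"Effect": "Allow", "Action": ["s3:GetObject"],
         "Resource": "arn:aws:s3:::example-uploads/*"},
        {"Effect": "Allow", "Action": ["s3:PutObject"],
         "Resource": "arn:aws:s3:::example-thumbnails/*"},
    ],
}

# Trust policy letting Lambda assume the role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "lambda.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

# Inline policy built directly from the declared requirements.
inline_policy = {"Version": "2012-10-17", "Statement": requirements["statements"]}
role_name = f"lambda-{requirements['function']}-role"

print(role_name)
print(json.dumps(inline_policy, indent=2))
# With credentials, the uniform API mentioned above creates it:
#   iam = boto3.client("iam")
#   iam.create_role(RoleName=role_name,
#                   AssumeRolePolicyDocument=json.dumps(trust_policy))
#   iam.put_role_policy(RoleName=role_name, PolicyName="inline",
#                       PolicyDocument=json.dumps(inline_policy))
```

The appeal is exactly what the comment says: because IAM is just another API, the deployment step can stamp out one narrowly scoped role per function instead of sharing a broad one.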
2. Use the officially supported docker runtime for local testing.
3. Treat it like any other code and make unit tests
4. Use one of the tools like localstack to emulate your staging env on your machine.
There are so many options that I don’t know how you could walk away with your opinion.
> Basically free staging environment. [emphasis mine]
Not really. Sure, the cost would usually be peanuts... until you have an infinite loop that recursively calls more lambdas. Then you have a huge bill (but hey that pays for your invites to their conferences, so maybe it's a blessing in disguise?). And yes, you will pretty much always get it refunded, but it's still a hassle and something that is absolutely not necessary.
Snark aside, having an opaque dev environment always constrained by bandwidth and latency that can’t be trivially backed up/duplicated is a terrible idea and why I always recommend against “serverless”, even besides the cost concerns.
Serverless is OK for small, fully self contained pieces or code that are fire and forget. But for anything complex that’s likely to require maintenance, no thanks.
Deploying a stack to your own developer environment works fine and is well worth doing, but the turnaround time is still painful compared to running a normal web framework project locally. Deploying a stack takes much much longer than restarting a local server.
Serverless isn't all bad, it has some nice advantages for scaling a project, but running and debugging a project locally is a definite weak spot.
Does it really happen that you actually have to pay such a bill? Do you need to tweet about it to be reimbursed?
This is what scares me, is social media the only way to get things sorted out nowadays? What if I don't have a large following nor an account in the first place, do I have to stomach the bill?
Encouraged by comments on HN over the years, I asked their support to kindly waive it. After repeating the request a few times they eventually reduced the bill to under €100, but refused to waive it entirely.
So it can work even without shaming on social media, but it probably depends. It's worth at least asking.
^The deal changed about six months ago.
However these projects are measured in ways that make Oracle licenses rounding errors.
Which naturally creates market segmentation on who gets tier 1 treatment and everyone else.
But what happens if this happens to a corporate account and a resource gets leaked somewhere?
A multi-billion-dollar company would probably just shrug it off as opex and call it a day.
It would have been 2x more expensive and halved developer speed; we would also have lost some internal metrics systems honed over 20 years.
The CEO told us to go ahead anyway (it turned out the company was being sold to Apollo).
The first thing we did was build a way to bootstrap accounts into AWS so we could have spend limits from day one.
I can't imagine how companies miss that step.
If you get a dedi on a 10Gb/s guaranteed port and it works out to more than $3 / TB, you're probably getting scammed. How does "serverless" justify 150x that? Are people hosting some silly projects really dense enough to fall for that kind of pricing?
Just get a $10 VPS somewhere or throw stuff on GH pages. Your video game wiki/technical documentation/blog will be fine on there and - with some competent setup - still be ready for 10k concurrent users you'll never have.
Like setting a maximum budget for a certain service (EC2, Aurora?) because downtime is preferable to this?
I host demos, not running a business, so it's less of an issue to get interrupted. Better an interruption than a $50,000 bill for forgetting to shut off a test database from last Wednesday.
If it can bill them per-invocation, why can't it also check against a budget? I don't expect it to be synchronous, but a lag of minutes to respond is still better than nothing. Can you even opt-in to shutting down services from the budget tool, or is that still something you have to script by hand from Cloudwatch alarms?
I think figuring out how to do this faster is less trivial than it might sound. I agree that synchronous checks aren’t reasonable. But let’s take Lambdas. They can run for 15 minutes, and if you consolidate within five minutes after a resource has been billed, that gives you a twenty minute lag.
I’m not trying to make apologies for Amazon, mind you. Just saying that this isn’t exactly easy at scale, either. Sure, they bill by invocation, but that’s far from synchronous, too. In fact, getting alerts might very well be happening at the frequency of billing reconciliation, which might be an entirely reasonable thing to do. You could then argue that that process should happen more frequently, at Amazon’s cost.
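Putting numbers on that lag, and on the hand-rolled mitigation people actually use: the arithmetic below assumes a runaway burn rate for illustration, and the kill switch uses Lambda's real `put_function_concurrency` API (setting reserved concurrency to zero stops all invocations); the function name is hypothetical and the call is only sketched since it needs credentials.

```python
# Worst-case exposure during the reconciliation lag described above:
# a 15-minute max Lambda runtime plus ~5 minutes of billing consolidation
# gives roughly a 20-minute window where spend is invisible.
burn_rate_per_min = 50.0          # assumed runaway spend, $/minute
lag_minutes = 15 + 5
exposure = burn_rate_per_min * lag_minutes
print(f"already spent before you can react: ${exposure:,.0f}")

# The usual hand-rolled kill switch: reserved concurrency of zero means the
# function can no longer be invoked at all.
def kill_function(name: str) -> None:
    import boto3  # requires AWS credentials to actually run
    boto3.client("lambda").put_function_concurrency(
        FunctionName=name, ReservedConcurrentExecutions=0
    )

if __name__ == "__main__":
    pass  # kill_function("my-runaway-function")
```

Even a modest $50/minute runaway has already cost four figures by the time a lag-bound alert can trigger anything, which is the asymmetry both sides of this thread are circling.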
So, in other words, the vendor has provided substandard tooling with the explicit intent of forcing you to spend more money.
> Experience AWS for up to 6 months without cost or commitment
> Receive up to $200 USD in credits
> Includes free usage of select services
> No charges incurred unless you switch to the Paid Plan
> Workloads scale beyond credit thresholds
> Access to all AWS services and features
Plus the VPS is just so much faster in most cases.
"What if your app went viral and you woke to a $20k cloud bill? $50k? $80k?"
If the answer is anything less than "Hell yeah, we'll throw it on a credit card and hit up investors with a growth chart" then I suggest a basic vps setup with a fixed cost that simply stops responding instead.
There is such a thing as getting killed by success and while it's possible to negotiate with AWS or Google to reduce a surprise bill, there's no guarantee and it's a lot to throw on a startup's already overwhelming plate.
The cloud made scaling easier in ways, but a simple vps is so wildly overpowered compared to 15 years ago, a lot of startups can go far with a handful of digitalocean droplets.
That billing strategy makes it impossible to prevent cost overruns because by the time the system knows your account exceeded the budget you set, the system has already given out $20k worth of gigabyte-seconds of RAM to serve requests.
I think most other serverless providers work the same way. In practice, you would prevent such high traffic spikes with rate limiting in your AWS API Gateway or equivalent to limit the amount of cost you could accumulate in the time it takes you to receive a notification and decide on a course of action.
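A throttle turns an unbounded bill into a bounded one, which is why this mitigation works. The sketch below uses assumed figures: a hypothetical 100 req/s usage-plan limit, the up-to-48-hour billing lag mentioned elsewhere in this thread, and an illustrative per-request price in the ballpark of API Gateway's published REST API rates; the boto3 call shown is the real `create_usage_plan` API, sketched but unexecuted.

```python
# How a throttle bounds worst-case cost: at R requests/second, the most you
# can be billed before you notice is R * window * unit cost.
rate_limit_rps = 100              # usage-plan steady-state throttle (assumed)
notice_window_s = 48 * 3600       # billing data can lag up to ~48 hours
price_per_request = 3.50 / 1e6    # assumed $/request for illustration

worst_case = rate_limit_rps * notice_window_s * price_per_request
print(f"worst case before you notice: ${worst_case:,.2f}")

# Sketch of setting the throttle (requires credentials; values hypothetical):
#   boto3.client("apigateway").create_usage_plan(
#       name="capped-plan",
#       throttle={"rateLimit": 100.0, "burstLimit": 200},
#   )
```

Under these assumptions the two-day worst case is tens of dollars, not tens of thousands; without the throttle, the same window has no upper bound at all.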
Really, we (well, they) at the company decided to move everything from on-prem into the cloud; it was supposed to save costs, they said.
But as a result you still need DevOps, plus extra complications with development, local environments, and more.
Over a not-so-short career I've seen some good examples, but they were unique situations, not the rule, and a lot of companies keep paying a lot for a small bunch of utilities.
Maybe I'm wrong, but you hear about this hell all the time, while at conferences you only hear success stories (because they have to say: success).
Still, it made me question why I'm not using a VPS.
When Vercel switched everything to serverless, it all became pretty terrible. You need third-party services for simple things like DB connection pooling, websockets, cron jobs, and simple queues, because those things aren’t compatible with serverless. Not to mention cold starts. Just a few weeks ago, I tried to build an API on Next.js+Vercel and got random timeouts due to cold starts.
Vercel made it easier to build and deploy static websites. But really, why are you using Next.js for static websites? Wordpress works fine. Anything works fine. Serverless makes it drastically harder to build a full app with a back end.
Because when everything is a bunch of SaaS Lego bricks, serverless is all one needs for integration logic, and some backend like logic.
Add to that the fact that many SaaS vendors in the CMS and ecommerce space have special partner deals with Vercel and Netlify.
I told them it was a mistake and they forgave the charge; they just asked me not to do it again.
All these stories of bill forgiveness remind me of survivorship bias. Does this happen to everyone who reaches out to support, or just to the ones who get enough traction on social media? I'm pretty sure there is no official policy from AWS, GCP, or Azure.
Single day Firebase bill for $100k - https://news.ycombinator.com/item?id=43884892 - May 2025 (14 comments)
Serverless Horrors - https://news.ycombinator.com/item?id=39532754 - Feb 2024 (169 comments)
https://www.troyhunt.com/closer-to-the-edge-hyperscaling-hav...
The amount of brainwashing that big cloud providers have done, is insane.
To be fair, support was excellent both times and they waived the bills after I explained the situation.
I'm old enough to remember when cloud was pitched as a big cost saving move. I knew it was bullshit then. Told you so.
This issue is serverless-specific. If I pay $20/month for a VPS, the most frightening thing that can happen is the client calling you about your website being down, not a $100k bill.
If we're building anything bigger than a random script that does a small unit of work, never go for serverless. A company I recently worked for went with Serverless claiming that it would be less maintenance and overhead.
It absolutely was the worst thing I've ever seen at work. Our application state belonged at different places, we had to deal with many workarounds for simple things like error monitoring, logging, caching etc. Since there was no specific instance running our production code there was no visibility into our actual app configuration in production as well. Small and trivial things that you do in a minute in a platform like Ruby on Rails or Django would take hours if not days to achieve within this so-called blistering serverless setup.
On top of it, we had to go with DB providers like NeonDb and suffer from a massive latency. Add cold starts on top of this and the entire thing was a massive shitshow. Our idiot of a PM kept insisting that we keep serverless despite having all these problems. It was so painful and stupid overall.
Chances are, the company was fishing for (or at least wouldn't mind) VC investment, which requires things being built a certain (complex and expensive) way like the top "startups" that recently got lots of VC funding.
Chances are, the company wanted an invite to a cloud provider's conference so they could brag about their (self-inflicted) problems and attract visibility (potentially translates to investment - see previous point).
Chances are, a lot of their engineering staff wanted certain resume points to potentially be able to work at such startups in the future.
Chances are, the company wanted some stories about how they're modern and "cloud-native" and how they're solving complex (self-inflicted) problems so they can post it on their engineering blog to attract talent (see previous point).
And so on.
It's kind of amazing, though. I keep getting pressure from the non-techs in my organization to "Migrate to the Cloud." When I ask "Why?" -crickets.
Industry jargon has a lot of power. Seems to suck the juice right out of people's brains (and the money right out of their wallets).
At the end of the day, though, the whole thing feels like a carpenter shooting themselves in the foot with a nail gun and then insisting that hammers are the only way to do things.
If you didn't sit down with the documentation, the pricing guide, and a calculator before you decided to build something then you share a significant portion of the fault.
I would be embarrassed to put my name on these posts admitting I can't handle my configs while blaming everyone but myself.
Serverless isn't a horror, serverlesshorrors poster. You are the horror. You suck at architecting efficient & secure systems using this technology, you suck at handling cloud spend, and you suck at taking responsibility when your "bug" causes a 10,000x discrepancy between your expected cost and your actual bill.
Just because you don't understand it doesn't mean it sucks
I'm more worried about the overconfident SRE that doesn't stay up at night worrying about these.
Or do you always log in as root, like a real man, relying purely on your experience and competence to avoid fat-finger mistakes?
Golly if only the configuration wasn't made this way on purpose exactly to cause this exact problem.
It reminds me of the Citi(?) employee who typed the wrong decimal place in a trade: computers make everything so easy!
Have the people posting these horror stories never heard of billing alerts?
If you have bot spam, how do you actually think their billing alerts work? The alert is updated every 100ms and shuts off your server immediately? That isn't how billing alerts can or should work.
Do any of you people have budgets, or do you all rely on the unending flow of VC money?
The majority of these massive bills are due to traffic, there is pretty much no way that AWS could stop your server in time...if they had the choice, which they don't.
I think my original point was unclear: I am pointing out that if you just think about how this stuff can possibly work, billing alerts can not work in the way you expect. The alert is updated async, the horse has bolted and you are trying to shut the gate.
I don't use AWS for personal stuff because I know their billing alerts won't stop me spending a lot. Don't use them if that is a concern.
I do use AWS at work, we are a relatively big customer and it is still very expensive for what it is. The actual hardware is wildly overpriced, their services aren't particularly scalable (for us), and you are basically paying all that overage for network...which isn't completely faultless either. Imo, using them in a personal capacity is a poor idea.