No, dear author. Setting up the AWS billing alarm was the smartest thing you ever did. It probably saved you tens of thousands of dollars (or at least the headache associated with fighting Amazon over the bill).
Developers make mistakes. It's part of the job. It's not unusual or bad in any way. A bad developer is one who denies that fact and fails to prepare for it. A great developer is one like the author.
So I guess the question is, with a mistake like this, is it better to be charged hundreds or thousands of dollars, or to have your service degrade or go offline until you can fix it?
If you would rather not have the latter, you do not want any rate limiting: you want everything to scale as fast as possible (and hope there are no bugs on your end). Rate limiting means that your new customers get a poor experience, so they are more likely to ask for a refund or not renew next time.
The fundamental issue here is that serverless is great at automatically scaling to meet demand, but it is equally great at automatically scaling to meet unexpected resource usage caused by errors (or poor design). So a mistake on your end can cost you a lot of money, because the system thought it was real demand.
We dynamically create and instantiate new servers when load is sustained for a while. Once a server is up, it's added to the load balancer. Once load drops, it's spun down after sitting idle for some time (instantiation is expensive, so we might as well keep it out of the rotation for a bit before removing it completely).
This all runs automatically. If we don't limit it, it's on us.
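The loop is roughly this shape. A minimal sketch only: the pool/load-balancer helpers and every threshold below are hypothetical placeholders, not our actual code.

    import time

    # Illustrative thresholds only; tune to taste.
    SCALE_UP_LOAD = 0.75      # sustained load fraction that triggers a new instance
    SUSTAIN_SECONDS = 120     # load must stay high this long before provisioning
    IDLE_GRACE_SECONDS = 300  # keep an idle instance around before terminating it

    def autoscale_tick(pool, lb):
        """One iteration: add servers under sustained load, retire long-idle ones."""
        now = time.time()

        if pool.average_load() > SCALE_UP_LOAD and pool.high_load_since(now - SUSTAIN_SECONDS):
            server = pool.provision()  # booting an instance is the expensive part
            lb.register(server)        # only takes traffic once it's up

        for server in pool.idle_servers():
            # Instantiation costs money and time, so an idle server sits out of
            # the rotation for a grace period before being removed completely.
            if now - server.idle_since > IDLE_GRACE_SECONDS:
                lb.deregister(server)
                pool.terminate(server)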
How is this not a problem with how he managed it?
> This is probably the most stupid thing I ever did. One missing return; ended up costing me $206.
He clearly mentioned it's his error there.
If the degraded or offline system is used by people, and these people cannot work, the cost can be a lot higher. For example, 10 people not able to work could cost something in the range of $250-$750 per hour.
Moreover, if customers are lost due to this degradation of service and CAC is high, then clearly the cheapest outcome is a high AWS bill, which is probably also capped by Amazon (and flagged internally as an alert).
And even 100% code coverage doesn't find all possible errors.
Unit tests are specifically useful for refactors. You can refactor your code and ensure that it behaves as intended. Integration tests are great, too, don't get me wrong. Either or both would have probably caught this.
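As a concrete illustration (a toy stand-in, not the author's actual code), a missing return statement on an early-exit branch is exactly the kind of thing a cheap unit test catches:

    import unittest

    def handle_event(event):
        """Hypothetical handler: the bug class from the article is forgetting
        the early-exit return, so finished events fall through and get
        re-enqueued (and re-billed) forever."""
        if event.get("already_processed"):
            return []  # the article's bug was a line like this going missing
        return [{"requeue": event["id"]}]

    class HandlerTests(unittest.TestCase):
        def test_processed_events_are_not_requeued(self):
            # Fails loudly in CI instead of silently costing money in production.
            self.assertEqual(handle_event({"already_processed": True, "id": 1}), [])

    if __name__ == "__main__":
        unittest.main()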
So yeah, let's blame the developer, but let's not play like mistakes don't happen and they're not costly in the "serverless" world.
It's easy to burn tens or hundreds of thousands "accidentally" on a "server", easier than on serverless.
If you’re spending real money, you should have an account team. Talk to them if such a problem happens.
The colloquial meaning of "real money", at least to me, is "a substantial sum". If that's what you intended, wouldn't it follow that you wouldn't have an account team if the only time you spent real money was by accident?
It's not a "problem" with server/serverless per se, but with no-scaling-by-default vs. unlimited-scaling-by-default (which is, IMO, the better way to split the server/serverless topic); one of the two is going to cost more when things get thrown for a loop.
Without autoscaling, you would just have a queue that grows until the machine runs out of disk space. Either way, this was a problem with code and not event based scaling.
Never use a pay-per-use service that does not include a reasonable "turn off after $X" feature and appropriate warnings. Also, never use such services without being sure to configure such settings.
I like to think of this as a self-inflicted "DDOC" attack: Distributed Denial of Capital.
Best not to leave yourself exposed.
When you have customers doing events, it’s more often that the scale up is from a real event than that someone fat fingered a config.
If they are broadcasting an unscheduled Obama speech from the home page of a major paper, that's not the time to go "Oh, anomalous, shut it down." By the time that gets fixed and switched back on, Obama has left the building, and your customer leaves too.
If you are in the business of offering a service with “elasticity” as a core capability, we found it better for SLOs and better for the bottom line to ‘fix’ this after the fact by discussion than to attempt to tell real spikes from glitches.
If you don’t want elasticity, you might not be looking for “cloud”.
and an SDK like boto3:
http://boto3.readthedocs.io/en/latest/reference/services/bud...
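A minimal sketch with the Budgets API (the account ID, dollar amount, and email address are placeholders):

    import boto3

    budgets = boto3.client('budgets')
    budgets.create_budget(
        AccountId='123456789012',  # placeholder account ID
        Budget={
            'BudgetName': 'monthly-cap',
            'BudgetLimit': {'Amount': '50', 'Unit': 'USD'},
            'TimeUnit': 'MONTHLY',
            'BudgetType': 'COST',
        },
        NotificationsWithSubscribers=[{
            'Notification': {
                'NotificationType': 'ACTUAL',
                'ComparisonOperator': 'GREATER_THAN',
                'Threshold': 80.0,  # alert at 80% of the budget
                'ThresholdType': 'PERCENTAGE',
            },
            'Subscribers': [{'SubscriptionType': 'EMAIL', 'Address': 'you@example.com'}],
        }],
    )

Note that this only notifies; it doesn't turn anything off.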
I can't imagine this changing.
None of the "Cloud providers" offer that. They "claim" that it could impact service - yeah, service of debt that you owe them.
Unless I have hard guarantees, I give "cloud providers" re-loadable cards. Can't take more money than what's on there.
Even a few bytes sitting on S3 continue to incur charges, and it's hard to do real-time spend tracking at the scale of these providers, so the only option they'd have is to delete your entire account immediately. Is that what you want? Who would?
For most companies, business continuity matters. The proper solution is to use the budget and reporting features to check your work.
With a regular server you could go viral and your server dies, so you also lose, without bound, in lost business/goodwill/whatever.
You also need to take into account the time and effort spent making the regular server scale, albeit that is also a relatively flat cost.
People still play with fire. Limit your losses: go with DigitalOcean or something for a flat $5/mo, no matter what.
I think the author meant this less as "playing with fire" and more as experimenting with new tech. But yes, I agree that for a personal site running on your own money, you probably want to stick with something safer like the $5/mo DigitalOcean box.
I'm surprised people still talk about cloud services as being cheaper, especially where developers are free to use whatever they want.
Idea: use API Gateway to configure a quota to match your budget projections. That will force a hard stop. Would be nice if AWS made this easier.
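Something like the following boto3 sketch, where the API ID, stage, and limits are placeholders (quotas are enforced per API key, so clients also need a key attached to the plan):

    import boto3

    apigw = boto3.client('apigateway')

    # Cap a deployed API at a monthly request budget.
    plan = apigw.create_usage_plan(
        name='budget-cap',
        apiStages=[{'apiId': 'abc123', 'stage': 'prod'}],  # placeholder API/stage
        throttle={'rateLimit': 10.0, 'burstLimit': 20},    # steady/burst requests per second
        quota={'limit': 1000000, 'period': 'MONTH'},       # hard stop once exhausted
    )

    # Requests must carry an API key tied to the plan for the quota to apply.
    key = apigw.create_api_key(name='budget-cap-key', enabled=True)
    apigw.create_usage_plan_key(usagePlanId=plan['id'], keyId=key['id'], keyType='API_KEY')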
My main challenge with serverless is using Lambda with API Gateway. Lambda has no database connection pooling, so I end up with a ridiculous number of connections to RDS - one for each simultaneous user. I haven't found a solution to this yet, other than not using API Gateway.
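The closest I've gotten to a mitigation (not a real fix, since peak concurrency still bounds total connections) is opening the connection at module scope so warm containers reuse it. A sketch, assuming a MySQL-flavored RDS instance and hypothetical env-var credentials:

    import os
    import pymysql  # assumes MySQL on RDS; substitute your driver

    # Module scope runs once per container, so warm invocations reuse this
    # connection instead of opening a fresh one per request.
    _conn = None

    def _get_conn():
        global _conn
        if _conn is None or not _conn.open:
            _conn = pymysql.connect(
                host=os.environ['DB_HOST'],
                user=os.environ['DB_USER'],
                password=os.environ['DB_PASS'],
                db=os.environ['DB_NAME'],
                connect_timeout=5,
            )
        return _conn

    def handler(event, context):
        with _get_conn().cursor() as cur:
            cur.execute('SELECT 1')
            return cur.fetchone()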
Besides, we aren't really talking about production databases at large companies. The people who want caps are devs learning and experimenting. It could come with disclaimers that if you enable a cap and exceed it, your services will go offline unexpectedly, and that may leave databases in an inconsistent state. But for a large number of usage scenarios that is a completely acceptable tradeoff.
The simple fact is, not having a cap puts me off experimenting with a service, out of fear that a mistake will cause a big bill. And developers learning and investigating a technology is what precedes them recommending that technology to their companies.
Last time I looked, Azure allows a zero-spend cap on free accounts, but you can't change the amount to anything else, and once you remove the cap you can't switch it back on. That's limited, but it's perfect for a learning environment.
If Azure can implement a zero-spend cap, there is absolutely no reason that either AWS or Azure can't implement an $x spend cap in exactly the same way.
All a developer would need to do, immediately after adding a credit card to AWS/Azure/GCP, is create an IAM role with permission to automatically add and track fine-grained billing alarms and notify via email/SMS of any potential billing overages.
I think a $60/yr service like this would be useful to protect against future events of bill shock.
https://github.com/Teevity/ice
https://billgist.com/
http://cloudcheckr.com/
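For the alarm half of that, the CloudWatch call is simple. A sketch, assuming billing alerts are already enabled in the account preferences; the threshold and SNS topic ARN are placeholders, and note that billing metrics only exist in us-east-1:

    import boto3

    # Billing metrics are published only in us-east-1.
    cloudwatch = boto3.client('cloudwatch', region_name='us-east-1')

    cloudwatch.put_metric_alarm(
        AlarmName='estimated-charges-over-50-usd',
        Namespace='AWS/Billing',
        MetricName='EstimatedCharges',
        Dimensions=[{'Name': 'Currency', 'Value': 'USD'}],
        Statistic='Maximum',
        Period=21600,  # billing metrics only update every few hours
        EvaluationPeriods=1,
        Threshold=50.0,  # placeholder dollar threshold
        ComparisonOperator='GreaterThanThreshold',
        AlarmActions=['arn:aws:sns:us-east-1:123456789012:billing-alerts'],  # placeholder topic
    )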
When the limit is hit, they could either:
a) let you go into credit (so they charge you at the end of the month), or
b) disable services.
Maybe AWS/Google also support a hard limit on spending.
It's gotten much, much easier, and is just another form of command line management, similar to the CLI framework tools with your preferred stack.
Once that first setup is done, similar to setting up a serverless environment, you are generally restoring backups of your base image and beginning projects from there.
It also helps immensely to learn how to build something that scales without being completely reliant on the PaaS layer.
It's nice not to have to worry about a server, but I feel like there are just as many little things to futz with in serverless architectures, especially before "environment" variables existed in Lambda.
I have migrated all my services to GCE. At least GCE provides decent free quotas for every resource.
Qualify your statements.
The OP said I should run my own infrastructure. I -could- host my blog by running a web server atop a server I administer, sure. I'd have to take on all the infrastructural tasks of doing that, securing it, ensuring any availability/scalability concerns I may have are taken care of, etc, but I -could- do that.
Instead, S3 + Cloudfront (or, sure, any flavor of hosting and edge-caching options you care for; I was not implying "just AWS") means I don't have to worry about any of that. For me, trading a reduced level of control for increased availability, scalability, and easy "it just works" is worth it. As is the pennies per month it costs me, given the low utilization and pay-as-you-go model. It's hardly a scam.
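For scale, the whole "hosting" setup is a handful of calls. A sketch with a placeholder bucket name (the public-read bucket policy and the CloudFront distribution are omitted):

    import boto3

    s3 = boto3.client('s3')
    bucket = 'blog.example.com'  # placeholder bucket name

    s3.create_bucket(Bucket=bucket)
    s3.put_bucket_website(
        Bucket=bucket,
        WebsiteConfiguration={
            'IndexDocument': {'Suffix': 'index.html'},
            'ErrorDocument': {'Key': '404.html'},
        },
    )
    s3.put_object(Bucket=bucket, Key='index.html',
                  Body=b'<h1>hello</h1>', ContentType='text/html')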
This is the takeaway quote from this for me.
...until you receive the $206 bill for the work done by the server.
But if you can ignore that, you can probably also ignore the fact that your code runs on a server.