Meanwhile, when I build something for a client my first go-to is something like DigitalOcean or Amazon LightSail. I know, LightSail is still AWS. But it's not going to automatically increase the bill by 300% because someone accidentally toggled something in AWS's byzantine admin console.
When (or, more often, "if") that client starts to see the kind of traffic that necessitates a full-scale AWS build-out, I'll advise them of their options. By that point they're actually generating real revenue, and, just as importantly, they didn't pay me to prematurely optimize a product that wasn't yet generating any revenue, which helped them succeed.
It's the freedom to get things wrong, iterate and try again. It's the freedom to move resources around. It's the freedom of not having to wait for your procurement process and approval from a purchasing department. It's the freedom of not having to wait for your operations team to plug in and configure bare metal. And all of these freedoms have compounding interest.
For most organizations, what I just talked about equates to man years of work wasted and I think a lot of people nitpicking about cloud being expensive have lost sight of this.
Sure it costs more, but it's from money you've already saved. You'll have loads of saved money left to spend on more developers too.
I'm talking about new projects at much smaller companies, or much smaller departments within the company that don't yet know if the product they're building is going to even work.
Not to mention, I have the same freedom to try things out with a $10/mo DigitalOcean droplet as I do with EC2/Lambda/ELB/RDS/S3/etc... If something doesn't work, delete the droplet and start over. To that degree, it's even easier, cheaper, and freer to just test something out in Docker containers on my laptop.
So, respectfully, I'm not buying the "freedom" argument. That's not even the selling point AWS is pitching.
AWS, or any cloud service, is anything but freedom, so please use the word judiciously. If you are working with AWS, your application and your whole organization are at the whims of Amazon. You won't even be able to migrate if you are tied into its APIs and services, because they are not the same across cloud providers. You don't really own the data or compute resources hosted on such services either: a single notice from a government agency can get your account blocked, cutting off access to your own data and compute. It has happened and will continue to happen, because it is much easier for a government to approach a single cloud provider than to go through the process of targeting a company or an individual directly (more costly and difficult given differing jurisdictions across country and state lines).
In terms of technical freedom you will have less of it, and you will be tied into the proprietary extensions, APIs, and services offered by those cloud providers.
With self-hosted infrastructure, on the other hand, a government notice will not block access to resources you own or to the data on them. You can respond to the notice and continue using those resources until a court rules. So it is more freedom than a cloud can offer.
In our organization we need to go the extra mile and put in extra engineering effort to make sure our apps can work on different clouds with different services (e.g. using Apache Libcloud, avoiding proprietary APIs).
Being locked in to AWS (or GCP, or..) when costs are rising out of control is what I consider the opposite of freedom.
I was with one startup that went full-in on AWS features, so it was very difficult to migrate. Sure, we got started real quick, that was nice. I argued early on that we need to be careful of not getting locked in to AWS in case we need to jump away once we have some customer growth, but everyone else thought that was silly.
Fast forward two years and our AWS bill was about $2500 per year per customer. Revenue per customer? Certainly nowhere near that. Burning VC money gets you far, but only so far. Yes, we ran out of money.
On the other hand, when we were using simple VPSes at various companies: at some I could just send an email saying I'd provisioned a new server, and at others they wanted a ticket with a justification for it. But we always had enough test servers to try things out and experiment. Cloud is great in many ways, but freedom has more to do with the organization.
With bare metal it's 100% predictable. Where to book the sudden spike in costs in April? Why did it happen? We had €3000 booked for this project in April, why do we have to pay €5000? And so on.
Sure, if you have too much money to burn and the financial department is lax with controlling the costs, AWS feels great, but not everybody has this luxury.
You can get things wrong, iterate and try again with basically anything.
Assuming that everything else other than cloud is automatically bad and full of bureaucracy does not inspire anyone.
I don't understand your point. How does any random cloud provider, or even hosting company, block you from moving resources around, force you to wait for procurement, or stop you from iterating?
I'm looking at the Hetzner cloud dashboard and I'm free to launch how many instances I need at the drop of a hat.
In the end there's an itemized bill which, unlike AWS's, covers only what you ordered up front, and which you can reason about without having to endure advanced courses or freaking machine learning tools.
Where exactly are you seeing any freedom at all on AWS?
We've since moved (almost) everything over to postgres and merged a lot of the microservices together. And while we're still running on heroku, we could probably run on a Raspberry Pi if we needed to.
I am far from a database expert, but the fact that I have been able to implement, without much effort, efficient systems with high bandwidth and reliability requirements, load balancing, and low latency speaks volumes about the current state of affairs.
If you're starting a startup, use postgres for your database. It will handle damn near anything, and you can spin functionality off into microservices as they become bottlenecks as you scale.
Hell, postgres lets you write database extensions that perform novel behavior. I saw one the other day that automatically syncs data to Elasticsearch. It wouldn't be that hard to envision having writes to certain tables sync to Firebase if you need subscriptions on records for certain parts of your app.
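To make that concrete, here's a minimal sketch of the trigger-plus-listener approach (not the specific extension mentioned above); the table, channel, and index names are made up, and a real setup would need batching and error handling:

    # Sketch: push row changes from postgres to Elasticsearch via LISTEN/NOTIFY.
    # Assumes a trigger along these lines already exists (names are illustrative):
    #
    #   CREATE FUNCTION notify_users_changed() RETURNS trigger AS $$
    #   BEGIN
    #     PERFORM pg_notify('users_changed', row_to_json(NEW)::text);
    #     RETURN NEW;
    #   END; $$ LANGUAGE plpgsql;
    #
    #   CREATE TRIGGER users_changed AFTER INSERT OR UPDATE ON users
    #     FOR EACH ROW EXECUTE FUNCTION notify_users_changed();

    import json
    import select

    import psycopg2
    from elasticsearch import Elasticsearch  # v8 client; older clients use body=

    pg = psycopg2.connect("dbname=app user=app")
    pg.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
    es = Elasticsearch("http://localhost:9200")

    cur = pg.cursor()
    cur.execute("LISTEN users_changed;")

    while True:
        # Wait until postgres has a notification for us (60s timeout).
        if select.select([pg], [], [], 60) == ([], [], []):
            continue
        pg.poll()
        while pg.notifies:
            note = pg.notifies.pop(0)
            doc = json.loads(note.payload)
            es.index(index="users", id=doc["id"], document=doc)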
This is by far the most important lesson I have learned this decade. Anyone can pull the rug out on iteration 1 and start a completely new attempt in a 2nd iteration bucket.
It takes determined engineering efforts and talent to iterate on top of the existing code base while not impacting the ability to deliver ongoing feature support to production.
We have historically tried to do complete rewrites with bombastic claims like "We can do it right this time". Well, we did it "right" 3 times until we decided that restarting from zero every time is a bad way to make forward progress. I will advocate for a clean restart if you are just beginning or it is clear you have totally missed the domain model target. If you have been in production for 3 years, maybe consider a more conservative iteration approach.
The best analogy I can come up with is the Ship of Theseus thought experiment. Rebuilding the ship from the inside while you are sailing it to the new world will get you there much faster than if you have to restart in Spain every time someone is in a bad mood or doesn't like how the sails on the boat are arranged.
Does it have a lot of compute power? No.
Is it dirt-simple to admin, and does it have enough horsepower to do the simple things like run a QuickBooks server on a Windows VM, a fileshare on another, test on a third, etc., which is all they usually need? Absolutely!
It isn't uncommon at all that I do my initial IT inventory at a new client's business and find that they are spending hundreds to thousands per month unnecessarily in overblown costs, and AWS is a primary driver of that. A few kilobucks into owned hardware and the problem is essentially solved for a few years, paid for in a few months. Everybody wins.
Often it's not though.
It's a pain to admin for someone who knows nothing about it, and it's also a pain to have to do anything at all!
The power of the cloud for smaller shops is that they don't have to know what a Synology is, or even give it a thought.
That you can point-and-click your way to setting up an EC2 instance with some S3 and get rid of most admin knowledge/know-how is like magic.
That's the best reason to use it, even if it costs 1-3x as much.
Once there is A) stability and predictability in requirements and B) sufficient scale - then - we can hire smart people such as yourself who know better, to set up something relatively simple, that will be relatively low maintenance (i.e. whatever you bill) and save some $.
Think Development vs. Operations cost.
If, in dev, we have to wait for things, mess with them, and configure them, it's very expensive, it causes delays, and we don't want that.
Once there's enough predictability to determine 'unit cost' on the services side ... then you can 'cut costs' by moving to a slow-changing, custom solution, that is hopefully 'simple', as you say.
I look at compute instance providers by how they charge and for what project.
Sometimes I need more RAM, sometimes I need a faster CPU, sometimes I need more storage space accessed as a normal filesystem. Sometimes I need all of the above.
So it might be DigitalOcean, it might be Linode, it might be some fly-by-night bare metal provider.
Bang on. The capex cost of a BYO IT setup was the bugbear (a real one, FWIW) that AWS and other cloud providers were trying to avoid. But if a lot of folks are paying lots of money upfront preparing for "scale" that might never happen, they aren't really saving capex, are they?
Cause when I think AWS, I think of using Lambda, Fargate, DynamoDB, RDS, SQS, SNS, S3, Kinesis, Redshift, Elasticsearch, etc.
All these managed services aren't just about cost or scale; a lot of it is about convenience, development speed, and flexibility, as well as reduced maintenance and operational burden.
I can't understand why people keep using this as an argument against anything other than cloud, while playing catch-up with the rain of changes and suffering from FOMO over last week's new stuff from the cloud.
Heck, when did infrastructure become like the JS landscape?
Here's the thing though. For most startups, 99% of the value comes from the long right-tail of that 1% chance of explosive viral growth out of left-field.
You're right, most companies aren't ever going to need that kind of overnight scalability. But most of those startups will wind up failing anyway. At the end of the day, everyone's betting that they'll be the exception that makes it big. Without that possibility it's not worth showing up at the table.
And if it does come, you need to be ready to move fast. Like, "we hit the front-page of Reddit and we have 5 minutes to scale up our traffic by 5000%" fast. If you take your time to collect some revenue and circle back, the ship has likely sailed.
I'm not saying thinking this way is sensible for every org. But it's definitely sensible for at least some business models. In particular within the hyper-growth SV startup scene. Foregoing scalability because most startups don't go viral, would be like an aspiring actor foregoing headshots because most auditions don't get callbacks.
I'd argue this mentality is exactly why so many startups fail, and fail hard. They over-optimize for being one of the "lucky few" and under-optimize for actual sustainability.
Someone will snarkily mention that Amazon services go down from time to time across an availability zone or even a whole region, and presumably this never happens on prem (I guess the idea is that Amazon engineers are well-below average?) simply because you rarely hear about it when $MomAndPopCRUDShop goes down. On that note, in my experience, customers are really sympathetic when an Amazon outage brings you down (because half of the Internet is also down) but not very sympathetic when it’s just your site that’s down.
My IT team is not going to break something on the most important day of the month for my business. Sometimes even when I want to do something mundane on the network, it's "eh, let's wait until not (insert key date here), just in case". Amazon does not care what day is important to my business, they're Amazon and they do what they want.
You are going to pay IT staff either way, if it's your own hires (or an MSP), or you are paying via AWS fees. If you're going to pay for IT, why wouldn't you pay for IT that takes orders from you and cares about your business?
The other thing is, Amazon's maintenance and upgrades is based around Amazon's need to remain competitive and turn a profit, and to support businesses other than yours which may have larger needs. (The rollout of a new feature that causes an outage for you might not have any benefit to you anyways.)
Similarly, a single patched Exchange server running on a single VM on a VM host with a UPS backup on it is generally speaking, more reliable and has better uptime than Office 365. Hilariously, Office 365 also costs a lot more.
Heh, let me tell you about the time a truck drove into the sidewall of one of our datacenters and took out _both_ power sources to the facility...
Seems like a waste to pay for a month of VM time when only a few milliseconds are actually used. I like the idea of FaaS where I'm not being charged when my code isn't running.
You need to "scale" early in a cloud environment because you are allocated a fraction of the hardware and someone else on that same hardware is using all the resources. AWS/DO etc. boost their margins by packing as many VPSes onto one physical box as they can. That's why on a VPS your 95th percentile response time is 1000x your 50th percentile response time. All you can do is "scale" and pray enough of your connections are hitting the boxes that nobody else is using right now.
AWS has great reliability and a complete ecosystem of plug-and-play offerings that just work. And it's well documented. It sure is more expensive, but it's easier to achieve the same result. And faster. And keep in mind, the dev isn't footing the AWS bill. And isn't getting paid a penny extra for trimming down said AWS bill.
Of course, if you add stocks to the compensation structure or a bonus based on cloud spending, that changes the incentives.
Colocation isn't that much better either because there are still situations (like corrupt filesystems or malfunctioning power supplies) that would require physical access to the machine.
With an IaaS provider you're free to go live wherever you want and focus on your business and not need to schedule shifts of people within driving distance of the colocation facility on-call.
Ultimately you're paying to reclaim time, which in almost all cases for startups is a worthwhile trade. Outsource everything that isn't your core business.
This is so often repeated, but at my previous company, the amount of internal outages was smaller than the amount of outages on AWS or GCP. We then migrated to AWS for some stupid reason (we didn't need scalability, that's for sure) and we had a problem with redis nodes: the instances they were running on were too small and an upgrade went wrong. We waited until the next shift for the fix. The irony.
I disagree.
You still get all those problems with a shared provider; recall Gandi's outage? DigitalOcean, Linode, and AWS a week ago. You're also stuck in a tighter situation when it does occur.
With years of colocation I have never encountered a power-out event within the datacentre. That's a potential seven years of uptime if I hadn't changed providers in-between.
Nor have I ever suffered a corrupt filesystem, and that includes performing DR monthly via sudden power-loss cold boots. Sure, I accidentally deleted /etc/ once.
The only outstanding issue is a failed RAM stick, however it's a poorly server which I'm retiring in the new year. Poor thing.
People have been using them in colocations for years. Colocation providers also offer hardware swap services. They can replace a power supply for you.
If that's the case, then you are probably right that most will never need to scale. You can build them a solution that can't scale and it'll be the correct solution for the 90% that fail. And it'll be the wrong solution for the 10% that succeed.
In other words, I can build them something for $10,000 that gets them to market in a month but may not be what they need in five years, or I can build them something for $100,000 that gets them to market in a year, by which point their scrappy competitor has launched 11 months before them.
You go for AWS because in half a day you can whip up a secure, managed and auto scaling app using their proprietary services like SQS, S3, Lambda, Kinesis, Dynamo etc.
They all start out starry-eyed thinking they're the next unicorn and oh-my-gawd what if we start getting the kind of traffic Facebook gets tomorrow?! That's just not a reality. Facebook themselves didn't get that traffic for several years.
It's absolutely true that colo is cheaper than AWS for projects that can absorb the labour and failure related costs, but that usually is only true for hobby or very small business services.
Everything is built up and torn down via Terraform, and ZERO maintenance tasks are done via the console. Changes to infrastructure configuration are managed via CAB which approves PRs on Github which are automatically deployed via TF cloud.
We do monthly disaster simulations where we replace a _production_ environment.
I'll agree that AWS paints a very rosy picture of their own services, but there is substance underneath that PR.
They do not. We run many thousands of instances on AWS. Sure, given our scale, every week we get a couple of instance retirement emails. Usually they are issued many days in advance(sometimes, weeks). Per year, we may get a handful of instances that are suddenly unresponsive.
And we don't care. You know why? Because it's just a matter of issuing stop/start. Done! Server is back up, potentially even in a different datacenter, but it is none the wiser. It looks like a reboot. Even better, add an auto-recovery alert and AWS will do this for you, automatically. If part of an ASG, add health checks.
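For what it's worth, that auto-recovery alert is just a CloudWatch alarm with a recover action; a minimal sketch with boto3, where the instance ID and region are placeholders:

    # Sketch: ask EC2 to recover an instance onto healthy hardware when the
    # system status check fails. Instance ID and region are placeholders.
    import boto3

    region = "us-east-1"
    instance_id = "i-0123456789abcdef0"

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    cloudwatch.put_metric_alarm(
        AlarmName=f"auto-recover-{instance_id}",
        Namespace="AWS/EC2",
        MetricName="StatusCheckFailed_System",
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        Statistic="Maximum",
        Period=60,
        EvaluationPeriods=2,
        Threshold=0,
        ComparisonOperator="GreaterThanThreshold",
        # The recover action restarts the instance on different hardware.
        AlarmActions=[f"arn:aws:automate:{region}:ec2:recover"],
    )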
For the most part, we don't even notice when instances go down. Our workloads are engineered to be fault-tolerant. If a meteor destroys one AWS datacenter, it might temporarily take out some instances. So what? New ones will be back very shortly, all the while databases will fail-over, etc.
If these were physical instances, someone would have to do the maintenance work, purchase orders, wait for hardware to arrive, and so on and so forth. And, for most "co-location" scenarios, if your datacenter has issues, everything will go down. A single AZ in AWS has multiple datacenters, you might not even be affected if one goes up in flames.
But let's say you run a massive pet server farm and none of them can go down for any period of time. You have given them names and everything, and you celebrate their birthdays. Cool. Run that on GCP then. They do auto-migration. I've never seen an instance go down.
> Do you understand that with AWS you still have to setup instances, manage security updates, permission and deal with disaster response
This is true. However, if you are running your own hardware, you have to do that IN ADDITION TO dealing with hardware and datacenter shenanigans, with either a specialized (and expensive) workforce, or a barely capable one that's shoehorned and doing double duty, with zero economies of scale, probably in a single data-center.
The advantage of AWS is that in theory you can script your infrastructure. Deploy to multiple zones, add hosts, remove hosts, deploy artifact, etc. Makes all kinds of things very easy. Dedicated SQL / Cassandra / DynamoDB offerings make that a few clicks to get going.
This saves people costs, but just so happens to be more expensive in terms of hardware costs.
You can replicate all of this in your own DC or whatever. You do need to invest in tooling, hardware, etc. which is worth it at the highest scales. Or worth it if you don't need everything and thus don't need to pay for it.
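To make the "script your infrastructure" point above concrete, a minimal sketch of adding and removing hosts with boto3; the AMI, subnet, and instance type are placeholders:

    # Sketch: add and remove hosts programmatically.
    import boto3

    ec2 = boto3.client("ec2", region_name="eu-west-1")

    # "Add hosts": launch three identical instances.
    resp = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",
        InstanceType="t3.small",
        SubnetId="subnet-0123456789abcdef0",
        MinCount=3,
        MaxCount=3,
        TagSpecifications=[{
            "ResourceType": "instance",
            "Tags": [{"Key": "role", "Value": "web"}],
        }],
    )
    instance_ids = [i["InstanceId"] for i in resp["Instances"]]

    # ... deploy the artifact, run whatever you launched them for ...

    # "Remove hosts": tear them down again when done.
    ec2.terminate_instances(InstanceIds=instance_ids)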
Personally, I think many large organizations can optimize by running with a mix of on-prem and cloud services.
Demand may be elastic but not totally elastic, so your base load could be cheaper in house. Storage is cheaper on premise for huge datasets.
You can afford enough staff to rack servers, but you aren't dominated by the lack of servers.
You buy more EC2 instances to scale your business, letting the product team deliver the product, and deploy on-prem to squeeze out more efficiency behind them. Yes, you probably don't use the full offerings of a cloud provider, but that also means less lock in at a cost of a little lower convenience and more staff.
In terms of management burden, I will take the cloud infrastructure a thousand times out of ten.
Automation work is easily an order of magnitude less effort in the cloud than on-prem. Labor dollars spent go exponentially further.
Setting up instances, managing updates, adding IAM perms is orders of magnitude faster/easier than dealing with rack-and-stack data centers.
Full downtime isn't that frequent, either. The Kinesis outage on Wednesday didn't affect any of our USE1 functionality, with the exception of maybe Cloudfront propagation.
"The cloud is just someone else's computer" - I used to say that with contempt, and now I say it with understanding.
We've had more issues with the colocation data center, like having one of the good drives removed from a degraded RAID array, and extremely expensive bandwidth.
I could hire four system administrators for that money to look after each of my four dedicated servers full time.
On top of that you first need to learn AWS. That thing is complicated - but I can set up a dedicated server in minutes because it's indistinguishable from my development environment.
I joke that dedicated servers are only expensive if your time and money are worthless. Because AWS is going to eat both.
Quote:
We are going to say we used four hours of labor. This includes drive time to the primary data center. Since it is far away (18-20 minute drive), we actually did not go there for several quarters. So over 32 months, we had budgeted $640 for remote hands. We effectively either paid $160/ hr or paid less than that even rounding up to four hours.
There is no way in the world that the only effort taken related to infrastructure is entirely encapsulated in ~ $200 per year.
You think bad things don't happen on AWS? You think that an experienced admin isn't going to spend significantly more time trying to communicate with an actual human at Amazon than fixing a problem on a physically controlled server?
This is incredibly naive.
I've never had AWS tell us that they can't find the server we're renting that suffered HDD failure. AWS has never cut us off from the Internet because a sysop typed a command in the wrong session.
And we find AWS support to be far quicker to respond, and far more effective, than our previous providers.
So, YMMV, but AWS is pretty damn good for our requirements.
It's a sliding scale. What makes you money? What's your expected growth? What are your security needs? What experience does your team have?
In this thread, people at each end of the scale are 1. conflating different use cases (wordpress is not the same as a massive, scaled data ingest pipeline) and 2. are ignoring the hidden costs of both.
Is it only expensive if developer time is free or if it is, let's say, a quarter of a typical Bay Area senior engineer?
Salary levels vary quite a lot across the world, even looking only at developed countries. At what $/hr level does managing your own infrastructure become more cost-effective?
Which accounts for almost all hosting and startups. Most startups will never see more than a handful of users, and yet microservices, lambda, nosql etc. must be deployed because 'scaling and failover', wasting their own and investors' money for nothing. Only a small % will ever need any scaling and failover.
If AWS were truly cheaper, we would be using it for everything; currently we mix, which is far cheaper. Anything not fail-critical that benefits from fixed-price CPU/GPU, bandwidth, and storage we host traditionally. We and our clients have saved fortunes over the years. And for clients or projects that do not need such heavy lifting as AWS, we simply host traditionally, completely.
But ... most web applications are still IO bound, and labour is surprisingly expensive.
Running something on the open Internet is super annoying. If you properly subnet, or restrict to VPN, I can see it as useful but then you have a whole other host of problems to conquer.
If you're doing a static page or such, no issue, but the second you're hosting a webapp or need to have a database, yeah.
For goodness' sake, just spin up a single server somewhere cheaper and run it all in there, and vertically scale up after you've put in the appropriate monitoring to know when you need to scale up.
And put your money where it's worth it.
I want my team focused on building stuff that makes us money.
I've mentioned before[0] that AWS is too expensive and overkill for Wordpress sites (especially simpler non-ecommerce ones). I don't think this conclusion is controversial. Simpler hosting requirements are why Patrick only has to spend 4 hours of labor in 1 year to upgrade some hardware.
It's when you need the higher-level value-added services of AWS services (Dynamo, Redshift, region failover, etc) that the comparison becomes more complicated.
E.g. Companies with mission-critical transactional websites or mobile backends are more complex and they need agility to add/change the infrastructure landscape in response to unknown workloads. They don't have the money (or expertise) to code an in-house version of AWS services portfolio. E.g.[1]
[0] https://news.ycombinator.com/item?id=10797166
[1] https://www.cbronline.com/news/guardian-aws-migration
older archive: https://web.archive.org/web/20160319022029/https://www.compu...
And again, this is a fraction of what we have in data centers due to the labs and such.
Add Simply Static or a similar plugin, whip up a script to upload the static HTML to S3, set up CloudFront with good caching, and you're done. Your site is faster, more secure, cheaper, and can easily run on a t3.micro for peanuts.
The only gotcha is comments, but it's a solved problem (Disqus or any of the similar alternatives).
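The upload script mentioned above really can be tiny; a sketch with boto3, where the bucket name and output directory are placeholders and the CloudFront distribution is assumed to already point at the bucket:

    # Sketch: push the plugin's static HTML output to S3 with long cache headers.
    import mimetypes
    from pathlib import Path

    import boto3

    s3 = boto3.client("s3")
    bucket = "example-static-site"   # placeholder
    site_root = Path("public")       # wherever the static-site plugin writes to

    for path in site_root.rglob("*"):
        if not path.is_file():
            continue
        key = str(path.relative_to(site_root))
        content_type, _ = mimetypes.guess_type(str(path))
        s3.upload_file(
            str(path), bucket, key,
            ExtraArgs={
                "ContentType": content_type or "application/octet-stream",
                # Let CloudFront and browsers cache aggressively.
                "CacheControl": "public, max-age=86400",
            },
        )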
It would be much more interesting if the comparison were between AWS and dedicated servers. Dedicated servers have all the benefits of colocation, only you don't need to deal with the hardware. The DC takes care of the hardware; all you have to do is administer the server.
I find dedicated servers to be very cost effective. You can get off the shelf servers in minutes and if you need custom hardware that's only a quote away. Most providers will offer month-to-month billing, so there is no lock-in.
Often dedicated servers are even cheaper than colocation when you add up all the costs of hardware, racking, sparing, financing, etc.
I run a file hosting site that serves >500TB of data to about 2 million monthly users on a bunch of dedicated servers that cost me about $500 a month.
If I ran this on AWS it would cost me more than $20,000 every month.
I could hire four system administrators to do nothing but look after each of my four dedicated servers for that money.
It's completely ridiculous.
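As a rough sanity check on that figure, assuming AWS's standard data-transfer-out rates of roughly $0.05-0.09/GB (an assumption; actual tiered pricing varies), the egress alone gets you there:

    # Back-of-envelope: 500 TB/month of egress at typical AWS rates.
    tb_per_month = 500
    gb_per_month = tb_per_month * 1000

    for price_per_gb in (0.05, 0.09):  # rough low/high tiers (assumed)
        print(f"${gb_per_month * price_per_gb:,.0f}/month at ${price_per_gb}/GB")
    # -> roughly $25,000-$45,000/month before any compute or storage,
    #    versus the ~$500/month for the dedicated servers above.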
Equivalent EC2 is $300-600/month with no better reliability. Plus you have to deal with noisy neighbors and bandwidth overages. Build your baseline usage in colo and burst into AWS unless you like buying Bezos more houses.
Pretty sure EC2 instances and EBS volumes have a lot more redundancy than a single server. You really need two colocated servers to replace a single EC2 instance. Still probably cheaper, but also a larger time investment.
If the difference between AWS vs colocation is an additional FTE then AWS is cheaper.
"""We are going to say we used four hours of labor. This includes drive time to the primary data center. Since it is far away (18-20 minute drive), we actually did not go there for several quarters. So over 32 months, we had budgeted $640 for remote hands. We effectively either paid $160/ hr or paid less than that even rounding up to four hours. """
Software and systems configuration-wise, you aren't really going to be doing much different time-wise than you would by doing loop-de-loops with AWS configuration stuff anyway. TFTP etc. is just not that tough.
lol
I don't think I've ever had a non-dodgy packet arrive from Psychz network.
Where's the most affordable place to buy an off-lease server?
I currently have a stock scanning project that I'd like to take online, where I'd need 128GB+ of RAM, 2TB+ of drive space, and 20-60 cores (the more the better).
SQS/SNS, DynamoDB, ECS (EC2 and Fargate), Lambda, Kinesis, ECR, ALB, Aurora, S3, Glacier, EMR, EKS, multiple AZs, multiple regions, etc?
For us to get the same functionality as all of these services in colocated data centers would be an INSANE amount of work, we'd have to hire so many IT/hardware specialists for the networking, let alone hiring data storage tech specialists for stuff like Spark/Hadoop.
Looking at instance costs alone is only 10% of the story here.
Queueing and messaging systems, databases, key-value stores, backup solutions etc. have all been invented pre-AWS, and there are battle-tested solutions for all of that out there, actively being used by companies which did not choose to depend on AWS.
If I move this out of AWS, I need that data replicated for redundancy. I need some meaty servers set up, and I need to maintain the boxes that run the storage and database software. I need to build out the observability software that I get for almost free with Cloudwatch. My database instances need to failover appropriately, which doesn't happen automatically. I need to monitor disk usage and add capacity as the databases grow. I need to set up and manage whatever firewall setup I otherwise get for free with VPCs and security groups.
Managing my own email infrastructure is almost laughable. Spending more than about two and a half minutes on it each month would outweigh the benefit over just running it on AWS.
My time isn't free. I either need to hire someone to build this, I need to learn it and apply it correctly, or find some magic docker stuff that does it for me.
Where is the cost savings of moving out of the cloud for me? If I'm spending all of my time in a SSH session installing kernel updates or diagnosing why my storage cluster is misbehaving, when am I supposed to actually build my product?
Even without getting into multiple availability zones, things like SQS/SNS, Dynamo, S3, and Lambda combine to make things very easy on day one with AWS.
Could it be cheaper? Well, yeah, I'm sure. And if all you're doing is running EC2 instances, go somewhere else. But also, maybe look into some of the other stuff AWS provides!
One of the benefits of things like AWS and GCP is the availability of all kinds of services, but at the same time we are, to some extent, losing control. I have personally had several cases where I was thinking "this thing would be so useful", but AWS/GCP/pick-your-cloud did not have it, and it would have been a hassle to jam that one piece of supporting software in alongside the core services, so it never got taken into use.
It's in "how easily can we script and template all of our services and associated resources". It's in "when we need a highly durable and available queue, how much work/infra do we need".
When you factor all that stuff in, no way it comes out cheaper to self host here. Just the couple full time employees we need to setup and maintain all the ancillary stuff would cost us more than our AWS bill.
SQS is just such an amazing tool in a developer's toolkit that is one of those transformational pieces to how you write software. Combine that with Lambda to consume messages, S3 as your object storage, and Dynamo as your state management, and you unlock the capability of a single engineer to write applications that would have taken teams and teams of people only a few years ago.
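A hedged sketch of that shape of application: work goes onto a queue, a Lambda consumes it, and Dynamo holds the state. The queue URL and table name are placeholders, and the SQS-to-Lambda event source mapping is assumed to be configured separately:

    import json
    import boto3

    # Producer side: drop work onto the queue from anywhere.
    sqs = boto3.client("sqs")
    QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/jobs"  # placeholder

    def enqueue(job_id, payload):
        sqs.send_message(QueueUrl=QUEUE_URL,
                         MessageBody=json.dumps({"job_id": job_id, **payload}))

    # Consumer side: the Lambda handler invoked by the SQS event source mapping.
    dynamodb = boto3.resource("dynamodb")
    table = dynamodb.Table("job-state")  # placeholder table, keyed on job_id

    def handler(event, context):
        for record in event["Records"]:
            job = json.loads(record["body"])
            # ... do the actual work here ...
            table.put_item(Item={"job_id": job["job_id"], "status": "done"})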
Again, does this mean anything to the Wordpress crowd? Probably not. But the ability to bring a new project to market quickly and grow that project is no longer bottlenecked by how fast you can get on your phone to your colo provider on Saturday night.
How many different vendors and licensing schemes would you need to bring into your own data center to get an equivalent amount of functionality going? (I ask rhetorically; this is the equivalent of a few FTEs at the least.)
If you are treating cloud services as purely an apples-to-apples cost comparison then you've missed the point of deploying to (insert your favorite cloud provider here). What you get with cloud services is flexibility and speed. If the OP wants to undertake a new project that they have no idea of the resulting workload or popularity, they need to guess about the required underlying infrastructure. If you miss on your guess, you can either kill momentum for that new project, or you can wind up overprovisioning and paying way way too much on hardware.
I'm not sure how quickly their colo provider could spin up new hardware for this hypothetical new project, but you can assume it's not the dynamism that you get from deploying to an elastic cloud environment. Again, you're paying a premium on day-to-day costs in order to have this freedom to create and deploy. That's what cloud infrastructure is about.
To give an example -- I have a very-CPU-heavy workload that started its life out on prem. As more and more customers signed up, I would go through tiers of adding more hardware to a rack, where my overall margins looked like a sawtooth waveform when I would provision more hardware. I started out moving to EC2 and then finally to ECS/Fargate. If I want to spin up a new piece of functionality for my users, I don't need to provision any new hardware, set out any real new infrastructure, or do anything that you would consider prework in order to get that new functionality deployed. My margins are much much more predictable and I get faster time-to-market on feature development. That's what you're paying the premium for, that ability to just move faster.
AWS is like co-locating your hardware, and then the data center having 1000s of employees offering highly reliable services you can access from your infrastructure that lets you move WAY faster for things that are hard to do.
e.g. Discussion today was increasing log retention. 10 years ago I would have run the numbers, extended some SAN volumes, considered procuring more NetApp shelves, etc.
Today it's simply a cost question: is it worth it to us to store those logs for 10x longer? Sure? Ok, done.
Your small team can _comfortably manage_ 50k/mo worth of AWS resources. That's _INCREDIBLE_.
I think people mistakenly think that cloud costs only go up exponentially and that costs are an unmanageable mess.
I've worked with teams with hundreds of engineers serving a major enterprise and only an AWS bill 3-5x yours. And they only had a small team managing it all. Comfortably.
To do similar with physical servers requires a massive stack of people and salaries. The ongoing recruiting costs to maintain staffing would dwarf what the AWS bill is.
No, no, they wouldn't. The number of people is equal for both; it's just different tasks they have to do.
Also, you'll never get into a situation on AWS where you have a 3-day lead time for a replacement GBIC, or NVMe drive, or something, while your customers scream and bail out for a competitor that isn't down.
Yes, you need to manage the infrastructure, but now it's a software configuration task instead of a physical maintenance one.
There are certainly companies and problems for which colocating still makes more sense than the cloud but 9 out of 10 of these articles completely gloss over most of the costs. This article is one of them.
To raise a more constructive point. Looking at [1] (an earlier post in the series), it looks like their workload is WordPress + VBulletin forums, and "pets" not "cattle" -- I wonder how much more they'd be able to get from separating their stateful and stateless layers more cleanly, and using some of the more powerful (and cheaper) AWS primitives for serving traffic that involves the latter. Why do they need such beefy instances for essentially serving content?
The truth is, it's because the platforms they're using (vBulletin + WP) exist, they're very powerful and they get the job done. This leads me to believe that the operational UX for running these kinds of common applications (which compose most of the internet) on AWS without highly technical supervision is not great.
[1] https://www.servethehome.com/falling-sky-part-3-evaluating-a...
On a large cloud provider, a government notice is enough to remove access to the compute resources and data hosted there. This in essence means the company or individual using these cloud services owns neither the compute resources nor the data hosted on them.
With self-hosted infrastructure, on the other hand, a government notice will not block access to resources you own or to the data on them. You can respond to the notice and continue using those resources until a court rules. So it is more freedom than a cloud can offer.
Obviously, given their large marketing budgets and convenience, cloud providers flourish at the expense of freedom.
Self-hosted infrastructure needs a new renaissance, given what's happening around the world and how governments and large corporations are taking away freedom one bite at a time. Self-hosted compute, networking, and data infrastructure are essential for development and for human freedom.
As far as I can tell, all traffic is being routed through two servers. These servers are running Linux? What happens when a routine software update bricks your routing (it could happen to anyone). Without remote access you need someone to go in and fix this. Do you have after hours access? Is someone nearby on call to respond to complex issues in person? Not all physical problems can be solved by a random colo tech.
Colocation (as pictured) lacks management of a lot of variables that could lead to big problems. When your downtime targets are in minutes per year any incident requiring physical response is unacceptable.
A simple Google search reveals a service philosophy compatible with this hosting[1]:
> As with all web systems, at some point, downtime is required.
For the rest of us there's "cloud" hosting.
[1] https://www.servethehome.com/pardon-dust-upgrades-progress/
The ones comparing AWS to VPSs or even colocation and concluding AWS is overpriced are just being short-sighted. If VMs and bandwidth are your only use case, yes, it is expensive. Skipping the marketing talk, AWS has great services like SQS, S3, Dynamo, SNS, Kinesis, and Firehose, and hosting open-source or paid alternatives to them requires a lot of engineering power, effort, extra monitoring, and restless nights. I prefer our engineers to work on features, not infrastructure, and to only deal with it when it is required financially.
Most organizations are fine with vendor lock-in when the vendor is stable (not deprecating things, not increasing costs). Money is always an issue, but it's relative; there are many factors in a real business. Some can afford cloud, some cannot.
I also agree that you would be fine if you did not use the brand-new X service that does AutoML, or any shiny new over-marketed feature, but comparing AWS to colocation is just one-dimensional thinking.
- the sheer number of services available
- managed services
- high availability, within and/or across regions
- APIs/SDKs for many of their services, for many languages
- monitoring and alerting built-in
- RBAC
- means of grouping and organising resources and subscriptions
I work as an architect in the enterprise space, where those last 2 are pretty important because a lot of different teams are building/using a lot of different systems, and often multiple organisations are involved. High availability of some services is vital, because downtime can affect the bottom line, or mean an entire workforce has to down their tools.

Considering only the first 3 items, most of the non-huge systems I design - and, I would imagine, most non-enterprise/FAANG-scale systems - have only modest availability requirements, very rarely needing cross-region availability. And most of those systems only really need web apps and APIs, proxies/gateways, a database, a message queue, blob storage, and often some form of background processing (VMs, containers, serverless). OK, that's still a fairly long list, but it's a tiny fraction of what the big cloud providers have on offer, and the database is probably the only tricky thing on the list.
Cloud stuff makes sense when you have some combo of the following imho:
* Need for scalability - at one place I worked high-load was 3x low-load so going up and down saved a lot of time
* Rare need for powerful resources (sort of the same as above) - I provisioned a real fatboi on AWS the other day to do some file processing. Worked like a charm and gone in the hour
* Ability to use higher-level primitives - lambda vs. EC2 instances, databricks vs. EMR, IAM/Kinesis all that
* Need for exploration - I could not afford the peak ElasticSearch + Redis + Postgresql that I had but I can trial it
* Need for resilience - If you need to preserve the data/compute for whatever reason, AWS is way easier to get to a good place
* Lack of knowledge on top-tier ops - I've run long-term dedicated servers for almost two decades now, with top uptime in a decade. I have fuck-all knowledge of doing smart colo ops. If you want to keep your team lean and you don't have this to lean on, it's going to sink you.
If you're using hardware as hardware and it isn't changing that much and you have the skillset, you can get a long way with a colo even today.
If hardware breaks in your datacenter, it's an emergency and you better have spare capacity or extra hardware laying around.
$737.19 of transfer out is 8TB per month at standard EC2->internet prices, which is fine; something like 25 Mb/s on average.
$3558.56 for EC2 is something like 20 x m5a.2xlarge ... and that makes no sense to me. That's 160 vCPUs and 640GB of RAM.
How does a WP site like that possibly need that much compute? The whole thing feels like it ought to be a pair of instances plus $20 a month for Cloudflare.
I'm sure STH has more visitors though.
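The arithmetic behind that transfer figure, assuming the standard $0.09/GB EC2-to-internet rate (an assumption; STH's actual rate may differ):

    # Back-of-envelope check on the transfer-out line item above.
    transfer_bill = 737.19
    egress_per_gb = 0.09                     # assumed standard EC2->internet rate
    gb_out = transfer_bill / egress_per_gb   # ~8,190 GB, i.e. roughly 8 TB/month

    seconds_per_month = 30 * 24 * 3600
    avg_mbps = gb_out * 1000 * 8 / seconds_per_month
    print(f"{gb_out:,.0f} GB out, ~{avg_mbps:.0f} Mb/s average")  # ~25 Mb/s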
S3 for backups is the single biggest ease-of-mind feature AWS provides, IMO. I do not want all my data sitting in a single colocation facility without backups.
If I run a database on RDS, maybe it costs more than a bare-metal server, but I can replicate the automatic hourly backups to as many regions as I want. Incredible peace of mind over a self-managed facility.
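A sketch of that cross-region copy with boto3; the instance identifier and regions are placeholders, and it assumes an automated snapshot already exists:

    # Sketch: copy the latest automated RDS snapshot into a second region.
    import boto3

    source_region = "us-east-1"
    target_region = "eu-west-1"
    db_instance = "prod-db"   # placeholder

    src = boto3.client("rds", region_name=source_region)
    dst = boto3.client("rds", region_name=target_region)

    snaps = src.describe_db_snapshots(
        DBInstanceIdentifier=db_instance, SnapshotType="automated"
    )["DBSnapshots"]
    latest = max(snaps, key=lambda s: s["SnapshotCreateTime"])

    dst.copy_db_snapshot(
        SourceDBSnapshotIdentifier=latest["DBSnapshotArn"],
        TargetDBSnapshotIdentifier=f"{db_instance}-dr-copy",
        SourceRegion=source_region,  # lets boto3 pre-sign the cross-region request
    )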
I'm biased, but in my opinion if your infrastructure is on AWS and your backups are on AWS you're doing it wrong. Even accounting for zones and geo-disparity, etc.
Luckily:
ssh user@rsync.net rclone copy s3:/some/bucket /rsync.net/home
... which is made possible by rclone[1], which is excellent, and the fact that rsync.net has rclone built into the remote environment[2] such that you can execute it over SSH.

Fortunately we have tools like borg, which is stable and does your backups encrypted easily.
My mind only rests when my data is encrypted locally before being sent and is stored with different providers and in different regions.
The cost of not trying to learn because clicking seems easy is a cost in itself.
It's still the same app; the load balancer can only send the traffic to one machine. RDS now means DB connections are TCP, not pipes. The EFS is slower than an SSD. So it's slower and costs more.
Why did this happen? Because an external consultant said AWS is what the big players use. I argued that our app needed to be engineered to take advantage of AWS or this would be an expensive waste. But what would I know; we went with the consultants and paid them tens of thousands to set up a bunch of AWS infrastructure I could have run on a $10 a month VPS...
The hardware part is not difficult, if you've built a PC you can set up a rack server (it's arguably easier because they're built to assemble without needing tools), and the software is all Linux on both, no? Or do you mean hiring on-site engineers within driving distance of your rack servers is difficult?
It's also really pretty amazing to be able to spin up instances or services to test with for however many hours you need and then simply release them again.
Once you've got used to that level of freedom and convenience it's hard to go back, everything just moves faster.
For example, we're able to provide GPU instances (https://lambdalabs.com/service/gpu-cloud) at half the cost of AWS's hourly on-demand pricing. How? Because there are huge markups on cloud services.
We've done extensive benchmarking and TCO analysis, and the verdict is in: it's simply less expensive to run on-prem. You're just paying for convenience when using a cloud, GPU or otherwise.
Sources:
- https://lambdalabs.com/gpu-benchmarks
- https://lambdalabs.com/blog/hyperplane-16-infiniband-cluster...
I used to work at a place that would borrow hardware from one client in order to handle the load on another. It was for seasonal stuff, a few days a year when we knew someone would melt down if they didn't have double or triple the capacity, but they didn't need it the rest of the year.
Now I deal in auto scale groups and don't care how many servers we use or need, as long as it isn't wasteful. "Hardware" that isn't ours is so easy to swap out, or to try in a different size or scale, that I'd hate to go back to metal.
The cloud makes you addicted to the "but you have to scale" by selling 2 to 4 vCPUs plus 8 to 16GB of memory for the same price you can get a 12 vCPU with 64 or 128GB somewhere else.
You have to know your business and you need people with facilities skills. Investing in cold aisle containment reduced our PUE quite a bit. You also have to become an expert in logistics to run your own facility, where just almost anyone can click away in AWS and spend money.
Yes, AWS is expensive. And depending on your workload you can certainly save a fair wedge of cash by colo-ing things.
For Wordpress (the platform serving servethehome), it's perfectly possible to run it in a Lambda with aggressive caching and save a whole bunch of cash. Depending on plugins, load, and a number of other things, it could be a potential saving of 90% (plus a massive reduction in attack surface).
Before you ask: yes, I do know from bitter experience; parts of the Financial Times had WP wedged into them. Making them fast and secure was an interesting experience.
AWS is only cost-effective when you are using EC2 with a duty cycle of less than 50%. This means you are not using AWS to host 24/7 compute.
In terms of storage, there are two compelling offerings: S3 and EFS/Lustre. However, if you're a large-scale EFS user, it's better to run your own GPFS system.
The hardest part of "scaling" is orchestration. If you're using K8s, there really isn't much difference between running it on virtual machines vs real steel, barring bring-up scripts (don't get me started on networking in K8s, it's totally warped).
TLDR:
IF you need 100+ machines on 24/7, AWS is going to be more expensive.
If you have transient loads, and the average time on for a machine is 4 hours or less, then AWS is for you. However, you'd better use all the other bits that come with AWS to make it cheaper, like fronting it with Fastly or another CDN.
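The duty-cycle break-even is easy to sketch with made-up but representative prices; both the on-demand rate and the equivalent dedicated price below are assumptions:

    # Rough duty-cycle break-even: rented-by-the-hour vs. always-on dedicated.
    on_demand_hourly = 0.40     # assumed cloud price for a mid-size instance
    dedicated_monthly = 120.0   # assumed price for a comparable dedicated box

    for hours_per_day in (2, 4, 8, 12, 24):
        cloud_cost = on_demand_hourly * hours_per_day * 30
        print(f"{hours_per_day:>2} h/day: cloud ${cloud_cost:>6.0f}"
              f" vs dedicated ${dedicated_monthly:.0f}")
    # With these assumptions the cloud wins comfortably below ~10 h/day
    # and loses badly at 24/7, which matches the TLDR above.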
Using Docker or k8s just doubled the costs over using disk images and a custom load-balancing daemon built on nginx (with third-party modules for changing upstreams without restarts/reloads) and Python, on top of adding more networking complexity (like to the caching/search/RabbitMQ instances). Probably around 20+ machines at peak when I left, but we kept them all below 20% CPU. Sad to see so many places go in the Docker/k8s direction.
I'm so shocked. /s
Let me know when you compare the cost of running all of AWS service offerings self hosted.