This seems very reasonable to me. I thought it was going to be a pitch for on prem, which is also fine for certain scales.
I think generally the scaling steps from startup to megacorp go:
Heroku/Dokku > Public Cloud > Dedicated servers in someone else's DC > Custom Hardware in custom built data centers.
Each makes sense at each scale. I find it to be more of a right tool for the job consideration than one being better than the other.
With modern cloud tooling your infra can also look more or less logically the same once you grow past the Heroku level.
I think Stack Overflow and its siblings proved that having a handful of servers can go a very long way, even making cloud ops redundant.
Of course this is a function of what you're optimizing for, and whether you want to go down the "boring monolithic app" route.
Microservices do add some overhead but it's not extreme - even a microservices-based app can run just fine on a decent bare-metal box if you run all the services on it.
Of course, the talk of microservices brings the question of what problem you're actually trying to solve - are you aiming to build a technical solution to a business problem or are you aiming to create an engineering playground so there's endless busywork and justification for hiring lots of engineers? If it's the latter, then bare-metal is going to be a bad option anyway as it's not the kind of toy a typical startup engineer wants to play with.
There's also something to be said for buying a VPS or a Colo machine, making sure it's backed up and dealing with the 9's that you get from that machine on its own. I am routinely surprised by how far a single node machine will get you.
It costs a lot of money to run your own datacenters, and very very few companies are capable of doing it as well as AWS or even Scaleway/OVH can. By that I mean, waiting weeks/months to get through tickets, approvals, multiple different teams just to get a server deployed. Then waiting a few more weeks for monitoring/backups.
Allowing developers and related to have hardware/software at a whim is a massive advantage.
This. I took the wrong lesson from the DDoS attacks on Linode in late 2015 (particularly the one on Christmas Day), and the intermittent issues I encountered with DigitalOcean and Vultr in 2016 while both providers were still fairly young. A single dedicated server from a mature provider (ideally not during its hyper-growth phase) is pretty reliable.
Many mega-corps are extremely bloated and dysfunctional. Their IT (Private) Cloud teams are slower and less competent.
With public cloud, a small team can be fully responsible for all their resources with crystal clear cost accounting.
The right scale is Amazon, Google, Facebook, Microsoft. Likely much fewer than a hundred companies in the entire world.
Or lagged features.
Or had underlying infra issues.
With AWS in one startup I was able to build and maintain infrastructure where you needed a small team just 10 years before.
Why is it surprising? Building and maintaining custom data centers is a big, slow business initiative. It takes months to years of forecasting to get the data center buildout to match the business needs, as opposed to the extreme flexibility of using a cloud provider.
> There's also something to be said for buying a VPS or a Colo machine, making sure it's backed up and dealing with the 9's that you get from that machine on its own. I am routinely surprised by how far a single node machine will get you.
For personal projects this is exactly what I do. It’s great until something goes wrong with that one machine or VPS.
But it’s not really a good option for any business that needs consistent operations and uptime. Years ago I worked at a company that tried to self-host some of their collaboration tools on a VPS to save money over the cloud-hosted versions. When the server went down it stalled productivity for a day while the team restored a backup, with another week of confusion as we tried to find all of the things that were lost between the last backup and when the server went down.
When someone did the rough estimations on how much it cost to pay everyone’s salaries for that day of lost productivity, the number was far higher than the trivial cost savings we got from self-hosting. We also had a constant background burden on someone internally to maintain and monitor the server, plus the burden of them being on call. Often, moving to cloud anything can be a huge load off the company’s back.
For example:
Datacenter: AWS us-east-2
Dockerized webservers/task servers: Render or Engine Yard
Postgres & Kafka: Aiven or 84codes
Redis: Redis Labs
Unified logging: Elastic or Grafana
I still end up using some underlying AWS services like S3 and lambda, but it's a lot less work than managing an entire AWS ecosystem with security groups/VPC/networking etc.
Thing is that for most/many startups 100k users is not a lot. Rejiggling your basic infra just as your growth is starting to accelerate is a non-trivial task, a risk, and something that doesn't fundamentally move the needle.
Depends; if you're a startup offering free or ad-supported services and the exit plan is "be bought out by existing entrenched competitor", then, yes, 100k users is not enough to hit your goals.
If you're a startup offering B2B services, even 10k users is enough to be madly profitable.
Nowadays, I'd say use fly.io or Render till you have 200k users.
It may work for a simple website, but for any more complicated project with web clients, mobile clients, and third party integrations, migrating from Heroku to a cloud provider to on prem means refactoring big parts of the project.
What is an even bigger problem: a migration like this is hard to do incrementally.
Use a managed PaaS to begin with (you pay more but it does genuinely save you time as there is no management overhead), then when you're ready to do things yourself go straight to hosted bare-metal, and only use public cloud services for their managed services that you can't replicate yourself (think Redshift/Athena/Aurora/etc).
In my experience the maintenance overhead of the cloud is much lower. My dayjob (B2B SaaS) spent about 75% of the infrastructure team’s time on things like patching switch firmware, balancing UPS loads, diagnosing flaky switch ports or transceivers, managing logging growth, etc. None of that made our products better from a customer perspective.
Since our cloud move those same infra staff support many more services and apps with much faster turnaround for product teams. And we traded upcoming multi-million capex investments in servers/switches/appliances into a monthly cloud bill that scales much more closely with revenue.
The public cloud is for businesses constrained by people; we simply could not afford to hire enough people to do the same stuff on-prem or in colo.
- Terraform to create the IAM policies: 4 weeks
Perhaps it's because I am very familiar with the aforementioned tool and cloud but 5 weeks for writing those resources gives me the impression of:
1. Lack of experience on AWS.
2. Lack of experience with Terraform.
3. Both.
I don't want to sound arrogant by any means but a Terraform project for something like that, documented, with its CI and applying changes via CD, would take me 4 days being generous.
I more or less gave up after a month of beating my head on the brick wall. We hired an expert. Took him another month to get it all more or less sorted. There were still aspects that we wanted that we could not get Terraform/GCP to do.
In the end, we dropped Terraform and went back to modifying GCP manually.
I've deployed similar, additionally including GKE, via terraform in a day - Checking TF code for an example 3-env GCP/GKE/CloudSQL stack it's less than 300 LoC
That said, it's not all good - my ongoing complaint with terraforming GCP is that the provider lags behind the features & config available in GCP console - worse than the AWS provider - especially w/r/t GKE and CloudSQL
The only thing that could make that tough is if you put the Lambdas in a VPC. That can get tricky because you have to plan out subnets and whatnot but still not a week.
The AWS documentation is also extremely good with regards to what properties are on each resource. I can't speak for Terraform since I usually use CloudFormation / SAM directly. Maybe it's a Terraform problem?
Yeah, it’s about 20 minutes if you use the VPC and Lambda modules from https://github.com/terraform-aws-modules. I could see a week if you had to learn all of this first with little prior experience but that’s true of everything. A newbie running a Linux colo server isn’t going to get all of the security & reliability issues right in less time, either.
But if someone gave me the same use case as the author, I wouldn’t suggest any of those tools. What’s the business case for introducing the complexity of AWS for someone who is just trying to get an MVP out the door who doesn’t know cloud?
I’ve been in the industry for 25+ years and only first logged into the AWS console in mid 2018. I had a job at AWS two years later. That gives me a completely different perspective.
TF is not Infra as Code; it’s infra as configuration files and it’s a mess.
I haven’t used Pulumi but that’s kind of what I really want. Give me Python and better abstractions to gcloud cli.
Hard disagree.
- On cost: there is almost nothing better for the indie hacker, bootstrapper, or startup than cloud services.
I run apps on all three platforms (Google, AWS, and Azure) and my monthly spend is less than $2.00/month using a mix of free tier services and consumption based services (Google Cloud Run, Google Firestore, AWS CloudFront, AWS S3, Azure Functions, Azure CosmosDB).
- On complexity: if you've used Google Cloud Run or Azure Container Apps, you know how easy it is to run workloads in the cloud. Exceedingly easy. I can go from code on my machine to running API in the cloud that can scale from 0 - 1000 instances in under 5 minutes just by slapping in a Dockerfile _with no special architecture or consideration, no knowledge of platform specific CLIs, no knowledge of Terraform/Pulumi/etc._
The current generation of container-based serverless runtimes (Google Cloud Run, Azure Container Apps) is pretty much AMAZING for indie hackers; use whatever framework you want, use middleware, use whatever language you want. As long as you can copy/paste an app runtime specific Dockerfile (e.g. Node.js, dotnet, Go, Python, etc.) in there, you can run it in the cloud, and run it virtually for free until you actually get traffic.
If any of the projects take off, then pay to scale. If they don't take off, you've spent pennies. Some months I can't even believe they charge my CC for $0.02.
If you're never planning on scaling past a hobby project, the free tier is a great place to stay. If your hobby project "goes viral," though, it might cost you a few thousand dollars, but hopefully that helps you get a lot more money to turn your hobby into a business.
If you have commercial intent, however, $50/month goes from an expensive hobby (3 streaming services) to a very cheap business. At that point, the fact that you don't have to pay for scale on DO VMs and other platforms actually makes a lot more sense. You can sleep at night knowing that you will still have a business even under a load spike, and $50 of DigitalOcean buys you roughly the compute power of $1000+ of AWS managed services.
Google Cloud Run, Azure Container Apps, and AWS AppRunner (less so because it doesn't scale to zero) are really great tools for hobby devs and small shops.
Wish I'd known about this AWS free tier, because that sounds a lot like my monthly DigitalOcean bill :')
So right away that “I can’t believe they even charge my CC for $0.02” is real suspect. Do you have a completely empty AWS account?
We haven’t even spoken about dev experience yet.
The problem is that you're using EC2 instead of AWS App Runner, Google Cloud Run, or Azure Container Apps.
> We haven’t even spoken about dev experience yet.
I'd strongly recommend that you give Google Cloud Run a try. You can go from empty codebase to running, on demand serverless runtime via GitHub with only a Dockerfile. I can build an app from scratch and have it running in Google Cloud in probably under 3 minutes with no special CLI knowledge or build.
Here's a sample Dockerfile I'd need to get a dotnet app into Google Cloud Run:
    # The build environment
    FROM mcr.microsoft.com/dotnet/sdk:6.0-alpine AS build
    WORKDIR /app
    COPY . .
    RUN dotnet restore
    RUN dotnet publish -o /app/published-app --configuration Release

    # The runtime
    FROM mcr.microsoft.com/dotnet/aspnet:6.0-alpine AS runtime
    WORKDIR /app
    COPY --from=build /app/published-app /app
    # The value production is used in Program.cs to set the URL for Google Cloud Run
    ENV ASPNETCORE_ENVIRONMENT=production
    ENV IS_GOOGLE_CLOUD=true
    ENTRYPOINT [ "dotnet", "/app/my-app.dll" ]
Every other aspect of the code remains unchanged. GCR will pull the code from GitHub, build the container, and operationalize it.
Heroku, a VPS or a dedicated server are all in the cloud, not sure what you mean by this.
Sure, I also have plenty of static websites hosted for free by Vercel / Netlify / Heroku / yourpick and even free functions.
As soon as you start hitting traffic, functions start to cost a lot vs your own VPS.
My ideal setup right now is free static hosting from the marketing budget of a friendly SaaS, free Cloudflare on top and then APIs hosted on a small VPS (I have plenty of stuff on DigitalOcean but if I were to start from scratch I'd go fully with Hetzner).
I avoid the big 3 as much as I can and I laugh for hours when I see the bills of clients using them.
- 2 million requests/mo free
- First 180,000 vCPU seconds free
- First 360,000 GiB seconds free
Then:
- $0.000024 /vCPU seconds
- $0.0000025 /GiB seconds
- $0.40 per million requests
This will get you pretty far for $2/mo. Within the free tier itself, assuming you can process each request in 250ms on a 1 vCPU container, you get 720,000 requests before you start paying for compute usage. Each $1.00 is another ~38,000 vCPU seconds (@1 GiB second) or ~152,000 requests @ 250ms per request.
Roughly speaking, $2/mo. is 1 million requests @ 250ms each request consuming 1 GiB seconds on a 1 vCPU container.
(There's some nominal cost for egress and storage of container images).
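To make the arithmetic above easy to check, here is a rough Python sketch of the compute bill using only the prices quoted in this comment. The function name and the defaults (250 ms per request, 1 vCPU, 1 GiB) are my own assumptions for illustration, and it ignores egress and image storage.

```python
# Prices and free-tier allowances as quoted in the comment above (per month).
FREE_REQUESTS = 2_000_000
FREE_VCPU_SECONDS = 180_000
FREE_GIB_SECONDS = 360_000
PRICE_PER_VCPU_SECOND = 0.000024
PRICE_PER_GIB_SECOND = 0.0000025
PRICE_PER_MILLION_REQUESTS = 0.40


def monthly_cost(requests, seconds_per_request=0.25, vcpus=1.0, gib=1.0):
    """Estimate the monthly bill in dollars for a given request volume.

    Hypothetical helper: assumes every request occupies the container for
    `seconds_per_request` and that requests are billed independently.
    """
    vcpu_seconds = requests * seconds_per_request * vcpus
    gib_seconds = requests * seconds_per_request * gib

    # Only usage beyond each free allowance is billed.
    billable_vcpu = max(0.0, vcpu_seconds - FREE_VCPU_SECONDS)
    billable_gib = max(0.0, gib_seconds - FREE_GIB_SECONDS)
    billable_requests = max(0, requests - FREE_REQUESTS)

    return (
        billable_vcpu * PRICE_PER_VCPU_SECOND
        + billable_gib * PRICE_PER_GIB_SECOND
        + billable_requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    )


if __name__ == "__main__":
    for n in (720_000, 1_000_000, 3_000_000):
        print(f"{n:>9,} requests: ${monthly_cost(n):.2f}")
```

Under those assumptions 720,000 requests stay inside the free tier and 1 million requests come to about $1.68, which is in line with the rough $2/mo figure.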
If you are a startup trying to get a product to market, AWS is typically going to be a very small cost unless you are doing something very compute intensive (in which case something like Heroku, which the author recommends, certainly won't be cheaper anyway). The high bills only come later, if ever, after you've decided to create 20 databases and 50 apps for your 70 person startup.
I based it on this CFT
https://github.com/1Strategy/fargate-cloudformation-example/...
And this walk through for C#. I had a similar walkthrough for building a container for a Node service.
https://aws.amazon.com/blogs/compute/hosting-asp-net-core-ap...
That takes a Dockerfile, manages networking, secrets and CI/CD deployment. I have a few quibbles with what it does, but it generally works and is being maintained/updated.
It wouldn't be unusual for a tech lead to pick some approach that ends up being new for the rest of the team. So some ecosystem with fewer choices would probably be faster.
This seems to be flirting with the idea that Amazon has become a required component of hosting an application on the internet.
There are plenty of good alternatives, but AWS is the 800-pound gorilla. You have to know at least a little bit about it in order to know why not to use it.
It's like saying you don't want to use React/Angular/Vue for your web app. There are good reasons not to, but at this point you should at least have some experience with web frameworks before making a technical decision not to use them. If your answer is "I don't know them and I don't want to learn them", that's fine for a personal project, but probably not a reason not to use them at your full-time startup. If your reason is "I know React, but for my specific use case, vanilla HTML/CSS/JS is better" then you are making a more informed decision.
I'd prefer to hire someone dedicated to that and just let them work part time when the environment is simple over a developer with just the basics who's going to try to architect and run everything.
And me thinking we got to the cloud to get rid of the BOFH.
Lol. This may be true but it's kind of pointless, as an API on localhost isn’t very useful unless you’re automating your home. Of course it’s easier to hack something out on localhost than to design for actual users.
I think it makes more sense to build incrementally with the end in mind. So writing those terraform scripts will take less time if you initially write them to deploy to localhost for testing.
Either way, Lambdas are hard to debug locally, often I just deploy them to test (since deploying is easy). Or I write my code such that it bootstraps differently when launched locally vs Lambda. Either way, unless it is a very complex app that has lots of external dependencies, 4 days is a bit much.
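A minimal sketch of the "bootstraps differently when launched locally vs Lambda" approach, in Python. The handler names and the event shape are made up for illustration; the only AWS-specific piece is the `AWS_LAMBDA_FUNCTION_NAME` environment variable, which the Lambda runtime sets and which is normally absent on a dev machine.

```python
import json
import os


def handle(event):
    """Business logic with no knowledge of where it runs (hypothetical example)."""
    name = event.get("name", "world")
    return {"message": f"hello {name}"}


def lambda_handler(event, context):
    """Entry point Lambda would call; wraps the result in an HTTP-style response."""
    return {"statusCode": 200, "body": json.dumps(handle(event))}


def running_in_lambda():
    """True when the Lambda runtime's environment variable is present."""
    return "AWS_LAMBDA_FUNCTION_NAME" in os.environ


if __name__ == "__main__" and not running_in_lambda():
    # Local bootstrap: feed the handler a hand-written event and print the result.
    print(lambda_handler({"name": "local"}, None))
```

Keeping the logic in a plain function means you can exercise it from a test or a REPL without deploying anything, and the Lambda entry point stays a thin wrapper.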
Or try to add more than 100 rules to your ALB, because it’ll be impossible.
My biggest issue with AWS is that the limits are so arbitrary, and seem to solely exist due to terrible design decisions.
If my local express server, or nginx can deal with 100 endpoints, how is it possible for this multi billion dollar infinitely scalable service to not do the same…
When did developing software on your own machine stop meaning "design for actual users"?
You should have a strong and reliable deployment for production, yes. But not being able run a baby instance locally just as easily means sacrificing your development loop.
The article talked about how much time it takes to get working so it seems like the author took shortcuts to get it working locally.
I agree that it’s a good practice to dev so deploying locally works as well as deploying remotely (or to lots of environments). But this is different than developing only for localhost.
Nobody knew what all ran on that server, worse yet nobody knew that particular service ran on it. The person who wrote it was long gone.
It took a day to troubleshoot, a day to figure out what actually happened, and 5 days to get the server back up and running.
A couple months later, someone shut the server down again. It only took three days to fix it the second time.
In order to ensure this would never happen again, there were about 15 meetings, 20 people were involved, and then the service was re-written and hosted on Azure (with the rest of some of our stuff). It's probably failed about 100 times since then, in about a hundred different ways.
This thing was maybe 500 lines of PHP.
Stockholm syndrome à la Big Cloud.
It's okay to be interested in elaborate cloud architecture things and learn them because of that, but don't sell it as one-size-fits-all thing that every little company needs.
Most companies don't need that complexity, but of course, Big Cloud with their billions needs to convince you otherwise.
However, to be cost effective, you need to adapt your application to be more cloud native using their proprietary SDKs. Azure Functions/Lambda, CosmosDB, Blob Storage/S3, etc. The application gets cheaper, but you've now also bought yourself into the ecosystem and you're never migrating anywhere else.
And now the pricing increases. Or the cloud provider decides you shouldn't be a client anymore. Too bad. No easy way back.
There is still not much wrong with a webapp on a VM. You still need sysops, except classic sysops instead of cloud certified sysops.
OTOH you can pick a managed database: you just get a connection string to a Postgres with failover and backup already taken care of. Same with queue services, email services, etc. They have really simple APIs.
You only need platform-specific knowledge when you start operating at a larger scale. By that time, you likely can afford to hire a dedicated SRE.
You can replace "couple VMs" with a dedicated Hetzner/OVH/Kimsufi server, it'll be the same except you won't get ripped off on egress bandwidth and performance.
The author's not wrong. Cost comes with lack of accountability in my experience. In turn, my devsecops dept (~20 people) has kept costs down by holding monthly AWS accountability meetings. "Who owns this and why does it exist?" is the leading question.
> The all-you-can-eat buffet problem
Valid point. But I've gotten far in my career by specializing in AWS. It's not going anywhere soon. It's the one cloud provider I would say you should go all-in on. Azure maybe next. GCP? Come on. Conversely, I just got an email from Heroku saying they're retiring one of my free-tier databases that I still use.
> Culture of simplicity eats strategy of complexity for breakfast
Orgs, please retire this saying. I hear this everywhere. It's lost a lot of meaning. Just spell out what your org does better than the rest of the pack.
What's wrong with GCP?
That being said, in my personal life if someone ever came to me and said that they were starting a project from scratch or even if I were starting a hobby project from scratch where I saw Lambda + DynamoDB wasn’t the right answer, I would just use Lightsail and simple monolithic application using whatever frameworks are appropriate that I already knew.
AWS Lightsail is a simple fixed priced VPS. I’m not advocating using Lightsail over another VPS provider. It would just be my preference because I know how to transition to full fledged AWS later.
I come from a .NET background, so feel free to reach out.
I’m going to speak as an on-the-ground, hands-on-keyboard implementation person.
AWS offers plenty of hosted versions of open source solutions and API compatible services like DocumentDB with Mongo compatibility.
If I’m working with a customer that prefers the open source solution and there is an equivalent on AWS, I’m going to suggest that. My goal is never to introduce too much new technology to an organization unless there is a compelling need.
I’ve recommended everything from a straight lift and shift, to hybrid, to full on all in on AWS depending on the use case. I’m not dogmatic and I’ve never been told “get the customer all in on our services so we can lock them in”. I’ve implemented pure AWS CI/CD solutions, integrated with Azure DevOps, done lift and shifts with Jenkins, etc.
I’m judged completely by outcomes and whether the customer is satisfied.
But I’ve been railing against worrying about “lock-in” way before coming to AWS. I’ve been part of numerous large scale migrations and implementations. If you’re at any scale, you’re always both technically and organizationally “locked in” to your infrastructure choices and migrating involves dealing with CxOs, PMO, retraining, security, regressions, etc. It’s usually much easier to just have a conversation with your account manager.
I need to do GPU inference but I don't want to run the machine 24x7. I may use it for about 4 hours per day at best. Lambda doesn't offer GPUs and neither does ECS+Fargate.
It seems like I could set up an endpoint using Sagemaker and then destroy it when no longer needed, and automate all of this but it feels quite messy.
The other route is perhaps I can launch an instance every day with ECS and then get rid of it.
All these routes seem quite inefficient. There seems to be something called Elastic inference where I can provision the right amount of GPU resources - but it seems like I'll need a spare EC2 instance to do that if I'm not mistaken, which is not ideal either.
I guess all this stems from the fact that there is no straightforward virtualization for GPU workloads and so they have to provision them 1:1 which currently they are not equipped to do.
Has anyone run into a similar problem and found a more elegant solution? All of the above are very messy. Is there some obvious choice I am missing?
SageMaker might have an abstraction which is a closer fit for your particular use-case, but I'd be wary of potential cost excesses; running on raw EC2 and automating the lifetime somehow is inevitably going to be the cheapest route.
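A minimal sketch of the "automate the lifetime" idea: a pure-Python helper that decides whether the GPU instance should be up right now, plus the duty-cycle arithmetic. The hourly rate below is a made-up placeholder, not a real EC2 price, and the actual start/stop calls (whatever SDK or scheduler you use) are deliberately left out.

```python
from datetime import datetime, time


def should_be_running(now: datetime, start: time, end: time) -> bool:
    """True if `now` falls in the daily [start, end) window.

    The window may wrap past midnight (e.g. 22:00 to 02:00).
    """
    t = now.time()
    if start <= end:
        return start <= t < end
    return t >= start or t < end


def monthly_cost(hourly_rate: float, hours_per_day: float, days: int = 30) -> float:
    """Cost of running an instance `hours_per_day` hours a day for `days` days."""
    return hourly_rate * hours_per_day * days


if __name__ == "__main__":
    HYPOTHETICAL_RATE = 1.0  # dollars per hour, placeholder only
    print("4h/day:", monthly_cost(HYPOTHETICAL_RATE, 4))
    print("24x7:  ", monthly_cost(HYPOTHETICAL_RATE, 24))
    print("up now?", should_be_running(datetime.now(), time(9), time(13)))
```

Whatever the real rate is, 4 hours a day is one sixth of the always-on bill, so a cron job or scheduled function that compares this helper's answer with the instance state and starts or stops it accordingly may be all the automation needed.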
Yes, there’s a steep learning curve. But once you’re past that (or if you gained that knowledge in a prior role) AWS can easily hands down be the easiest, cheapest, and fastest infrastructure platform to use.
…if you know what you’re doing.
If you don’t know the ins and outs of AWS, then yes, you probably shouldn’t use it for your next MVP or startup idea.
We’ve found at work that if you already have the talent, the hyper scale cloud platforms are amongst the most expensive ways to manage infrastructure if you go all in.
For example $0.40/secret/mo is _expensive_ compared to the cost of an HA vault (not necessarily Hashicorp) setup. If you have 1,000 secrets but you only need to access any given secret once a day, that’s a lot of expense against just setting up your own. And then you can take it with you.
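The back-of-the-envelope version, as a sketch. The $0.40/secret/month figure is from the comment above; the self-hosted monthly cost is a hypothetical input you would have to estimate yourself (servers plus the time to run them). Amounts are kept in integer cents to avoid floating-point rounding.

```python
MANAGED_CENTS_PER_SECRET = 40  # $0.40 per secret per month, as quoted above


def managed_monthly_cents(secret_count: int) -> int:
    """Monthly storage bill, in cents, for a managed per-secret price."""
    return secret_count * MANAGED_CENTS_PER_SECRET


def break_even_secret_count(self_hosted_monthly_cents: int) -> int:
    """Smallest secret count at which the managed bill reaches the self-hosted cost.

    `self_hosted_monthly_cents` is an assumption supplied by the caller.
    """
    # Ceiling division using integers only.
    return -(-self_hosted_monthly_cents // MANAGED_CENTS_PER_SECRET)


if __name__ == "__main__":
    print("1,000 secrets, managed: $", managed_monthly_cents(1000) / 100)
    # Hypothetical: a self-hosted setup that costs $100/month all-in.
    print("break-even at", break_even_secret_count(100_00), "secrets")
```

At 1,000 secrets that is $400 a month, or $4,800 a year, and the per-secret price does not depend on how rarely each secret is read.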
Beyond that, we’ve had a LOT more reliable performance from our current VPS provider than we ever got from EC2.
That’s not to say AWS is exactly without competition. We use S3 extensively because nothing compares for our usage.
Sure, if you're in a startup and you're doing most of the infrastructure and operational work yourself then working on-premise is often advantageous. If, like me, you're working for a Fortune 200 company and it takes multiple ServiceNow tickets to get on-prem hardware, a lead time of several months to get it through procurement and subsequently racked and stacked, and working with infrastructure solution engineers throughout the process - trust me, AWS is a much better choice and will enable your team to get stuff done.
If you are working for a startup then beware, as you grow avoid the temptation to build a data center - go to the public cloud. I would argue since that's where you're going to be hosted anyway - assuming your successful growth - then you should really consider just starting out there in the first place.
What's stopping them, after they "embrace the Cloud," from making it take multiple ServiceNow tickets and several months to change an IAM policy? This has been my experience in very large corps that do use AWS. Typically it's also made a violation of policy to use a team-specific cloud account.
P.S. After having helped a mid-sized company migrate some core functions from DC to cloud, I agree with your startup advice.
“I didn’t get into the cloud to avoid administering servers. I wanted to avoid server administrators”
I thought that one day I’d interview someone who wrote a post like that and figure out for real. But I’ve been waiting decades so it probably won’t happen.
“Everything you know is wrong.” Yadda yadda yadda
I wouldn't be surprised if in some cave somewhere there's a message on the wall like "Think Oogie is best stone tool maker? Think again!"
I think Fargate + Docker is super easy to set up, run and maintain. Maybe Heroku makes it a little bit easier, but that's about it. Once you leave the Heroku ecosystem you'll have lost all the time you saved.
I'm not convinced.
Fargate, image repo, NAT gateway, I don't even remember all the nonsense at this point but it was ridiculous.
I took a look at a few alternatives and DigitalOcean's App service was night and day easier, faster, and cheaper to deal with.
This is the only line that actually matters. Are there better and cheaper options for your organization? Almost certainly, but no one ever got fired for picking AWS.
It’s also nowhere near as bad as Concur/Oracle/IBM. Though that’s a fairly low bar.
I’m building a PBBG and have it on AWS. Built the site in .NET with SignalR, MartenDB/PostgreSQL. Needed to host it.
2 evenings to write a cloudformation script which builds a VPC. Public and private subnets. RDS. A tiny instance to act as a nat gateway (for servers in private subnet). A small server with HAProxy for load balancer. A tiny server for redis. 2 web servers. Plus some sh scripts to build the project. Zip. Ssh jump to the server and deploy. Total cost is like $42/m.
Used a tiny instance for nat gateway cos aws nat gateway costs $32+ingress. Used tiny instance for redis as it’s only used for signalr connections across servers and services. Tiny instance for HAProxy cos aws application load balancer is $15/m. I could consolidate 3 of the servers into 1 but I choose not to.
You can build flexible things on AWS. The problem with AWS is it’s very easy to just spin up random stuff and not care about cost and blow out a budget quickly.
AWS NAT gateway is certainly pricey. And doesn't even support NAT traversal which means P2P/STUN stuff is out (this is in contrast to GCP which does support it).
C'mon AWS you can do better!
- Time it took to write that API on localhost: 4 days
- Time it took to learn Terraform and some AWS services, to create the API gateway, database, lambdas, queues, Route 53 records: 1 week
- Time it took to learn AWS IAM policies to create the IAM policies in Terraform: 4 weeks
Author conveniently left out a few bits of information. Once you learn IAM and Terraform, I'm doubtful it takes you another 4 weeks to set up the policies for a new project.
Start on Heroku, maybe with your own RDS. This removes so many decisions and ongoing overhead and lets you focus on building the thing that actually delivers value.
- Yes, if you don’t have AWS, Azure or GCP experience it can be hard. Harder than it should be. But this is why I try to make things simple. Run node / express in lambda. Use managed services. Use CDK so the IaC abstraction is easier. Definitely not 4 weeks for the IAM policies.
- You get tons of credits as a VC backed startup (in all providers) so cost is not that much of an initial issue
- Yes you need to pay attention to expenses, set up budgets and budget alarms, and run cost optimizers often
AWS is a fool's errand at the startup level unless you need some of their specialised services. Stick to "tier 2" cloud providers like DigitalOcean or Linode. If all you need is servers, database and storage, then don't waste your money on the major cloud providers. They are the wrong choice for basic compute.
At Netflix more than 10 years ago, it was more like this: a single engineer builds a deployment/management tool: 1 - 2 months. Every other engineer creates a new and fully configured cluster: minutes.
Seriously, can we please get over the fetish of using this DSL, that YAML, or whatever "specification language"? Such tools are powerful, flexible, but should not have a place for engineers who just want to provision resources. The tools violate almost every UX principle, in particular the following:
- Discoverability. Very little. One has to read tons of docs and SO posts to figure out what needs to be done. You want to pass in some environment variables? A typical answer from those who use Nomad/TFE: easy, just pass in these 200 lines of Jinja template. Really? Really? You call this ease of freaking use?
- Affordance. None.
- Constraints. Only if you count the errors you get after you submit your 1000-line YAML scripts.
- Consistency. Maybe, but still, embedding a Jinja template to pass in a variable is an insult to UX.
It's an unrepentable sin to ask me to learn your shit.
In fact, I've already migrated a fair chunk of my workload off AWS Lambda onto constantly running fly.io VMs.
It's significantly cheaper than serverless (when you're past the free tier), the servers just restart if they crash (as opposed to running up a six figure AWS bill), and it's less complicated operationally (it's just a VM, less need to pipe messages with SQS, figure out IAM, etc)
A more useful article would actually walk through the cost/complexity trade-offs.
If that's starting to be not enough, then consider yourself lucky, and start scaling.
"Premature scaling is a root of all evil" (and a good source of profit for Amazon, but I am repeating myself).
While this is true-ish (finding lost resources, or just not creating them in the first place, is not that complex), when the day comes that you need or want to move to one of the cloud providers, you will need a good DevOps engineer to handle that hybrid cloud environment and make the transfer as painless as possible. Cross-cloud routing, DB migrations, and not to mention setting up secure access for all of it is less complex in my POV than cost-managing your cloud account.
I can understand why AWS would try to steer people away from Kubernetes since it has a commoditizing effect. However, it could end up steering people away from AWS entirely.
If you're familiar with AWS, use it and get running and focus on delivering product instead of fine-tuning all the settings and worrying about the perfect cloud environment.
In my honest opinion, using third-party tools/services on top of AWS only creates tech debt. Over the last two years, AWS has improved a lot of their tools to the point that any developer can use them. Developers should be able to understand how their applications run in the cloud provider as well.
Step 1: Write an incendiary title.
Step 2: Make definitive statements meant to be applied broadly but actually targeted at a specific situation that the author is experiencing, or that come from the author's own problems with something.
Step 3: Make sure to cast doubt and insecurity, making somebody else feel like they made the wrong choice, even when everything is working perfectly.
Step 4: Street cred improved!
If you hire people who don't know how the cloud works, then of course their time will be sucked up by learning how it works. If you hire people who know how the cloud works, it is a productivity multiplier. Use the tools that you know.
BUT. If you have to build a giant wooden sailing ship, and all you know how to use is a Swiss army knife... and you want to get that ship done this century..... you need to learn new tools.
Compute and egress costs can be prohibitive at scale, but features like storage + BigQuery (an OLAP SQL DB where you only pay for queries) are basically free for low-to-moderate volume workloads.
My response is "yes but what about databases". There are any number of ways of hosting my application - "put it on a VM" is a perfectly reasonable approach, particularly as my preferred platform is Elixir which is pretty monolithic anyway. That's fine for the app, but what about the DB? I think I know just enough to know that hosting Postgres properly (i.e. reliably and performantly with appropriate backups) is not that easy, and I'd like someone else to do it for me.
For my startup I used App Engine (Flexible Environment) and Cloud SQL. That worked well in that I had the "two instances behind a load balancer" that you want for seamless upgrades, and managed SQL without having to delve into all the many Google Cloud services for networking etc. Everything else was just 'more Elixir', which is easy to test locally.
The first part I think most anyone would agree with.
The 2nd part doesn't make sense. AWS, Google, etc. all have simple ways to set up a web app & database without messing with containers, microservices, event architecture, etc.
As for cost, they all offer generous free tiers for learning & hobby projects.
If you want to make a business out of it, check out their also very generous programs for startups with over $100,000 of free services. Azure is over $150k.
https://aws.amazon.com/activate/
https://www.microsoft.com/en-us/startups?rtc=1
https://inthecloud.withgoogle.com/startup/dl-cd.html
Digital Ocean, Cloudflare and many others also offer great incentives to get you to build on their platform for cheap or free in hopes that you succeed & stay with them.
Their CloudFront service is pretty awesome.
You just put it between your server and your users and BOOM, all your static content gets delivered lightning fast via Amazon's worldwide CDN.
And it costs like $10 per month for millions of requests.
> Using the shiniest new technologies is rarely the cause for success, it's usually the result
> Culture of simplicity eats strategy of complexity for breakfast
Until you get some unicorn that says "we are profitable at price $X and all of our competitors are losing money at $X + $Y, and it's because of our software architecture and infrastructure choices", nobody is going to be convinced.
I don’t know why the author singles out Lambda. For many use cases, the ongoing maintenance of Lambda functions is close to zero.
Less than $5/month. Yes, on AWS. Serverless (the genuine kind, which scales to zero with pay-per-request) is pretty much free until you have actual users, and once you have actual users, you have actual revenue to pay your cloud bills. Unless your ($revenue / $hosting_costs) is less than 1.0, in which case, you don't have a business.
> "(LISP) programmers know the value of everything and the cost of nothing". A specific technology product never exists in a vacuum; it has to communicate and co-exist with other components in the system. There are costs associated with every choice, often hidden costs.
An odd choice of quote, considering the author is promoting choices that cost orders of magnitude more money in the earliest stages, and that inevitably incur high migration costs when it comes time to move off those platforms.
> Cultivate a culture of ruthlessly fighting complexity
Again, an odd claim. Stacks like AWS Lambda and DynamoDB let me forget about scaling concerns* (asterisk because this is true in the early stages, slightly less true later, but still mostly true compared to traditional architecture). Those concerns absolutely rear their head when handing off to a site like Render that refuses to publish pricing for their largest database instances, or to talk about very common use cases like read replicas for analytics workloads.
> the harsh truth is that neither Lambda Functions, nor Kubernetes, nor Kafka on their own will magically make your app work correctly, be performant and deliver value.
But Redis, PostgreSQL, and PaaS-style service deployment magically will? You mean, early startup CTOs need to actually think about the architecture they propose to build to satisfy business needs? gasp
> "Why do we think this choice will provide the most value for users compared to the alternatives?"
Because serverless means not needing to hire DevOps.
Because most companies running Kubernetes do not get anywhere near the ~38+% efficiency (last time I ran the numbers, and that's for production environments, not even including staging/testing/development environments) they need to make Kubernetes more cost-efficient than AWS Lambda. Developers just don't have time to figure out why the hell their services need a guaranteed vCPU and won't perform with less while using less than 20% of the resources they requested, and they particularly don't have time to figure it out when Customer Support is happy, Product is happy, and Finance will cough up whatever budget is needed so long as Engineering says that it's "necessary".
Because founders who actually think about optimizing for value will optimize for what is scarce, and what is actually scarce is not money (there is plenty of money out there looking for the right investment opportunities that check all the right boxes), it is people. Serverless means hiring fewer people because you hand off undifferentiated heavy lifting.
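For anyone who wants to rerun that kind of efficiency number with their own prices, the underlying arithmetic can be sketched as follows. The model is an assumption on my part, not necessarily how the ~38% above was derived: reserved capacity is paid for whether or not it is used, so its effective price per unit of useful work is the list price divided by utilization, and the break-even utilization is the ratio of the reserved unit price to the pay-per-use unit price. All prices in the usage lines are placeholders.

```python
def effective_unit_price(reserved_unit_price, utilization):
    """Price per unit of work actually done on reserved capacity."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return reserved_unit_price / utilization


def break_even_utilization(reserved_unit_price, pay_per_use_unit_price):
    """Utilization above which reserved capacity beats pay-per-use."""
    return reserved_unit_price / pay_per_use_unit_price


# Placeholder prices in arbitrary units: reserved compute at 1.0 per unit,
# pay-per-use compute at 4.0 per unit of work.
needed = break_even_utilization(1.0, 4.0)
at_half_load = effective_unit_price(1.0, 0.5)
```

With these placeholder prices a cluster has to stay above `needed` utilization to come out ahead; a service using a fifth of what it requested would not, under the same assumptions.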