I currently have distilled, compact Puppet code to create a hardened VM of any size on any provider that can run one or more Docker services, run a Python backend directly, or serve static files. With this I can create a service on a Hetzner VM in 5 minutes whether the VM has 2 cores or 48 cores, control the configuration in source-controlled manifests, and monitor configuration compliance with a custom Naemon plugin. A perfectly reproducible process. The startup kids are meanwhile doing snowflakes in the cloud, spending many KEUR per month to have something that is worse than what devops pioneers were able to do in 2017. And the stakeholders are paying for this ship.
I wrote a more structured opinion piece about this, called The Emperor's New Clouds:
This does not happen with Puppet + Linux, because LTS distributions have a long release cycle where compatibility is not broken.
I tried to explain this topic in the article linked above. Not sure how far I succeeded.
One thing did pinch me, though: security.
- With cloud threats, everything needs to be constantly up to date. Docker images make this easier than permanent servers that need to be upgraded. We used to upgrade every week; now we're upgraded by default. So yes, sometimes our images don't start with the latest version of xyz. But this is rare, downgrading is easy with Docker, and reproduction on a dev machine is easier.
- With cloud threats, everything needs to be isolated. Docker makes it easy to have an Alpine image with no executables beyond the strictly necessary, and ports open only to the required services.
I hate the cloud because 4GB/2CPU should be way enough to run extremely large workloads, but I had to admit that convenience made me switch.
But there's a middle ground here too. To me there's a HUGE gap between Kubernetes distributed systems and shell script free for all.
It was a big reason why we moved to containers at the bare minimum: it's quick and easy to spin up and destroy, and you are guaranteed that what runs locally runs on prod. No more "well, it worked on my system!".
The bad cloud infrastructure is when people try to use every single thing AWS sells and their whole infrastructure is at super high levels of abstraction that they could never migrate to another platform. K8s isn't that at all.
While I am a happy cloud infrastructure user in private, I have to go through some extra hoops to deploy applications at work, regardless of if k8s is used or not.
However, I ran kubeadm on a hetzner server and it's just sat chugging along forever basically. I use the cluster to run ephemeral apps where I build and deploy 1 golang service, a couple of node services in about 60 seconds ( with cache, obviously ).
As someone old enough and skilled enough to do the same with Puppet: why bother, when it's so simple and easy that even the kids who don't understand TLS can do it with k8s?
With k8s you get a way of saying 'WHAT YOU WANT' without 'HOW TO DO IT', and this applies not only to the actual infra but to the people maintaining it too. Any cloud platform or devops engineer worth their salt can maintain a k8s system. Good luck finding someone who understands what that custom Naemon plugin is doing.
How do you control access to this setup?
How do you deploy on a different provider to Hetzner?
How do you access logs on this setup?
How do others maintain this setup?
How do you run backups?
How do you run cron jobs?
How do you deal with an offline node?
How do you expose a new ingress?
How do you provision extra storage on this setup?
If any of those is answered with 'something homegrown' or 'just write a script' then you have all the reasons k8s is worth it.
If I was to write an idempotent script for each native resource I would finish in some years :-)
You chose whatever monitoring system you like the most.
For offline nodes you use whatever the level of criticality of your node justifies. This is something people struggle to understand: not every business needs 99.99% uptime. That said, I never had a downtime in Hetzner. On Digital Ocean I had one short forced reboot in 4 years. YMMV, so protect yourself as much as necessary.
Deploying on a different provider than Hetzner is the same as deploying on Hetzner, except the part of launching the machine, which is trivial to script. The added value is in making the machine work, and Ubuntu/Debian/RHEL are the same everywhere. You don't have vendor lock-in with this.
If K8s works for you, enjoy it. Nobody is telling you to stop :-)
- https://github.com/kube-hetzner/terraform-hcloud-kube-hetzne...
- https://www.hetzner.com/hetzner-summit --> "Managed Kubernetes Insights and lessons learned from developing our own Kubernetes platform"
You mentioned a Python backend, so literally just replicate the build script directly on the VPS: `pip install -r requirements.txt` > `python main.py` > `nano /etc/systemd/system/myservice.service` > `systemctl start myservice` > Tada.
You can scale instances by just throwing those commands into a bash script (build_my_app.sh); that's your new Dockerfile. Install on any server in xx-xxx seconds.
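The systemd step above can be sketched as a unit file; the service name, user, and paths here are placeholders, not anything from the original post:

```ini
# /etc/systemd/system/myservice.service -- hypothetical names and paths
[Unit]
Description=My Python backend
After=network.target

[Service]
User=myservice
WorkingDirectory=/opt/myservice
ExecStart=/opt/myservice/venv/bin/python main.py
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Then `systemctl daemon-reload && systemctl enable --now myservice` starts it and keeps it across reboots.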
For Python backends I often deploy the code directly with a Puppet resource called vcsrepo, which basically places a certain tag of a certain repo at a certain filesystem location. And I also package the systemd unit files for easy start/stop/restart. You can do this with other config management tools, via bash, or by hand, depending on how many systems you manage.
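A minimal sketch of that pattern with the puppetlabs/vcsrepo module; the repo URL, tag, and paths are made up for illustration:

```puppet
# Pin a tagged release of the app onto the filesystem (hypothetical names).
vcsrepo { '/opt/myapp':
  ensure   => present,
  provider => git,
  source   => 'https://example.com/myapp.git',
  revision => 'v1.4.2',
}

# Ship the unit file and keep the service running.
file { '/etc/systemd/system/myapp.service':
  source => 'puppet:///modules/myapp/myapp.service',
  notify => Service['myapp'],
}
service { 'myapp':
  ensure  => running,
  enable  => true,
  require => Vcsrepo['/opt/myapp'],
}
```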
What bothers me with your question is Pip :-) But perhaps that is off topic...?
We have a simple cloud infrastructure. Last year, we moved all our legacy apps to a Docker-based deployment (we were already using Docker for newer stuff). Nothing fancy—just basic Dockerfile and docker-compose.yml.
Advantages:
- Easy to manage: we keep a repo of docker-compose.yml files for each environment.
- Simple commands: most of the time, it’s just "docker-compose pull" and "docker-compose up."
- Our CI pipeline builds images after each commit, runs automated tests, and deploys to staging for QA to run manual tests.
- Very stable: we deploy the same images that were tested in staging. Our deployment success rate and production uptime improved significantly after the switch—even though stability wasn’t a big issue before!
- Common knowledge: everyone on our team is familiar with Docker, and it speeds up onboarding for new hires.
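A setup like that can be as small as one compose file per environment; the image name, registry, and port below are illustrative, not from the comment:

```yaml
# docker-compose.yml -- hypothetical image/registry
services:
  app:
    image: registry.example.com/app:1.42.0   # same tag that passed staging
    ports:
      - "8080:8080"
    env_file: .env.production
    restart: unless-stopped
```

Deploying is then just `docker-compose pull && docker-compose up -d`.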
I have found that going all-in with certain language/framework features, such as self-contained deployments, can allow for really powerful sidestepping of this kind of operational complexity.
If I was still in a situation where I had to ensure the right combination of runtimes & frameworks are installed every time, I might be reaching for Docker too.
For example, if you have a program that uses wsgi and runs on python 2.7, and another wsgi program that runs on python 3.16, you will absolutely need 2 different web servers to run them.
You can give different ports to both, and install an nginx on port 80 with a reverse proxy. But software tends to come with a lot of assumptions that make ops hard, and they will often not like your custom setup... but they will almost certainly like a normal docker setup.
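The reverse-proxy arrangement described here might look roughly like this; the ports and server names are assumptions:

```nginx
# Two WSGI apps on local ports, multiplexed by nginx on port 80.
server {
    listen 80;
    server_name legacy.example.com;
    location / { proxy_pass http://127.0.0.1:8001; }  # the Python 2.7 app
}
server {
    listen 80;
    server_name new.example.com;
    location / { proxy_pass http://127.0.0.1:8002; }  # the Python 3 app
}
```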
Forget about "clunky overhead": the running costs are < 10%. The Dockerfile? You don't even need one. You can just pull the Python version you want, e.g. python:3.11, and git pull your files from inside the container to get up and running. You don't need to use container image saving systems, you don't need to save images or tag anything, you don't need to write setup scripts in the Dockerfile, and you can pass the database credentials through the environment option when launching the container.
The problem is that after a year or two you get clashes or weird stuff breaking, and modules dropping support for your Python version, preventing you from installing new ones. Case in point: Google's AI module (needed for Gemini and lots of their AI API services) only works on 3.10+. What if you started in 2021? Your Python, then cutting edge, would not work anymore, only 3.5 years after that release. Yeah, you can use loads of curl. Good luck maintaining that for years though.
NumPy 1.19 is calling np.warnings, but some other dependency is using NumPy 1.20, which removed .warnings and made it .notices or something.
Your cached model paths for transformers changed their default directory.
You update the dependencies and it seems fine, then on a new machine you try and update them, and bam, wrong python version, you are on 3.9 and remote is 3.10, so it's all breaking.
It's also not simple in the following respect: your requirements.txt file will potentially have dependency clashes (despite running code), might take ages to install on a 4GB VM (especially if you need pytorch because some AI module that makes life 10x easier rather needlessly requires it).
Life with Docker is worth it. I was scared of it too, but there are a few key benefits for the everyman / solodev:
- Literally docker export the running container as a .tar to install it on a new VM. That's one line and guaranteed the exact same VM, no changes. That's what you want, no risks.
- Back up is equally simple; shell script to download regular back ups. Update is simple; shell script to update git repo within the container. You can docker export it to investigate bugs without affecting the production running container, giving you an instant local dev environment as needed.
- When you inevitably need to update python you can just spin up a new VM with the same port mapping on Python 3.14 or whatever and just create an API internally to communicate, the two containers can share resources but run different python versions. How do you handle this with your solution in 4 years time?
- If you need to rapidly scale, your shell script could work fine, I'll give you that. But probably it takes 2 minutes to start on each VM. Do you want a 2 minute wait for your autoscaling? No you want a docker image / AMI that takes 5 seconds for AWS to scale up if you "hit it big".
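The export/import flow from the first bullet, sketched with hypothetical container and image names (note that `docker import` keeps only the filesystem, not the image metadata, so the start command has to be restated):

```
# On the old VM: dump the container's filesystem to a tarball.
docker export mycontainer > mycontainer.tar

# On the new VM: turn the tarball back into an image and run it.
docker import mycontainer.tar myapp:migrated
docker run -d --name mycontainer -p 8000:8000 myapp:migrated /app/start.sh
```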
Sorry, but you've got no idea what you're talking about.
You can also run OCI images, often called Docker images, directly via systemd's nspawn. Docker doesn't create overhead by itself; at its heart it's a wrapper around kernel features and iptables.
You didn't need docker for deployments, but let's not use completely made up bullshit as arguments, okay?
If you use it as IaaS, it's a lot quicker to get prototypes working than if you use anything else, including VPSes from other providers.
Google Cloud in particular has very few vectors for lock-in, and follows the principle of least surprise more closely.
But once you have prototyped, you should ask the question about rebuilding it somewhere that is cheaper.
Near infinite scalability of disk drives is nice, and snapshotting, and cloud in general can allow you to extend your prototype into taking production load and allowing you to measure what you will need; but leaning in to "cloud magick" (cloud run, lambdas, etc) will consume almost as much time to learn and debug as just doing it the old school way anyway. In my lived experience.
The biggest problem is the so called cloud native stuff which is both more expensive and more complex. There are contexts where it makes sense but for startups they are doing more harm than good.
Two examples that I came across
- "Tested" means it passed on CI. Failing to run tests locally? Who does development locally anyway?
- Teams so reliant on "AI" because this is the future of coding. "How to sort a list in python" became a prompt rather than a lookup in the official documentation.
I’m still very much an ansible noob, but if you have a repo with playbooks I’d love to poke around and learn some things! If not, no worries, I appreciate your time reading this comment!
While I absolutely agree with you and your approach, would you mind elaborating what kind of configuration compliance you are referring to in this statement? I suppose you do not mean any kind of configuration that your Puppet code produces as that configuration is "monitored", or rather managed, by Puppet.
This case is actually pretty simple.
Puppet applies the configuration you declare idempotently when you run the Puppet agent: whatever is not configured gets configured; whatever is already configured remains the same.
If there is an error the return code of the Puppet agent is different from that of the situations above.
Knowing this, you can choose to trigger the Puppet agent runs remotely from a monitoring system (instead of periodic local runs), collect the exit code, and monitor the status of that exit code inside the monitoring system.
Therefore, instead of having an agent that runs silently, leaving you logs to parse, you have a green light / red light system with regard to the compliance of a machine with its manifest. If somebody broke the machine, leaving it in an unconfigurable state, or if someone broke its manifest during configuration maintenance, you will soon get a red light and the corresponding notifications.
This is active configuration management rather than what people usually call provisioning.
Of course you need an SSH connection for this execution, and with that you need a hardened SSH config, whitelisting, a dedicated unprivileged user for monitoring, exceptional fine-grained sudo cases, etc. Not rocket science.
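Under Puppet's documented `--detailed-exitcodes` convention (0 = no changes, 2 = changes applied, 4 = failures, 6 = changes plus failures), the green/red mapping can be sketched as a tiny Nagios/Naemon-style check. The function name and messages are made up; only the exit-code meanings come from Puppet's docs:

```python
# Map `puppet agent --test --detailed-exitcodes` return codes to
# Nagios/Naemon plugin states: 0 = OK, 1 = WARNING, 2 = CRITICAL.
PUPPET_NO_CHANGES = 0
PUPPET_CHANGES_APPLIED = 2
PUPPET_FAILURES = 4
PUPPET_CHANGES_AND_FAILURES = 6

def puppet_exit_to_plugin_state(code: int) -> tuple[int, str]:
    if code == PUPPET_NO_CHANGES:
        return 0, "OK - node compliant, no changes needed"
    if code == PUPPET_CHANGES_APPLIED:
        return 1, "WARNING - node had drifted, Puppet corrected it"
    if code in (PUPPET_FAILURES, PUPPET_CHANGES_AND_FAILURES):
        return 2, "CRITICAL - catalog or resource failures"
    return 2, f"CRITICAL - unexpected agent exit code {code}"

print(puppet_exit_to_plugin_state(0)[1])
```

The monitoring system would run the agent over SSH, feed the shell's `$?` into this mapping, and alert on anything non-green.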
Sometimes the job descriptions are boastful in their reference to those technologies, and other times you can detect some level of despair.
The additional benefit is devs can run all the same stuff on a Linux laptop (or Linux VM on some other platform) - and everyone can have their own VM in the cloud if they like to demo or test stuff using all the same setup. Bootstrapping a new system is checking in their ssh key and running a shell script.
Easy to debug, not complex or expensive, and we could vertically scale it all quite a ways before needing to scale horizontally. It's not for everyone, but seed stage and earlier - totally appropriate imo.
If it interests you, both major git hosts (and possibly all of them) have an endpoint to map a username to their already registered ssh keys: https://github.com/mdaniel.keys https://gitlab.com/mdaniel.keys
It's one level of indirection away from "check in a public key" in that the user can rotate their own keys without needing git churn
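For example, refreshing a deploy user's keys from that endpoint; the username comes from the link above, while the destination user is an assumption:

```
# Overwrite (rather than append) so that rotated or removed keys
# drop out on the next refresh.
curl -fsS https://github.com/mdaniel.keys -o /home/deploy/.ssh/authorized_keys
```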
Also, and I recognize this is departing quite a bit from what you were describing, ssh key leases are absolutely awesome because it addresses the offboarding scenario much better than having to reconcile evicting those same keys: https://github.com/hashicorp/vault/blob/v1.12.11/website/con... and while digging up that link I also discovered that Vault will allegedly do single-use passwords, too <https://github.com/hashicorp/vault/blob/v1.12.11/website/con...>, but since I am firmly in the "PasswordLogin no" camp, caveat emptor with that one
Both are Apache 2 and the Flatcar folks are excellent to work with
I've been running my SaaS first on a single server, then after getting product-market fit on several servers. These are bare-metal servers (Hetzner). I have no microservices, I don't deal with Kubernetes, but I do run a distributed database.
These bare-metal servers are incredibly powerful compared to virtual machines offered by cloud providers (I actually measured several years back: https://jan.rychter.com/enblog/cloud-server-cpu-performance-...).
All in all, this approach is ridiculously effective: I don't have to deal with complexity of things like Kubernetes, or with cascading system errors that inevitably happen in complex systems. I save on development time, maintenance, and on my monthly server bills.
The usual mantra is "but how do we scale" — I submit that 1) you don't know yet if you will need to scale, and 2) with those ridiculously powerful computers and reasonable design choices you can get very, very far with just 3-5 servers.
To be clear, I am not advocating that you run your business in your home closet. You still need automation (I use ansible and terraform) to manage your servers.
Did you read the article or just the headline?
Scroll down to the bottom, under the section "A few considerations" and try not to laugh.
"A few considerations" turns out to be a pretty significant chunk of security work ESPECIALLY if you are storing/transmitting highly sensitive information.
How do you handle something like HIPAA compliance when you're in this situation?
There are 2 types of programmers: those that think they've seen everything and those that know they've seen next to nothing. And as such, these absolute takes are tiring.
I'm a dev who hasn't seen anything related to that. Since you bring it up, can you give some pointers on why something like a MySQL db coupled to a monolithic backend isn't good enough? What shortcomings did you experience?
All of the things raised in the article seem possible to solve without the need for microservices.
It's when one starts getting sucked down the "cloud native" wormhole of all these niche open source systems and operators and ambassador and sidecar patterns, etc. that things go wrong. Those are for environments with many independent but interconnecting tech teams with diverse programming language use.
But I think for many "kubernetes" means your second paragraph. It doesn't have to be like that at all! People should try setting up a k3s cluster and just learn about workloads, services and ingresses. That's all you need to replace a bunch of ad hoc VMs and docker stuff.
I wish there were something that did just that, because kube comes with a lot of baggage, and docker-compose is a bit too basic for some important production needs.
I can have something with nice deployments, super easy logs and metrics, and a nice developer experience setup in no time at all.
What I actually got was a half an hour tutorial from the guy who set it up, in which he explained the whole concept (I had no clue) and gave me enough information to deploy a server, which I did with zero problems. I had automatic deployment from `git push` working very quickly.
To me this seemed like a no brainer. Unless you literally have one service this is waaay easier to use.
Granted I didn't have to set it up - maybe that's where the terrible reputation comes from?
Seriously, I think a lot of people do things the hard way to learn large scale infrastructure. Another common reason is 'things will be much easier when we scale to a massive number of clients', or we can dynamically scale up on demand.
These are all valid to the people building this, just not as much to founders or professional CTOs.
Some people seem to have no concern with the needs and timetables of the would be customers but instead burn through cash building fancy nonsense.
It's like going in to a car mechanic for tires and then finding out it took 3 weeks because the guy wanted to put on low rider hydraulics and spinner hubcaps for his personal enrichment.
The worst part is it's inherently ambiguous to the next people. They don't know if the reason something is there is because it's needed or because it's just shiny bling.
I don’t quite get if people do it for interest, for love of the tech, or if they are technocratic and believe in levelling up their skill to get k8s on their CV like you say.
All I think is “this looks painful to manage”!
I run a k8s cluster at home. Part of it yes, is to apply my existing skills and keep them fresh. But part of it is that kubernetes can be easier long term.
I've got magical hard drive storage with Rook Ceph. I can yoink a hard drive out of my servers and nothing happens to my workloads.
I can do maintenance on one of the servers with 0 down time.
All of my config for what I have deployed is in git.
I manage VMs and Kubernetes at work, and I'm not going to pretend that Kubernetes isn't complex, but it's complex up front instead of down the road. VMs run into complexity when things change. I'm sure you can make VMs good, but then why not use something like Kubernetes? You will have to reinvent a lot of the stuff that's already in Kubernetes.
It's a hammer for sure and not everything is a nail, but it can be really powerful and useful even for home labs.
It's a bit like factorio with the extra dopamine hit of getting to unbox stuff.
You don't need k8s for all of that, but there's not a simpler solution than k8s that handles as much.
Life is full of pain. Deal with it.
Having seen some of these half-rolled, first-time-understood k8s deployments, and the multi-year projects to unravel the mess that was created, overflowing with anti-patterns and other incorrect ways of doing things, I think I would prefer a narrower scope of true experienced professionals (or at least some experienced pros that can help guide the ship for their mentees) working on and designing k8s infra.
And for those that don't need it (the vast majority of startups, small businesses, regular-sized businesses, etc), just stick to the easier-to-use paradigms out there.
This is a case where “things will be much easier when we scale to a massive number of clients” turned out to be true.
Should you pick a complex framework from day one? Probably not, unless your team has extensive experience with it.
My objection is towards the idea that managing infrastructure with a bespoke process and custom tooling will always be less effort to maintain than established tooling. It's the idea of stubbornly rejecting the "complexity" bogeyman, even when the process you built yourself is far from simple, and takes a lot of your time from your core product anyway.
Everyone loves the simplicity of copying over a binary to a VPS, and restarting a service. But then you want to solve configuration and secret management, have multiple servers for availability/redundancy so then you want gradual deployments, load balancing, rollbacks, etc. You probably also want some staging environment, so need to easily replicate this workflow. Then your team eventually grows and they find that it's impossible to run a prod-like environment locally. And then, and then...
You're forced to solve each new requirement with your own special approach, instead of relying on standard solutions others have figured out for you. It eventually gets to a question of sunken cost: do you want to abandon all this custom tooling you know and understand, in favor of "complexity" you don't? The difficult thing is that the more you invest in it, the harder it will be to migrate away from it.
My suggestion is: start by following practices that will make your transition to the standard tooling easier later. This means deploying with containers from day 1, adopting the 12-factor methodology, etc. And when you do start to struggle with some feature you need, switch to established tooling sooner rather than later. You're likely to find that your fear of the unknown was unwarranted, and you'll spend less time working on infra in the long run.
One approach that I’ve considered is to start with the standard tooling (k8s + gitops) from day one, but still run it in a single VM. Any thoughts?
If you do want to self-host, k3s could also be an option, like a sibling comment suggested. It's simpler to start with, though it still has a learning curve since it's a lightweight version of k8s. I reckon that you would still want to run at least 3 nodes for redundancy/failover, and maybe a couple more for just DB workloads. But you can certainly start with one to setup your workflow, and then scale out to more nodes as needed.
Unfortunately it's HN so people are more likely to do everything in bash scripts and say a big "fuck you" to all new hires that would have to learn their custom made mess
These are the only things I have ever been comfortable using in the cloud.
Once you get into FaaS and friends, things get really weird for me. I can't handle not having visibility into the machine running my production environment. Debugging through cloud dashboards is a shit experience. I think Microsoft's approach is closest to actually "working", but it's still really awful and I'd never touch it again.
The ideal architecture for me after 10 years is still a single VM with monolithic codebase talking to local instances of SQLite. The advent of NVMe storage has really put a kick into this one too. Backups handled by snapshotting the block storage device. Transactional durability handled by replicating WAL, if need be.
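Alongside block-device snapshots, SQLite's online backup API gives a consistent copy with zero downtime, which fits this single-VM setup; the file names below are placeholders:

```python
import sqlite3

# Live database in WAL mode (as in the setup described above).
src = sqlite3.connect("app.db")
src.execute("PRAGMA journal_mode=WAL")
src.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")
src.execute("INSERT OR REPLACE INTO kv VALUES ('greeting', 'hello')")
src.commit()

# Connection.backup copies pages safely while writers keep going,
# yielding a consistent point-in-time snapshot file.
dst = sqlite3.connect("app-backup.db")
src.backup(dst)

print(dst.execute("SELECT v FROM kv WHERE k = 'greeting'").fetchone()[0])
# -> hello
```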
Dumbass simple. Lets me focus on the business and customer. Because they sure as hell don't care about any of this and wouldn't pay any money for it. All this code & infra is pure downside. You want as little of it as possible.
This is the most expensive way to build cloud services. When people talk about the cloud being more expensive than on-prem this is often the reason why. If you're just going to run VMs 24/7 there are better options.
You may never need to split your monolith! Stripe eventually broke some stuff out of their Rails monolith but it gets you surprisingly far.
You are not going to get easier to debug than a Django/Rails/etc monolith.
A bit of foresight on where you want to go with your infra can help you though; I built the first versions of our company as a Django Docker container running on a single VM. Deploy was a manual "docker pull; docker stop; docker start". This setup got us surprisingly far. Docker is nice here as a way of sidestepping dependency packaging issues, which can be annoying in the early stages (e.g. does my server have the right C header files installed for that new db driver I installed? Setup will be different than on your Mac!)
We eventually moved to k8s after our seed extension in response to a business need for reliability and scalability; k8s served us well all the way through series B . So the setup to have everything Dockerized made that really easy too - but we aggressively minimized complexity in the early stages.
And yet, funnily enough, the book on Monoliths says to break things up into smaller services! It says your data should be stored in its own service (possibly multiple services, if you need multi-paradigm access [e.g. relational, full-text search, etc.]). The user experience should use its own service. And, at very least, you should have another service in between (this is where Django and Rails usually fit). Optionally, it says, you will probably want to have additional services as well (auth, financial transactions, etc.)
VPS technology has come a very long way and is highly reliable. The disks on the node are set up in RAID 1 and the VM itself can be easily live migrated to another machine for node maintenance. You can take snapshots etc.
To me, I would only turn to cloud infra not for greater reliability but more for collaboration and the operational housekeeping features like IAM, secrets management, infra-as-code etc, or for datacenter compliance reasons like HIPAA.
I run a small, bootstrapped startup. We don't have enough money to pay ourselves and I make a living doing consulting on the side. Being budget and time constrained like that I have to be highly selective in what I use.
So, I love things like Google cloud. Our GCP bills are very modest. A few hundred euros per month. I would move to a cheaper provider except I can't really justify the time investment. And I do like Google's UI and tools relative to AWS, which I've used in the past.
I have no use for Kubernetes. Running an empty cluster would be more expensive than our current monthly GCP bills. And since I avoided falling into the micro-services pitfall, I have no need for it either. But I do love Docker. That makes deploying software stupidly easy. Our website is a Google storage bucket that is served via our load balancer and the Google CDN. The same load balancer routes rest calls to two vms that run our monolith. Which talk to a managed DB and managed Elasticsearch and a managed Redis. The DB and Elasticsearch are expensive. But having those managed saves a lot of time and hassle. That just about sums up everything we have. Nice and simple. And not that expensive.
I could move the whole thing to something like Hetzner and cut our bills by 50% or so. Worth doing maybe but not super urgent for me. Losing those managed services would make my life harder. I might have to go back to AWS at some point because some of our customers seem to prefer that. So, there is that as well.
Boring, but useful.
I've not seen it with Go because I haven't worked with Go in a production capacity; but I've seen C# handle thousands of RPS per node.
Thankfully. ;)
I have had great success with a very simple kube deployment:
- GKE (EKS works well but requires adding an autoscaler tool)
- Grafana + Loki + Prometheus for logs + metrics
- cert-manager for SSL
- nginx-ingress for routing
- external-dns for autosetup DNS
I manage these with helm. I might, one day, get around to using the Prometheus Operator thing, but it doesn't seem to do anything for me except add a layer of hassle.
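Bootstrapping roughly that stack is a handful of helm commands; the chart repo URLs are the upstream defaults as of this writing, so verify them before use:

```
helm repo add jetstack https://charts.jetstack.io
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update

helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager --create-namespace --set installCRDs=true
helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx --create-namespace
```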
New deployments of my software roll out nicely. If I need to scale, cut a branch for testing, I roll into a new namespace easily, with TLS autosetup, DNS autosetup, logging to GCP bucket... no problem.
I've done the "roll out an easy node and run" thing before, and I regret it, badly, because the back half of the project was wrangling all these stupid little operational things that are a helm install away on k8s.
So if you're doing a startup: roll out a nice simple k8s deployment, don't muck it up with controllers, operators, service meshes, auto cicds, gitops, etc. *KISS*.
If you're trying to spin a number of small products: just use the same cluster with different DNS.
(note: if this seems particularly appealing to you, reach out, I'm happy to talk. This is a very straightforward toolset that has lasted me years and years, and I don't anticipate having to change it much for a while)
One big advantage of the operator is that its custom resources are practically kind of standard by now. This means helm charts for a lot of software ship those and integrating that piece of software into your monitoring is a matter of setting a few flags to true. The go to solution for a k8s monitoring setup is https://github.com/prometheus-community/helm-charts/tree/mai...
But yeah, if I only wanted a thing, Ebs works.
rsyslog + knowing what the fuck you are doing is much better.
But still, no matter what, the odd customer demands they need all these complexities turned on for no discernible reason.
IMO it’s a far better approach with any platform to deploy the minimum and turn things on if you need to as you develop.
Incidentally, I’ve been exposed to “traditional” cloud platforms (Azure, GCP, AWS) through work and tried a few times to use them for personal projects in recent years and get bewildered by the number of toggles in the interface and strange (to me) paradigms. I recently tried Cloudflare Workers as a test of an idea and was surprised how simple it was.
I thought the same thing until recently. Apparently there's a "Docker Swarm version 2" around, and it was the original (version 1) Docker Swarm that was deprecated:
https://docs.docker.com/engine/swarm/
> Do not confuse Docker Swarm mode with Docker Classic Swarm which is no longer actively developed.

Haven't personally tried out the version 2 Docker Swarm yet, but it might be worth a look at. :)

> Docker secrets are only available to swarm services, not to standalone containers. To use this feature, *consider adapting your container to run as a service. Stateful containers can typically run with a scale of 1 without changing the container code.*
(Emphasis mine. From https://docs.docker.com/engine/swarm/secrets/ )
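Per the quoted docs, a stateful container can get secrets by running as a scale-1 service; the names and the secret value here are purely illustrative:

```
docker swarm init   # one-time, even on a single node

# Create the secret and attach it to a one-replica service; inside the
# container it appears as a file at /run/secrets/db_password.
printf 'hunter2' | docker secret create db_password -
docker service create --name app --replicas 1 \
  --secret db_password myimage:latest
```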
I was brought in to help get a full system rewrite across the finish line. Of course the deployment story was pretty great! Lots of automated scripts to get systems running nicely, autoscaling, even a nice CI builder. The works.
After joining, I found out all of this was to the detriment of so much. Nobody was running the full frontend/backend on their machine. There was a team of 5 people but something like 10-15 services. CI was just busted when I joined, and people were constantly merging in things that broke the few tests that were present.
The killer was that because of this sort of division of labor, there'd be constant buck-passing because somebody wasn't "the person" who worked on the other service. But in an alternate universe all of that would be in the same repo. Instead, everything ended up coordinated across three engineers.
A shame, because the operational story that let me really easily swap in a pod for my own machine in the test environment was cool! But the brittleness of the overall system was too much for me. Small teams really shouldn't have fiefdoms.
Puff! Talk about microservices! Or is it macropeople?! :-)
Since it's B2B we don't need zero downtime, updates at midnight are all right.
A day before rollout they go through the staging server and the test environment, so no surprises the next morning.
Before updates, the backups kick in, so if we need to recover from a bad update we can roll back.
Sounds all 2000s and not very fancy, but boring and profitable works for us.
My CI/CD does a system test because everything is in containers. I can do full e2e tests and automatic rollouts without downtime.
What I can do, everyone else can do when I'm on holiday.
How fast are you back if your server burns down tomorrow? How often have you tested that?
Are your devs waiting regularly on things?
Huh. I never said to roll one's own [hardware] infrastructure, although even that makes sense if you have a GPU cluster.
1. It took the end of ZIRP era for people to realize the undue complexity of many fancy tools/frameworks. The shitshow would have continued unabated as long as cheap money was in circulation.
2. Most seasoned engineers know for a fact that any abstractions around basic blocks like compute, storage, memory and network come with their own leaky parts. And that knowledge and wisdom helps them make suitable trade-offs. Those who don't grok them shoot themselves in the foot.
Anecdote on this. A small-sized startup doing B2B SaaS was initially running all their workloads on cheap VPSs, incurring a monthly bill of around $8K. The team of 4 engineers that managed the infrastructure cost about $10K per month. Total cost: $18K. They made a move to the 'cloud native' scene to minimize costs. While the infra costs did come down to about $6K per month, the team needed a new bunch of experts who added about another $5K to the team cost, making the total monthly cost $21K ($6K + $10K + $5K). That plus a dent to developer velocity and release velocity, along with long windows of uncertainty when debugging complex stuff and challenges. The original team quit after incurring extreme fatigue, and the team cost alone has now gone up to about $18K per month. All in all, a net loss plus undue burden.
Engineers must be tuned towards understanding the total cost of ownership over a longer period of time in relation to the real dollar value achieved. Unfortunately, that's not a quality quite commonly seen among tech-savvy engineers.
Being tech-savvy is good. Being value-savvy is way better.
On AWS, Fargate containers are way more expensive than VMs, and non-Fargate containers are kind of pointless, as you have to pay for the VMs where they run anyway. Also, auto-scaling the containers without making a mess is not trivial. Thus, I'm curious. Perhaps it's Lambda? That's a different can of worms.
I'm honestly curious.
As said, most of their workloads were on cheap VPSs before. Moved some to 'scale-to-zero' solutions, reduced the bloat in VMs, fixed some buggy IaC, also moved some stuff to the serverless scene. That got a decent ~20% reduction.
You can even run the whole thing locally.
We actually just did a Show HN about it:
Do startups really need complex cloud architecture?
Inspired, I wrote a blog exploring simpler approaches and created a docker-compose template for deployment
Curious to know your thoughts on how you manage your infrastructure. How do you simplify it? How do you balance?
Rather, I have decided to opt for Supabase instead. Over the long term it may well cause issues for my startup - but even more realistically, my startup is going to fail, and the increased developer velocity from using simple tooling like this will let me figure out why my idea doesn't work in a shorter amount of time, so I can go on to my next pursuit.
To be honest I think even using docker is overengineering.
What I quite like about your repo:
- there is a separate API and background job instance
- there is a separate web image, to not always couple front end deployments to back end
- there are specialized data stores like Redis (or maybe RabbitMQ or MinIO in a different type of project)
- Dozzle seems nice https://dozzle.dev/ (I use Portainer mostly, but seems useful)
What I think works quite nicely in general:
- starting out with a monolithic back end but making it modular with feature flags (e.g. FEATURE_REPORTS, FEATURE_EMAILS, FEATURE_API), so that you can deploy vastly different types of workloads in separate containers BUT not duplicate your data model and don't need to extract shared code libraries (yet), and if you ever need to split the codebase into multiple separate ones, then it won't be *too* hard to do that
- having a clear API (RESTful or otherwise) as the contract between a separate back end and front end deployment, so that even if your SPA technology gets deprecated (AngularJS, anyone?) then you can migrate to something, unlike when doing SSR and everything being coupled
- the same applies to NOT having the same container build process have both the front end and back end build (I've seen a Java project install a specific Node version through Maven and then the build dragging on cause Maven ends up processing thousands of files as a part of the build)
- using the right tool for the job: many might create full text search, key-value storage, message queues, JSON document storage, even blob storage all with PostgreSQL and that might be okay; others will go for separate instances of ElasticSearch, Redis, RabbitMQ, something S3 compatible and so on, probably a tradeoff between using well known libraries and tools vs building everything yourself against a single DB instance
- in my experience, many projects out there are served perfectly fine by a single server so Docker Compose feels like the logical tool to start out with, if multiple instances indeed become necessary, there is always Docker Swarm (yes, still works, very simple), Hashicorp Nomad or K3s or one of the other more manageable Kubernetes distros
- self-hosted (or self-hostable) software in general is pretty cool and gives you a bunch of freedom, though using managed cloud services will also be pleasant for many, more expensive upfront but less so in regards to your own time spent managing the stack; the former also lends itself nicely to being able to launch a local dev environment with the full stack, which feels like a superpower (being able to really test out breaking migrations, look at what happens with the whole stack etc.)
- having some APM and tracing is nice, something like Apache Skywalking was pretty simple to setup, though there are more advanced options out there (e.g. cloud version of Sentry, because good luck running that locally)
- having some uptime monitoring is also very nice, something like Uptime Kuma is just very pleasant to use
- heck, if you really wanted to, you could even self-host a mail server: https://github.com/docker-mailserver/docker-mailserver (though that can be viewed as a hobbyist thing), or have MailCatcher / Inbucket or something for development locally
Focus on product market fit (PMF) and keep things as straightforward as possible.
Create a monolith, duplicate code, use a single RDBMS, adopt proven tech instead of the “hot new framework”, etc.
The simpler the code, the easier it is to migrate/scale later on.
Unnecessary complexity is the epitome of solving a problem that doesn’t exist.
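The feature-flag modular monolith suggested upthread can be sketched in a few lines of Python. This is a hypothetical sketch: the flag names and the "workloads" are stand-ins for whatever your app actually mounts or spawns.

```python
import os

# Hypothetical flag names; in this setup every container runs the same
# monolith image, just with a different set of FEATURE_* variables enabled.
def enabled(flag: str) -> bool:
    return os.environ.get(flag, "").lower() in ("1", "true", "yes")

def start_workloads() -> list:
    """Return which workloads this instance would start."""
    started = []
    if enabled("FEATURE_API"):
        started.append("api")        # e.g. mount the HTTP routes
    if enabled("FEATURE_REPORTS"):
        started.append("reports")    # e.g. spawn the report scheduler
    if enabled("FEATURE_EMAILS"):
        started.append("emails")     # e.g. start the mail-queue consumer
    return started

# One deployment might set only FEATURE_API=1, another FEATURE_REPORTS
# and FEATURE_EMAILS -- same codebase, same data model, different workloads.
os.environ["FEATURE_API"] = "1"
os.environ["FEATURE_REPORTS"] = "true"
print(start_workloads())  # ['api', 'reports']
```

The nice part is that splitting the codebase later mostly means deleting branches of this dispatch, not untangling shared libraries.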
Over-time this “clean” abstraction adopts a bunch of optional parameters based on the upstream API routes, leaving you with an omni-function that is more convoluted, and thus harder to change, than if the API routes weren’t overly optimized from the get-go.
As a personal rule, I’ll let myself copy something 3 times before taking a step back and figuring out a “better” way.
Low operational costs are essential for a hardware business if you don't want to burden your customers with an ongoing subscription fee. Otherwise the business turns into some kind of pyramid scheme where you have to sell more and more units in order to keep serving your existing customers.
I have a moral obligation towards my customers to keep running even if the sales stop at some point.
So I always multiply my cost for anything with 10 years, and then decide if I am willing to bear it. If not, then i find another solution.
I sometimes wonder how many of these post boil down to "I don't want to learn k8s can I just use this thing I already know?".
My team of 6 engineers has a social app at around 1,000 DAU. The previous stack has several machines serving APIs and several machines handling different background tasks. Our tech lead is forcing everyone to move to separate Lambdas using CDK to handle each of these tasks. The debugging, deployment, and architecting of shared stacks for Lambdas is taking a toll on me -- all in the name of separation of concerns. How (or should) I push back on this?
Why did the tech lead decide to move everything to lambda when you only have 1k DAU? Can they be reasoned with or is it lambda or the highway?
You can pull out the stats and do a comparison: note the wasted time, and how it's not beneficial but rather detrimental. Note how long it now takes to debug such a small codebase, then extrapolate that out.
Having tons of lambdas is a massive pain in terms of debugging. CloudWatch is not that great for debugging, and the better debug tooling tends to be rather expensive (like Datadog), so not too much is invested. Or it's too resource-intensive to set up OpenTelemetry.
But an application built in the high pressure environment of a startup also has the risk of becoming unmanageable, one or two years in. And to the extent you already have familiar tools to manage this complexity, I vote for using them. If you can divide and conquer your application complexity into a few different services, and you are already experienced in an appropriate application framework, that may not be such a bad choice. It helps focus on just one part of the application, and have multiple people work on the separate parts without stepping on each other.
I personally don't think that should include k8s. But ECS/Fargate with a simple build pipeline, all for that. "Complex" is the operative word in the article's title.
And at that point you've assembled a stack just as complex as doing it all inside a single k8s cluster.
Also, if you're fair... not all those AWS acronyms you're listing would be displaced by the single k8s cluster. (Maybe you weren't arguing to swap out complexity, rather that the complexity floodgates were open already anyway?)
For new projects that, with luck, will have a couple hundred users at the beginning, it is just overkill (and also very expensive).
My approach is usually Vercel + some AWS/Hetzner instance running the services with docker-compose inside or sometimes even just a system service that starts with the instance. That's just enough. I like to use Vercel when deploying web apps because it is free for this scale and also saves me time with continuous deployment without having to ssh into the instances, fetch the new code and restart the service.
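A minimal sketch of that docker-compose-on-one-instance setup (image names, port, password and volume are placeholders):

```yaml
services:
  api:
    image: registry.example.com/myapp:latest   # your app image
    restart: unless-stopped
    env_file: .env
    ports:
      - "127.0.0.1:8080:8080"   # bound locally; a reverse proxy terminates TLS
    depends_on:
      - db
  db:
    image: postgres:16
    restart: unless-stopped
    environment:
      POSTGRES_PASSWORD: change-me
    volumes:
      - pgdata:/var/lib/postgresql/data

volumes:
  pgdata:
```

`docker compose up -d` on the instance is then the whole deployment story, and `restart: unless-stopped` covers reboots.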
So, you had technical scalability in one system but if the customer base grew quickly every other bottleneck would be revealed.
There is more to business operations than technology, it seems.
All businesses need to think about scalability, regardless of their size. If you're a startup, you likely want to be frugal with your infra costs, while still having the ability to quickly scale up when you need it. Those "simple" approaches everyone loves to suggest have no way of doing this.
Like all things, there's a good middle ground here -- use managed services where you can but don't over-architect features like availability & scaling. For example, Kubernetes is a heavy abstraction; make sure it's worth it. A lot of these solutions also increase dev cycles, which is not great early on.
Yes. This is the basis of privilege separation and differential rollouts. If you collapse all this down into a single server or even lambda you lose that. Once your service sees load you will want this badly.
> SQS and various background jobs backed by Lambda
Yes. This is the basis of serverless. The failure of one server is no longer a material concern to your operation. Well done you.
> Logs scattered across CloudWatch
Okay. I can't lie. CloudWatch is dogturds. There is no part of the service that is redeemable. I created a DynamoDB table and a library which puts log lines, collected into "task records", into the table, partitioned by lambda name and sorted by record time. Each lambda can configure the logging environment or use defaults, which include a log-entry expiration time. Then I created a command-line utility which can query and/or "tail" this table.
This work took me 3 days. It's paid off 1000-fold since I did it. You do sometimes have to roll your own out here. CloudWatch is strictly for logging cold start times now.
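The key scheme described above can be sketched as a small pure function. This is an assumption-laden reconstruction, not the commenter's actual library: the attribute names, sort-key format, and the 1-day default TTL are all illustrative.

```python
import time

# Default expiry matching the "TTL is default 1 day" mentioned downthread.
DEFAULT_TTL_SECONDS = 24 * 60 * 60

def log_record(lambda_name, message, now=None, ttl=DEFAULT_TTL_SECONDS):
    """Build one DynamoDB item: partitioned by lambda name, sorted by time."""
    now = time.time() if now is None else now
    return {
        "pk": lambda_name,             # partition key: the lambda's name
        "sk": f"{now:017.6f}",         # sort key: zero-padded epoch seconds,
                                       # so lexicographic order == time order
        "message": message,
        "expires_at": int(now) + ttl,  # DynamoDB TTL attribute
    }

# A "tail" is then a Query on pk with sk > (now - window), newest last --
# with boto3, roughly table.query(KeyConditionExpression=Key("pk").eq(name)
# & Key("sk").gt(cutoff)).
rec = log_record("billing-worker", "invoice 42 sent", now=1700000000.0)
print(rec["sk"])  # 1700000000.000000
```

Fixed-width zero-padding matters: without it, string-sorted epoch times of different lengths would interleave incorrectly.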
> Could this have been simplified to a single NodeJS container or Python Flask/FastAPI app with Redis for background tasks? Absolutely.
Could this have been simplified into something far more fragile than what is described? Absolutely. Why you'd want this is entirely beyond me.
Elsewhere in thread you say:
> The event volume is not particularly large as we tend to process things in batch and rarely on the edge of an event.
So the service is not actually under load, and it runs in batches so (temporary) failure is not actually a concern.
> This work took me 3 days. It's paid off 1000x fold since I did it.
Since Lambda was introduced less than 10 years ago, what you're saying here is that it would have been a full-time job for you for the past 10 years to maintain this (3000 days instead of three) if you had not gone the serverless way, which I find doubtful.
> Could this have been simplified into something far more fragile than what is described? Absolutely.
Considering the hyperboles in the rest of your comment, this sounds more like snark than a considered opinion.
Your DynamoDB solution isn't foolproof. It has throughput limited by the partition granularity -> in your case the lambda name. It's also relatively expensive and fairly slow to query in bulk (DDB is designed for OLTP).
I don't have direct experience here, but I expect slapping Grafana on top of any disk-backed source is likely to be cheaper, faster, and have better ergonomics. Once your logging is too much for a disk to handle (this will be later than you would've outgrown DDB, but before you would've outgrown CloudWatch) then you can bring something fancy in.
The event volume is not particularly large as we tend to process things in batch and rarely on the edge of an event. I also wouldn't, for example, log API requests using this mechanism. We're nowhere near this being an issue as 20-30 lambdas is not a particular problem for us. Choose a good naming convention and build your own deployment infrastructure and it's no sweat.
> relatively expensive
Large object compression and/or offload to s3 is baked into our dynamodb interface library. Not that this matters as almost all log records end up being less than 4kb anyways.
> slow to query in bulk
Which is why time is part of the key. You're not often looking back more than an hour. There's bulk export back onto campus servers if you wanted that anyways. TTL is default 1 day. Running a "tail" is absurdly cheap, much cheaper than CloudWatch's laughable rate for their similar feature, a miss is 1/2 a read unit, and a hit is almost never more than 2.
> slapping grafana
I didn't need "observability." I need current state and recent deltas. This is particularly true when any changes are made. Otherwise my logs are pure annoyance and don't generally provide value. We optimized for the exceptionally narrow case we felt the cloud underserved in and left it at that.
What relation does any of those have with load?
(And also, why are people so keen on doing privilege separation by giving full privilege to a 3rd party and asking it to limit what each piece of code can do?)
Downside is it's a one-to-one system. But I just use downsized servers.
Bare minimum, script out the install of your product on a fresh EC2 instance from a stock (and up-to-date) base image, and use that for every new deploy.
We run Spacelift workers with Auto Scaling Groups and pick up their new image ~monthly with zero hassle since everything is automated.
Raw EC2 is just part of the story...
Edit to add: I also recommend using Amazon Linux unless you _have_ to have RHEL / Cent / Rocky or Ubuntu. Just lean into the ecosystem and you can get so many great features (and yes, I ACK the vendor lock-in with this advice). A really cool feature is the ability to just flip on various AWS services like the systems manager session manager and get SSH without opening ports a-la wireguard.
obviously, it's not cloud-native... but if you are using AWS EC2 it works
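The "script the install from a stock image" advice above can be as small as an EC2 user-data script. A hedged sketch, assuming Amazon Linux 2023 (hence `dnf`) and a containerized app; the registry and image name are placeholders:

```shell
#!/bin/bash
# EC2 user-data: runs once on first boot of a fresh instance
set -euo pipefail

dnf -y update                      # start from a fully patched base
dnf -y install docker
systemctl enable --now docker

# pull and run the product; --restart covers instance reboots
docker run -d --restart unless-stopped \
  --name app registry.example.com/myapp:latest
```

Because the instance is fully described by its base AMI plus this script, "deploy" becomes "launch a new instance and terminate the old one".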
Scaling (and relatedly, high availability) are premature optimizations[0] implemented (and authorized) by people hoping for that sweet hockey stick growth, cargo culting practices needed by companies several orders of magnitude larger.
[0] https://blog.senko.net/high-availability-is-premature-optimi...
I recently attempted to move to a completely static site (just plain HTML/CSS/JS) on Cloudflare Pages, that was previously on a cheap shared webhost.
Getting security headers set up, forcing SSL and www, as well as HSTS, has been a nightmare (and it's still not working).
When on my shared host, this was like 10 lines of config in an .htaccess file before.
Let’s say I’ve got a golang binary locally on my machine, or as an output of github actions.
With Google Cloud Run/Fargate/DigitalOcean I can click about 5 buttons, push a docker image and I’m done, with auto updates, roll backs, logging access from my phone, all straight out of the box, for about $30/mo.
My understanding with Hetzner and co is that I need to SSH in (now I need to keep SSH keys secure and manage access to them) for updates, logs, etc. I need to handle draining connections from the old app to the new one. I need to either manage HTTPS in my app, or run behind a reverse proxy that does TLS termination, which I need to manage the SSL certs for myself. This is all stuff that gets in the way of the fact that I just want to write my services and be done with it. Azure will literally install a GitHub Actions workflow that will autodeploy to Azure Container Apps for you, with scoped credentials.
Depends on your definition of simplest. In terms of set-up, probably something like https://dokku.com/ . It's a simple self-hosted version of Heroku; you can be up and running in literally minutes, and because it's compatible with Heroku you can re-use lots of GitHub Action / other build scripts.
In terms of simple (low complexity and small-sized components): just install Caddy as your reverse proxy, which will do SSL certs and reverse proxying for you with extremely little, if any, config. Then just have your GitHub Action push your containers there using whatever container set-up you prefer. This is usually a simple script in your build process like "build container -> push container to registry -> tell machine to get new image and run it", or even simpler, just have your server check for updated images routinely if you don't want to handle communication between the build script and the server. That's the bare minimum needed. This takes a bit longer than a few minutes, but you can still be done within an hour or two.
Regardless of your choice it shouldn't take more than 1 working day, and will save you a lot of money compared to the big cloud providers. You can run as low as €4.51/month with hetzner and that includes a static IP and basically unlimited traffic. An EC2 instance with the same hardware costs about $23 a month for comparison (yes shared vs dedicated vCPU, but even the dedicated offer at hetzner is cheaper, and this is compared to a serverless set-up where loads are spikey, which is exactly how we can benefit from a shared vCPU situation).
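The Caddy part of that setup really is tiny. A Caddyfile like this (domain and upstream port are placeholders) gets you automatic HTTPS certificates and reverse proxying:

```
app.example.com {
    # Caddy obtains and renews the TLS cert for this domain automatically
    reverse_proxy localhost:8080
}
```

That's the entire TLS-termination story; your container just listens on plain HTTP on localhost.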
If you don't like ssh you can have a gitlab runner on your VM which will redeploy your stuff on git push / git tag / whatever you want
This will most certainly cost you more than $30, but you can do it.
It does automatic certificates.
Everyone is building like they are the next Facebook or Google. To be honest, if you get to that point, you will have the money to rebuild the environment. But a startup should go with simple. I miss the days when Rails was king, for just this reason.
The added complexity is overkill. Just keep it simple. Simple to deploy, simple to maintain, simple to test, etc. Sounds silly, but in the long run, it works.
In my time at my current job we've scaled PHP MySQL and Redis from a couple hundred active users to several hundred-thousand concurrent users.
EC2+ELB, RDS (Aurora, Elasticache). Shell script to build a release tarball. Shell script to deploy it. Everyone goes home on time. In my 12+ years I've only had to work off hours twice.
People really love adding needless complexity in my experience.
No, people love thinking their experience is the same as everyone else's.
Have you ever worked in healthcare? Do you have any idea what sort of requirements there are for storing sensitive information?
> In my 12+ years I've only had to work off hours twice.
Well, that settles it. Then no one on the planet should need cloud infra if you didn't.
And please, please don't tell me you've spent the last 12 years at the same place and have the gall to extend that to all software development.
I like the cloud but it is overused and misused a lot imo.
Postgres for everything including queuing
Golang or nodejs/TypeScript for the web server
Raw SQL to talk to Postgres
Caddy as web server with automatic https certificates
- No docker.
- No k8s.
- No rabbitmq.
- No redis
- No cloud functions or lambdas.
- No ORM.
- No Rails slowing things down.
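The "Postgres for everything including queuing" part usually leans on `FOR UPDATE SKIP LOCKED`. A minimal sketch, with illustrative table and column names (not from the comment above):

```sql
-- one jobs table is the whole queue
CREATE TABLE jobs (
  id      bigserial PRIMARY KEY,
  payload jsonb   NOT NULL,
  done    boolean NOT NULL DEFAULT false
);

-- each worker claims one job inside a transaction; SKIP LOCKED makes
-- concurrent workers pass over rows already locked instead of blocking
BEGIN;
SELECT id, payload
FROM jobs
WHERE NOT done
ORDER BY id
LIMIT 1
FOR UPDATE SKIP LOCKED;
-- ...process the claimed job, then mark it done and commit:
UPDATE jobs SET done = true WHERE id = /* claimed id */ 1;
COMMIT;
```

Because the claim and the completion happen in one transaction, a crashed worker's lock is released on rollback and the job becomes claimable again, with no broker to operate.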
Your little startup will become large, and fast.
That hacked together single server is going to bite you way sooner than you think, and the next thing you know you’ll be wasting engineer hours migrating to something else.
Me personally, I’d rather just get it right the first time. And to be honest, all the cloud services out there have turned a complex cloud infrastructure into a quick and easy managed service or two.
E.g., why am I managing a single VPS server when I can manage zero servers with Fargate and spend a few extra bucks per month?
A single server with some basic stuff is great for micro-SaaS or small business type of stuff where frugality is very important. But if we shift the conversations to startups, things change fast.
Part of the reason they weren’t successful was because my managers insist on starting with microservices.
Starting with microservices prevents teams from finding product-market fit that would justify microservices.
What would be the simple yet robust infra for data eng? Not thought a lot about it for now, so I am curious if some of you have would have any insights.
In the past years I was solving a data pipeline mess on a project which also had a devops AWS mess. First thing I was told was "what we need is a data lake".
Decisions are sticky so take context into account.
- A lot of companies and startups can get by with a few modest sized VPSs for their applications
- Cloud providers and other infrastructure managed services can provide a lot value that justifies paying for them.
Want to run bare metal? OK, guess you're running your databases on bare metal. Do you have the DBA skills to do so? I would wager that an astounding number of founders who find themselves intrigued by the low cost of bare metal do not, in fact, have the necessary DBA skills. They just roll the dice on yet another risk in an already highly risky venture.
Faulty input killing your logic - I saw this plenty, would Lambda really help here?
[0] https://kamal-deploy.org [1] https://kamalmanual.com/handbook
We’ve tried deploying services on K8s, Lambda/Cloud Run, but in the end, the complexity just didn’t make sense.
I’m sure we could get better performance running our own Compute/EC2 instances, but then we need to manage that.
In reality, there is a strong bias in favor of complex cloud infrastructure:
"We are a modern, native cloud company"
"More people mean (startup/manager/...) is more important"
"Needing an architect for the cloud first CRUD app means higher bills for customers"
"Resume driven development"
"Hype driven development"
... in a real sense, nearly everyone involved benefits from complex cloud infrastructure, where from a technical POV MySQL and PHP/Python/Ruby/Java are the correct choice.
One of the many reasons more senior developers who care for their craft burn out in this field.
One domain, an idea, an easy-to-use development stack for a bootstrapped as well as funded startup is more than good enough to locate product-market fit.
Always remember this quote by Reid Hoffman: "If you are not embarrassed by the first version of your product, you've launched too late."
It's simple and can scale to complex if you want. I've had very good experience with it in medium size TS monorepos.
[0]: https://sst.dev
From what I understand, he employs a dedicated system administrator to manage his fleet of VPSs (updates, security and other issues that arise) for thousands of USD per month.
Costs in my case is not the highest priority: I can spend a month learning the ins and outs of a new tool, or can spend a few days learning the basics and host a managed version on a cloud provider. The cloud costs for applications at my scale are basically nothing compared to developer costs and time. In combination with LLMs who know a lot about the APIs of the large cloud providers, this allows me to focus on building a product instead of maintenance.
Operating a bunch of simple low-level infrastructure yourself is not simpler than buying the capabilities off the shelf.
I'd say it is more like: using a trolley to move some stuff across the street is simpler than using a fleet of drones.
So it's a risky thing to brag about right now.
Check:
Man the infrastructure was absolutely massive and so much development effort went into it.
They should have had a single server, a backup server, Caddy, Postgres and nodejs/typescript, and used their development effort getting the application written instead of futzing with AWS ad infinitum and burning money.
But that's the way it is these days - startup founders raise money, find someone to build it and that someone always goes hard on the full AWS shebang and before you know it you spend most of your time programming the machine and not the application and the damn thing has become so complex it takes months to work out what the heck is going on inside the layers of accounts and IAM and policies and hundreds of lambda functions and weird crap.
I built out a POC and was running it on bare metal for serious workloads under my desk at GE (12-factor). Management practically scrambled to get me cloud access. My setup was ephemeral and could be easily reproduced anywhere. The software was easily deployed on, or integrated with, cloud services. I just shrugged.
I didn't care where my code ran, to them it was some epic priority to get it in the cloud and generate extra expenses.
It’s really brilliant. Sun would have been the one to buy Oracle if they’d figured out how to monetize FactorySingletonFactoryBean by charging by the compute hour and byte transferred for each module of that. That’s what cloud has figured out, and it’s easy to get developers to cargo cult complexity.
And S3. S3 is just a wonderful beast. It's just so hard to get something that is so cheap yet offers practically unlimited bandwidth and worry-free durability. I'd venture to say that it's so successful that we don't have an open-source alternative that matches S3. By that I specifically mean that no open-source solution can truly take advantage of scale: adding a machine will make the entire system more performant, more resilient, more reliable, and cheaper per unit cost. HDFS can't do that because of its limitation on name nodes. Ceph can't do that because of its bottleneck on managing OSD metadata and RGW indices. MinIO can't do that because their hash-based data placement simply can't scale indefinitely, let alone that ListObjects and GetObjects will have to poll all the servers. StorJ can't do that because their satellite and metadata servers are still the bottleneck, and the list can go on.
Of course it highly depends on the skills of the team. In a startup there may be no time to learn how to do infrastructure well. But having an infrastructure expert on the team can significantly improve the time to market and reduce developer burnout and the tech-debt growth rate.
You shouldn't need to assemble a plane when your startup's journey can be expected to only last a few kilometers and you really only need to carry a few boxes.
The result: it's still up after 5 years. I never looked back after I created the project. I do remember the endless other projects I did that have simply died now because I don't have time to maintain a server. And a server almost always ends up crashing somehow.
Another thing, Pieter Levels has successful small apps that relies more on centralized audiences than infrastructure. He makes cool money but it's nowhere near startup-expected levels of money/cash/valuations. He is successful in the indie game but it'll be a mistake to extrapolate that to the VC/Silicon Valley startup game.
1. New technology is bad and has no merit other than for resume.
2. Use old technology that I am comfortable with.
3. Insist that everyone should use old technology.
Really? EC2 instances are waaay overpriced. If you need a specific machine for a relatively short time, sure, you can pick one from the vast choice of available configurations, but if you need one for long-running workloads, you'll be much better off picking one up from Hetzner, by an order of magnitude.
For one of the many examples, see this 5-year old summary (even more true today) by a CEO of a hardware startup:
https://jan.rychter.com/enblog/cloud-server-cpu-performance-...
Most of crap hitting servers is old exploits targeting popular CMS.
WAF is useful if you have to filter out traffic and you don’t know what might be exposed on your infra. Like that Wordpress blog that marketing set up 3 years ago and stopped adding posts and no one ever updated it.
vulnerability scanning of your images.
Fargate
RDS
Yeah, that won't scale to a million QPS, or even 10 QPS. But way more businesses fail because they never achieve 100 Queries Per Day, instead of failing because they fell over at 10 or 1,000 or 1,000,000 QPS.
I mean, hell, Twitter (back in the day) was famous for The Fail Whale.
Getting enough traffic is harder and more important than your "web scale architecture" for your startup. Making actual cash money off your traffic is harder and more important than your "web scale architecture" (ideally by selling them something they want, but making cash money through advertising or by impressing VCs with stories of growth and future value counts too).
There is precisely _zero_ chance that if you ever get within 2 or 3 orders of magnitude of "a million QPS" - that the code you and your cofounder wrote won't have been completely thrown away and rewritten by the 20 or 100 person engineering department that is now supporting your "1000 QPS" business.
I need it on my resume for every 2 year stint and 2-3 people on the team to vouch for it
You’re saying “hey, let everyone know you worked on a tiny company’s low traffic product and how about you just don’t make half a million a year," all to save the company I work at a little money?
Until companies start interviewing for that, it's a dumb idea. I'm rarely making green field projects anywhere, and other devs are also looking for maintainers of complex infrastructure.
99.99% of the time. No.
For many of the "complex" things like lambdas, there are frameworks like Serverless that make managing and deploying them as easy as (if not easier than, frankly) static code on a VM.
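To make that concrete, here's a minimal sketch of the kind of `serverless.yml` the parent is probably describing; the service name, handler path, and region are made-up placeholders, not anything from the thread:

```yaml
# Hypothetical minimal Serverless Framework config: one HTTP-triggered
# Lambda on AWS. Deploying is a single `serverless deploy`.
service: hello-api

provider:
  name: aws
  runtime: python3.12
  region: eu-central-1

functions:
  hello:
    handler: handler.hello      # handler.py must define hello(event, context)
    events:
      - httpApi:
          path: /hello
          method: get
```

The framework generates the CloudFormation, IAM roles, and API Gateway wiring behind this, which is the "as easy as a VM" part of the claim.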
Not every workload scales predictably either; we have seen new things that got very successful and crashed right out of the gate because they could not properly scale up.
I agree that you don't need an over engineered "perfect" infrastructure, but just saying stick it on a VM also seems like it is too far of a swing in the other direction.
That also ignores the cost side of running several VMs vs the cost of smaller containers or lambdas that only run when there is actual use.
Plus there is something to be said about easier local development which some things like Serverless and containers give you.
You may not need to set up a full k8s cluster, but if you are going with containers, why would you run your own servers vs sticking the container in something managed like ECS?
> But here's the truth: not every project needs Kubernetes, complex distributed systems, or auto-scaling from day one. Simple infrastructure can often suffice,
When the hell can we be done with these self-compromising losers? Holy shit! Enough! It doesn't save you anything, doing less. People fucking flock to not-Kubernetes because they can't hack it, because they suck, because they would prefer growing their own far worse, far more unruly monster. A monster no one will ever criticize in public because it'll be some bespoke, frivolous, home-grown alt-stack no one will bother to write a single paragraph on, which no one joining will grok, understand, or enjoy.
It's just so dumb. There's all these fools trying to say, oh my gosh, the emperor has no clothes! Oh my gosh! It might not be needed! But the alternative is really running naked through the woods yourself, inventing entirely novel, unpracticed & probably vastly worse means for yourself. I don't know why we keep entertaining & giving positions of privilege to such shit-throwing, pointless "you might not need it" scum-sucking shits trying to ruin things like so, but never ever do they have positive plans, and never ever do they acknowledge that what they are advocating is to take TNT to what everyone else is trying to practice, is collaborating on. Going it alone & DIY'ing your own novel "you might not need it" stack is fucking stupid, & these people don't have the self-respect to face up to the tall dissent they're calling for. You'd have to be a fool to think you are winning by DIY'ing "less". Fucking travesty.
Amen.
> I don't know why we keep entertaining & giving positions of privilege to such shit throwing pointless "you might not need it" scum sucking shits t
Amen.