As a software consultant myself, I'd probably stop the conversation right there and ask why they are building such a robust distributed system — SQS, SNS, etc — without any customers. Still want to be deployed in AWS? Toss the damn app on a single EC2 instance...
Kubernetes has its value even for small scale workloads like that, but it’s still a few steps more than, say, running a Capistrano script to push your code to a small Linux box with a database on a second one.
You’ll get really far on minimal resources these days, especially with cheaper ARM boxes that offer far more bang for your buck. Paying 1k+ a month to AWS/GCP/Azure is total insanity when you’re not even averaging a single active user a day.
It absolutely can be, sure. But solutions like Vercel, Cloudflare Workers, Supabase, etc. can be excellent and inexpensive for those use cases.
That’s just not a realistic or necessary approach for everyone.
AWS is engineered for excruciatingly detailed billing, right down to the moment you consume or release capacity. Managing that spend is exhausting.
My business runs on under $200/mo in Linode compute resources and the performance is significantly better than on similarly situated EC2 instances. We were spending that on databases alone with AWS and getting a fraction of the performance.
I make extensive use of “pure” Linode Kubernetes Engine k8s. It’s portable to any other Kubernetes cluster, and it lets me take my stack _anywhere_, even to a rack in the nearest data center willing to rent me space, if I really wanted.
If you're outsourcing operations to AWS or whomever, a couple largish instances and a couple supporting services can get you pretty much that same thing, for a bit more money and a bit less control over performance-consistency.
All that HA/scaling/clustering/cloud stuff is expensive, not just in monetary terms, but in performance terms. If you don't actually need it, a high percentage of your compute & (especially) your network traffic may be going to that, rather than actually serving the product. It also adds a hell of a lot of complexity, which comes at a significant time-cost for development, unless you want your defect rate to shoot up.
> But if more developers just learned how to make a website on linux, with a db, a webserver, and an application.
And hell, nothing's stopping you from writing 12-factor apps and deploying containers, and scripting your server set-up and config, even if you don't go straight for heavy, "scalable" architecture. Even if your server's a beige Linux box in a closet. Enough benefits that the effort's probably a wash at worst (hey, documentation you can execute is the best documentation!) even if you never need to switch architectures, and then you'll have a relatively easy time of it, if you do end up needing to.
they also had some rabbitmq-on-k8s system going that fell over during small tests because they couldn’t get k8s to actually scale it. (which then convinced them they needed k8s, and bigger nodes)
sigh
Back in the day, it would have required a whole procedure to buy that hardware, have it set up, etc. Now you can needlessly spend $10k per month with just a few clicks!
To be honest I wasn't hired to challenge their entire setup, only to make it more cost effective.
So I chose the most straightforward way I could think of that would allow us to come up with a cost effective setup that will be scalable, fault tolerant and simple to maintain later on.
It all probably started with such a single instance running Docker compose, but then over time it evolved into this setup.
The ideal setup I mentioned would have been also cost effective, scalable and resilient.
That's baffling to me, but that perspective is out there too.
I think this is one of those things that really depends on the use case. If they are performing expensive inference, I think having any queue is better than no queue. Going from a synchronous system to an asynchronous one is not easy and it's not something you would want anyone to be paged for once it starts to matter. Getting SQS/SNS up and running now could be a couple hours of work today and is practically free if your traffic is low.
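A minimal sketch of the decoupling a queue buys you, using Python's stdlib `queue` as a stand-in for SQS (the `run_inference` function here is hypothetical, standing in for the expensive call):

```python
import queue
import threading

jobs = queue.Queue()   # stands in for SQS/RabbitMQ/etc.
results = {}

def run_inference(payload):
    # Hypothetical stand-in for an expensive model call.
    return payload.upper()

def worker():
    # Drains the queue at its own pace, independent of request handlers.
    while True:
        job_id, payload = jobs.get()
        results[job_id] = run_inference(payload)
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()

# A request handler just enqueues and returns immediately.
jobs.put((1, "hello"))
jobs.put((2, "world"))
jobs.join()  # block until both jobs are processed
print(results[1], results[2])  # prints "HELLO WORLD"
```

Swapping the in-process queue for SQS later changes the transport, not the shape of the code, which is why retrofitting the async boundary early is cheap.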
Similarly, I have a number of side projects that run extremely cheaply just using ECS and Fargate. I don't even think about Kubernetes really; it's just a PaaS to me that I'm shipping ARM binaries to. As a result I don't think very hard about autoscaling, failover, load balancing or deployment. A GitHub Action just pushes master to ECS and everything "just works".
One is a queuing service, the other one is a VM.
So instead of using SQS that has $0 cost when there are no customers, you suggest I install, configure and run RabbitMQ on an EC2, to save $0 when there are no customers?
Or save $1 when I have 100 customers? SQS is dirt cheap.
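Back-of-the-envelope, assuming the commonly quoted standard-queue pricing of a 1M-request monthly free tier and roughly $0.40 per million requests after that (these figures are assumptions; check the current AWS price list):

```python
def sqs_monthly_cost(requests, free_tier=1_000_000, per_million=0.40):
    """Rough SQS standard-queue cost in USD; pricing figures are assumed."""
    billable = max(0, requests - free_tier)
    return billable / 1_000_000 * per_million

# 100 customers sending ~1,000 messages each stays inside the free tier:
print(sqs_monthly_cost(100_000))      # 0.0
# Even 10M requests/month is pocket change:
print(sqs_monthly_cost(10_000_000))   # ~3.6
```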
The point of SQS or any other usage-based AWS _developer_ service compared to DIY is that you can be up and running in minutes at a minuscule cost.
I agree with you about over-engineering and building a distributed "microservices" architecture when you have no customers.
But I'll pick SQS any time of the day when I need queueing functionality to increase my developer velocity so I can focus on building value rather than wasting my life installing, configuring and running anything on EC2.
> when I need queueing functionality to increase my developer velocity so I can focus on building value rather than wasting my life installing, configuring and running anything on EC2.
SQS still requires configuration, which means you either need to use the (terrible) AWS console UI or spin up a whole Terraform/CloudFormation/CDK/etc stack, not to mention that merely connecting to it requires correctly setting up AWS IAM (so you don't use a key that gives access to your entire AWS account). Vim'ing the RabbitMQ config file in contrast doesn't seem so bad, and even just using a static hardcoded password means the worst an attacker can do is take down your queue instead of taking over your entire cloud infra.
I do think ddb and lambda hit a sweet spot for costs on ramping up. The rest, though, really struggle.
Elsewhere in the comments, there's a suggestion that this kind of thing isn't appropriate for "hobby projects" and early stage, but I disagree. Those are the times when you really want something you can step away from without doing a disservice to your customers (i.e. letting packages go out of date and get vulnerable), and that costs you as little as possible in a steady state, so you can focus on acquiring customers instead of fuddling around with the guts.
The ideal trade-off is a single Kubernetes cluster with as much in the cluster as makes sense for the team and stage of the project. As you say, toss the app on a single node to start, but the control plane is tremendously valuable from the onset of most projects.
A startup that outgrows a single EC2 server will be making enough money to hire more people and scale the system properly beyond the initial design. Until then, trade everything away for development velocity.
Kubernetes is not the right tool for this startup. Kubernetes is what large, old-school non-tech companies use to orchestrate resources, because it’s easier to find someone that “knows k8s” (no one knows k8s unless they’re consulting) than it is to find someone that can build properly distributed systems (in the eyes of whoever is in charge of hiring).
Disney: We'd like to launch a new streaming service.
Consultant: Great! You have no customers right now so you can run it on a singleton EC2 instance until you outgrow that scale!
Disney: ...We expect 20 million people to sign up in the first week
I'm pretty sure "follow the forecast" is exactly what motivated that post.
In other words, the infrastructure is overkill for the initial forecast of customers.
They're not working for Disney.
Your comment is really pretty ignorant of how these tools interact. Using serverless primitives is the opposite of leaving nodes running for no reason.
It's not really surprising that AWS's K8S setup isn't great, while their own implementation ties in more closely with the other services they offer. It's lock-in. AWS provides just enough K8S to tick the box on a spec sheet, but has little incentive to go beyond that.
You can do everything from the CLI with kubectl of course, but there are also a bunch of apps that will work with any K8S cluster:
https://medium.com/dictcp/kubernetes-gui-clients-in-2020-kub...
It's very nice to have a consistent interface across multiple cloud providers.
> The team didn't have much DevOps expertise in-house, so a Kubernetes setup, even using a managed service like EKS, would have been way too complex for them at this stage, not to mention the additional costs of running the control plane which they wanted to avoid.
The control plane cost makes sense, but I can't imagine learning Terraform to set up ECS is that much easier than learning Yaml to configure k8s. Unless EKS is much harder to use than GKE.
Eventually EKS was built to satisfy customers who insisted these issues were just FUD from AWS to lock customers into AWS infrastructure. However, what I have seen since is a basic progression: a customer uses k8s on-prem and is fanatical about its use. They try to use it in AWS, and it's about as successful as on-prem. Their peers squint at it and say "but wouldn't this be easier with ECS/Fargate?" The k8s folks lose their influence and a migration to ECS happens. I've seen this happen inside AWS working with customers, and in three megacorps I've worked on cloud strategies for. I've yet to encounter a counterexample, and this was sort of what Andy predicted at the time. I'm not saying there aren't counterexamples, or that this isn't a conspiracy against k8s to get your dollars locked into AWS.
On standards Andy always said that at some point cloud stuff would converge into a standards process but at the moment too little is known about patterns that work for standards to be practical. Any company launching into standards this early would get bogged down and open the door to their competitors innovating around them and setting the future standard once the time is right for it. Obviously not an unbiased viewpoint, but a view that’s fairly canonical at Amazon.
I mean..the customers are not wrong.
I do think that as organizations grow, the ability for components to be defined in smaller units without being enmeshed in a big-ass tf dependency graph is a big draw of the controller model. The flipside is this comes with accepting the operational overhead of k8s plus the attendant controllers/operators you're running and hiring/staffing accordingly. There are ways you can structure your terraform that avoids creating the tight coupling some folks don't like where you have to literally define the entire universe to change a machine image. Not to mention, there do exist tools that allow you to inspect and visualize tf state.
Right now, Terraform maximalism requires reproducible builds, which is not something most orgs can achieve.
Citation needed.
K8s has a whole bunch of footguns that people who don't want to manage infra can easily blunder into.
Terraform and ECS is not immature, and it's fairly simple to maintain, especially if they are just pushing updates without significant infra changes (i.e. bumping the container version).
> Engineering time is expensive
which is why ECS is probably better, because its good enough for running a few containers that talk to a load balancer.
They will continue to make it more appealing to lock your software into their platform than to go with their thinner facilities for OSS, doing the minimum to keep up to date with trends in open source, just enough to lure you in and create “easier” paths until you can’t afford to leave.
We have this problem with Azure - sure it’s easier to get a knucklehead to push buttons and get an app running, but after years you’ll be scrambling to reduce costs. Good luck with that when all of your terraforms use Azure Resource Manager and all of your source code uses Azure Functions. Being stuck with microsoft/amazon and a team of engineers who spent their time learning vendor-specific skills instead of the open source tech that enables it, sounds awful.
Hahahaha
> ECS is also relatively simple and not so far from their Docker-compose setup, but much more flexible and scalable. It also enables us to convert their somewhat stateful pets to identically looking stateless cattle that could be converted to Spot instances later.
Have you ever built something in ECS? I have, and it is missing HUGE SWATHS of the convenient functionality that EKS provides. It lacks the network effect of being a widely-used product, so searching for solutions is a constant struggle. It breaks and nobody knows how to help.
"Not far from their docker-compose setup..." What are you even talking about? ECS is massively more complex than docker-compose and the main similarity I see between them is that they both run docker. It's similar to docker-compose if you ignore the fact that you need permissions, load balancers, networking, etc. Which is the hard part, NOT running some containers on EC2, by the way.
It has its own bizarre and verbose container deployment spec that is less portable, less flexible, less feature-ful, and less widely used than EKS.
> ECS will also offer ECS container logs and metrics out of the box, giving us better visibility into the application and enabling us to right-size each service based on its actual resource consumption, in the end allowing us to reduce the number of instances in the ECS cluster once everything is optimized.
Something you also get with EKS. So half of the reasons you have claimed ECS was the right choice are now in the garbage.
What you DON'T get with ECS is awesome working-out-of-the-box open source software like External Secrets, External DNS, LetsEncrypt, the Amazon Ingress Controller, argo rollouts, services, ingresses, cronjobs... I could go on and on.
They are going to try to hire DevOps engineers, who will all have to ramp up on (and likely complain about) ECS, instead of having people walk in already prepared, ready to start implementing high-quality software on a system they already know.
The AWS ecosystem has much of this baked-in. (Parameter Store, Certificate Manager, etc) Vendor lock-in is of course a concern, but for many, a theoretical one.
If you can choose an option that is going to be way less work even if it's "more complex" that is often the right choice as long as you understand what that complexity is and can pierce through the covers if necessary.
ECS is a deployment tool. Kubernetes is a dev-to-ci-to-prod tool, providing same environment for standard workload specs across the full development cycle, and a single way to inject common features into the standard workloads.
- Setting up certs (managed as TF)
- Setting up ALBs (managed as TF)
- Setting up the actual service definition (often done as JSON, passed into TF)
Possibly other things I'm forgetting.
Among other things, it requires a *developer* to know about certs and ALBs and whatever else.
With EKS, this can all be automated. The devops engineer can set it up so that deploying a service automatically sets up certs, LBs etc. Why are we removing such good abstractions for a proprietary system that is *supposed* to be less management overheads, when in reality, it causes devs to do so much more, and understand so much more?
When I was at Rad AI we went with ECS. I made a terraform module that handled literally everything you're talking about, and developers were able to use that to launch to ECS without even having to think about it. Developers literally launched things in minutes after that, and they didn't have to think about any of those underlying resources.
A major benefit of k8s that is usually massively overlooked is its RBAC system, and specifically how nice a namespace-per-team or per-service model can be.
It's probably not something a lot of people think about until they need to handle compliance and controls for SOC 2 and friends, but as someone who has done many such audits, it's always been great to be able to simply show exactly who can do what on which service in which environment, in a completely declarative way.
You can try to achieve the same thing with AWS IAM, but the sheer complexity of it makes it a hard sell to auditors, who have come to associate "Terraform == god powers", and convincing them that you have locked it down enough to safely hand to app teams is... tiresome.
Why does the developer need to care about the certs and ALBs? The devops engineer you need to set up all those controllers could as well deploy those resources from Terraform.
As I showed in the diagrams from the article, this application has a single ALB and a single cert per environment, and the internal services only talk to each other through the RabbitMQ queue.
DNS, ALB and TLS certs could be easily handled from just a few lines of Terraform, and nobody needs to touch it ever again.
With EKS you would need multiple controllers and multiple annotations controlling them, and then each controller will end up setting up a single resource per environment.
The controllers make sense if you have a ton of distinct applications sharing the same clusters, but this is not the case here, and would be overkill.
Welcome to reality, where this is not the case.
I'm currently working at a company where we're using TF and ECS, and app specific infra is supposedly owned by the service developers.
In reality, what happens is devs write up some janky terraform, potentially using the modules we provide, and then when something goes wrong, they come to us cos they accidentally messed around with the state or whatever. DNS records change. ALB listener rules need to change.
Honestly, if they had said: "So instead we set up some bare-metal EC2 instances" I would be on-board.
It was definitely not about being contrarian, but about offering, first and foremost, a more cost-effective but still relatively simple, scalable and robust alternative to their current setup.
They have a single small team of less than a dozen people, all working on a single application, with a single frontend component.
Imagine instead this team managing a K8s setup with DNS, ALB and SSL controllers that each set up a single resource. I personally find that overkill.
Acme corp loves containers as much as everyone else. Containers provide great value. However, muddling around with docker/containerd/crio without some form of orchestration is just another path to a herd of fragile, neglected pet machines.
Acme corp is very different from the Big Tech world k8s came from. Acme corp doesn't have Linux kernel contributors and language developers and an IT payroll so large that the mundane devops people are lost in the noise. Acme corp must use what prevails and doesn't mystify. The "team" managing something is frequently one person, or less.
Acme corp ends up with a collection of pet VMs, all different. Lots of stuff is containerized. Some stuff isn't. Much of it is high-value: let one of those go down and an angry so-and-so will be on the horn right now, even if they haven't noticed for weeks. Most of it is low load: there will never ever be a world where these get reworked into scalable, stateless, distributed cloud apps.
How to get from a herd of pet VMs that happen to run containers (sometimes) to an orchestrated cluster of containers?
In my imagination the answer is something that looks like a mashup of Proxmox and docker-compose. It has the following features:
-- Orchestration: micro-VMs running containers scheduled across a cluster of nodes. The "micro-VM" term deserves some definition. I don't have a precise definition. I know Firecracker is too anemic and full-featured VMs are too much. The micro-VMs of cloud-hypervisor are just about right. Above all "micro" just means simple, not necessarily small: a micro-VM that needs a lot of RAM and takes longer than 0.0003 us to start is fine.
-- Live migration: low-load, high-value applications need to stay up despite cluster node maintenance and despite never becoming candidates for re-engineering into cloud-native applications. This feature is the #1 reason the VM part is necessary: live migration is a native capability of KVM et al. that has worked well since forever, whereas containers (CRIU notwithstanding) can't be live-migrated.
-- Trivially simple support of network-transparent block storage: iSCSI and other network block storage is rampant at Acme corp because it's cheap, reliable, easy and fast enough. Re-engineering everything for dynamodb or whatever isn't an option. Fortunately, because we're running a micro-VM with its own kernel that has native support for network block storage (the other #1 reason for the VM part), we get this for free.
-- Simple operation: if it imposes a bunch of concepts that one can't already find in docker-compose it's wrong. Acme corp doesn't have the depth to deal with more and can't find that depth even if it wanted to, which it doesn't. Grug Brained Devops: not stupid, just instinctually uninterested in unnecessary abstraction, opaque jargon terminology, overengineering and fads.
Anyhow, that's my sincere attempt to answer your question. Respectfully, if you think you know of a solution you're likely wrong: I've wormed into every corner of that which prevails and it doesn't exist at the moment. That's why I claim there is an opportunity. I'm happy to be proven wrong, but you'd have to go a long way.
(disclaimer: I’m part of the team)
They introduced Terraform and dropped docker compose in favour of some Amazon proprietary container scheduler?
1 - It's simpler than K8s, but not that much simpler than your avg managed K8s offering
2 - It really locks you in the AWS ecosystem
3 - It is way less used than K8s or just running things on servers, so there are way fewer help / learning resources
I really don't see how using ECS is much better than EC2 + compose for small setups and this post didn't provide many good arguments to convince me.
I'd use it on day 1 (over EC2 + compose) just to avoid managing an OS or deployment infrastructure.
the bar for being "locked in" seems to drop further every day.
At work we use ECS Fargate, Aurora MySQL and Bitbucket pipelines to host a little over 100 client web applications. It takes about an hour to configure a new AWS account and staging/production environments for a new client using CloudFormation (and a number of manual steps), and the monthly AWS cost is around $100. There are cheaper ways and probably easier ways, but we feel like we have reached a good balance between stability, ease of use, cost and features. And we are not that worried about being tied to AWS.
Sub $15/mo to run your thing until you get real demand, yeah. But it's not new; the K8S shtick is coming from investors, not tech people. And if it's coming from the tech people, throw them out of the door.
Why are you cooking for 8000 people when 6 are coming over? Why are you building a kitchen to cook for 8000 people? Why are you renting space to fit 8000 people?
You need a table and maybe 6 chairs; who knows, they might eat standing.
Not necessarily. If you need to deal with many containerized apps that are updated and deployed regularly, k8s is a really great tool.
As a rule of thumb, I'd say < 5 - no, > 20 - yes, and everything in between - up to you.
Place I worked at had a service running on K8s with, I think, 4 pods, and it got on average one hit every 2-3 seconds during office hours (and virtually none outside those.)
I think it got the HN hug of death
unfortunately this is a deal with the devil for vendor lock-in