Reclaim the Stack

(reclaim-the-stack.com)

496 points · dustedcodes · 1y ago · 321 comments

181 comments · 38 top-level

jusomg1y ago· 24 in thread

Of course you reduced 90% of the cost. Most of these costs don't come from the software, but from the people and automation maintaining it.

With that cost reduction you also removed monitoring of the platform, people oncall to fix issues that appear, upgrades, continuous improvements, etc. Who/What is going to be doing that on this new platform and how much does that cost?

Now you need to maintain k8s, postgresql, elasticsearch, redis, secret management, OSs, storage... These are complex systems that require people understanding how they internally work, how they scale and their common pitfalls.

Who is going to upgrade kubernetes when they release a new version that has breaking changes? What happens when Elasticsearch decides to splitbrain and your search stops working? When the DB goes down or you need to set up replication? What is monitoring replication lag? Or even simply things like disks being close to full? What is acting on that?

I don't mean to say Heroku is fairly priced (I honestly have no idea) but this comparison is not apples to apples. You could have your team focused on your product before. Now you need people dedicated to work on this stuff.

mlinhares1y ago

Anything you don't know about managing these systems can be learned asking chatgpt :P

Whenever I see people doing something like this I remember I did the same when I was in 10 people startups and it required A LOT of work to keep all these things running (mostly because back then we didn't have all these cloud managed systems) and that time would have been better invested in the product instead of wasting time figuring out how these tools work.

I see value in this kind of work if you're at the scale of something like Dropbox and moving from S3 will greatly improve your bottom line and you have a team that knows exactly what they're doing and will be assigned the maintenance of this work. If this is being done merely from a cost cutting perspective and you don't have the people that understand these systems, it's a recipe for disaster and once shit is on fire the people that would be assigned to "fix" the problem will quickly disappear because the "on call schedule is insane".

re-thc1y ago

> and that time would have been better invested in the product instead of wasting time figuring out how these tools work

It really depends on what you're doing. Back then a lot of non-VC startups worked better and the savings possibly helped. It also helps grow the team and have less reliance on the vendor. It's long term value.

Is it really time wasted? People often go into resume building mode and do all kinds of wacky things regardless. Perhaps this just helps scratch that itch.

ljm1y ago

I bailed out of one company because even though the stack seemed conceptually simple in terms of infra (there wasn't a great deal to it), the engineering more than compensated for it. The end result was the same: non-stop crisis management, non-stop firefighting, no capacity to work on anything new, just fixing old.

All by design, really, because at that point you're not part of an engineering team you're a code monkey operating in service of growth metrics.

Diederich1y ago

> ... I remember I did the same when I was in 10 people startups and it required A LOT of work to keep all these things running...

Honest question: how long ago was that? I stepped away from that ecosystem four or so years ago. Perhaps ease of use has substantially improved?

ugh1231y ago

> you also removed monitoring of the platform

You don't think they have any monitoring within Kubernetes?

I imagine they have more monitoring capabilities now than they did with Heroku.

almost1y ago

The fact that HN seems to think this is "FUD" is absolutely wild. You just talked about (some of) the tradeoffs involved in running all this stuff yourself. Obviously for some people it'll be worth it and for others not, but absolutely amazing that there are people who don't even seem to accept that those tradeoffs exist!

dzikimarian1y ago

I assume you're referencing my comment.

The reason I think parent comment is FUD isn't because I don't acknowledge tradeoffs (they are very real).

It's because parent comment implies that people behind "reclaim the stack" didn't account for the monitoring, people's cost etc.

Obviously any reasonable person making that decision includes it in the calculation. Obviously nobody sane throws entire monitoring out of the window for savings.

Accounting for all of these it can still be viable and significantly cheaper to run your own infra. Especially if you operate outside of the US and you're able to eat an initial investment.

kmacdough1y ago

Exactly. It all depends on your needs and — to be honest — the quality of your sysops engineering. You may not only need dedicated sysops, but you may incur higher incidental costs with lost productivity when your solution inevitably goes down (or just from extra dev load when things are harder to use).

That said, at least in 2016 Heroku was way overpriced for high volume sites. My startup of 10 engineers w/ 1M monthly active users saved 300k+/yr switching off Heroku. But we had Jerry. Jerry was a beast and did most of the migration work in a month, with some dead-simple AWS scaling. His solution lacked many of the features of Heroku, but it massively reduced costs for developers running full test stacks which, in turn, increased internal productivity. And did I mention it was dead simple? It's hard to overstate how valuable this was for the rest of us, who could easily grok the inner workings and know the consequences of our decisions.

Perhaps this stack will open that opportunity to less equipped startups, but I've found few open source "drop-in replacements" to be truly drop-in. And I've never found k3 to be dead simple.

dzikimarian1y ago

Sorry, but that's just a ton of FUD. We run both private cloud and (for a few customers) AWS. Of course you have more maintenance on-prem, but a typical k8s update is maybe a few hours of work, when you know what you are doing.

Also, AWS is complex too, also requires configuration and also generates alerts in the middle of the night.

It's still a lot cheaper than managed service.

jusomg1y ago

> Of course you have more maintenance on on-prem, but typical k8s update is maybe a few hours of work, when you know what you are doing.

You just mentioned one dimension of what I described, and "when you know what you are doing" is doing a lot of the heavy lifting in your argument.

> Also, AWS is complex too, also requires configuration and also generates alerts in the middle of the night.

I'm confused. So we are in agreement there?

I feel you might be confusing my point with an on-prem vs AWS discussion, and that's not it.

This is encouraging teams to run databases / search / cache / secrets and everything on top of k8s and assuming a magic k8s operator is doing the same job as a team of humans and automation managing all those services for you.

filleokus1y ago

As with everything it's not black or white, but rather a spectrum. Sure, updating k8s is not that bad, but operating a distributed storage solution is no joke. Or really anything that requires persistence and clustering (like elastic).

You can also trade operational complexity for cash via support contracts and/or enterprise solutions (like just throwing money at Hitachi for storage rather than trying to keep Ceph alive).

tinco1y ago

Sure, but it requires that your engineers are vertically capable. In my experience, about 1 in 5 developers has the required experience and does not flat out refuse to have vertical responsibility over their software stack.

And that number might be high, in larger more established companies there might be more engineers who want to stick to their comfort bubble. So many developers reject the idea of writing SQL themselves instead of having the ORM do it, let alone know how to configure replication and failover.

I'd maybe hire for the people who could and would, but the people advocating for just having the cloud take care of these things have a point. You might miss out on an excellent application engineer, if you reject them for not having any Linux skills.

dbackeus1y ago

Original creator and maintainer of Reclaim the Stack here.

> you also removed monitoring of the platform

No we did not:

Monitoring: https://reclaim-the-stack.com/docs/platform-components/monit...

Log aggregation: https://reclaim-the-stack.com/docs/platform-components/log-a...

Observability is on the whole better than what we had at Heroku since we now have direct access to realtime resource consumption of all infrastructure parts. We also have infinite log retention which would have been prohibitively expensive using Heroku logging addons (though we cap retention at 12 months for GDPR reasons).

> Who/What is going to be doing that on this new platform and how much does that cost?

Me and my colleague who created the tool together manage infrastructure / OS upgrades and look into issues etc. So far we've been in production 1.5 years on this platform. On average we spent perhaps 3 days per month doing platform related work (mostly software upgrades). The rest we spend on full stack application development.

The hypothesis for migrating to Kubernetes was that the available database operators would be robust enough to automate all common high availability / backup / disaster recovery issues. This has proven to be true, apart from the Redis operator which has been our only pain point from a software point of view so far. We are currently rolling out a replacement approach using our own Kubernetes templates instead of relying on an operator at all for Redis.

> Now you need to maintain k8s, postgresql, elasticsearch, redis, secret management, OSs, storage... These are complex systems that require people understanding how they internally work

Thanks to Talos Linux (https://www.talos.dev/), maintaining K8s has been a non issue.

Running databases via operators has been a non issue, apart from Redis.

Secret management via sealed secrets + CLI tooling has been a non issue (https://reclaim-the-stack.com/docs/platform-components/secre...)

OS management with Talos Linux has been a learning curve but not too bad. We built talos-manager to make bootstrapping new nodes into our cluster straightforward (https://reclaim-the-stack.com/docs/talos-manager/introductio...). The only remaining OS related maintenance is OS upgrades, which require rebooting servers, but that's about it.

For storage we chose to go with simple local storage instead of complicated network based storage (https://reclaim-the-stack.com/docs/platform-components/persi...). Our servers come with datacenter grade NVMe drives. All our databases are replicated across multiple servers so we can gracefully deal with failures, should they occur.

> Who is going to upgrade kubernetes when they release a new version that has breaking changes?

Upgrading Kubernetes in general can be done with 0 downtime and is handled by a single talosctl CLI command. Breaking changes in K8s imply changes to existing resource manifest schemas and are detected by tooling before upgrades occur. Given how stable Kubernetes resource schemas are and how averse the community is to pushing breaking changes I don't expect this to cause major issues going forward. But of course software upgrades will always require due diligence and can sometimes be time consuming; K8s is no exception.

> What happens when ElasticSearch decides to splitbrain and your search stops working?

ElasticSearch, since major version 7, should not enter split brain if correctly deployed across 3 or more nodes. That said, in case of a complete disaster we could either rebuild our index from source of truth (Postgres) or do disaster recovery from off site backups.

It's not like using ElasticCloud protects against these things in any meaningfully different way. However, the feedback loop of contacting support would be slower.
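
A minimal sketch of the majority-vote arithmetic behind the "3 or more nodes" advice above. This is generic quorum math, not Elasticsearch's actual election code, and the function names are hypothetical, chosen only for illustration.

```python
def quorum(master_eligible_nodes: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    return master_eligible_nodes // 2 + 1


def tolerated_failures(master_eligible_nodes: int) -> int:
    """How many nodes can be lost while a majority can still be formed."""
    return master_eligible_nodes - quorum(master_eligible_nodes)


def can_elect_master(partition_size: int, master_eligible_nodes: int) -> bool:
    """A network partition can elect a master only if it holds a majority."""
    return partition_size >= quorum(master_eligible_nodes)


# Print cluster size, quorum size, and how many node losses are survivable.
for n in (2, 3, 5):
    print(n, quorum(n), tolerated_failures(n))
```

With 3 nodes split 2/1, only the side with 2 nodes can elect a master, so two independent masters (split brain) cannot form. With 2 nodes, losing either one already removes the majority, which is why 3 is the usual minimum.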

> When the DB goes down or you need to set up replication?

Operators handle failovers. If we would lose all replicas in a major disaster event we would have to recover from off site backups. Same rules would apply for managed databases.

> What is monitoring replication lag?

For Postgres, which is our only critical data source, replication lag monitoring + alerting is built into the operator.

It should be straightforward to add this for Redis and ElasticSearch as well.

> Or even simply things like disks being close to full?

Disk space monitoring and alerting is built into our monitoring stack.
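
For readers who want to see what a "disk close to full" check boils down to, here is a minimal sketch using only Python's standard library. It is not the monitoring stack described above; the 90% threshold and the function names are assumptions made for illustration.

```python
import shutil


def is_nearly_full(total_bytes: int, used_bytes: int, threshold: float = 0.9) -> bool:
    """True when the used fraction of the disk reaches the threshold."""
    return used_bytes / total_bytes >= threshold


def check_path(path: str = "/", threshold: float = 0.9) -> bool:
    """Check the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)  # named tuple with total, used, free (bytes)
    return is_nearly_full(usage.total, usage.used, threshold)


if __name__ == "__main__":
    # A real setup would send an alert here instead of printing.
    print("disk nearly full:", check_path("/"))
```

In practice a check like this would run on a schedule and feed an alerting channel; the point is only that the condition itself is a single ratio compared against a threshold.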

At the end of the day I can only describe to you the facts of our experience. We have reduced costs to cover hiring about 4 full time DevOps people so far. But we have hired 0 new engineers and are managing fine with just a few days of additional platform maintenance per month.

That said, we're not trying to make the point that EVERYONE should Reclaim the Stack. We documented our thoughts about it here: https://reclaim-the-stack.com/docs/kubernetes-platform/intro...

troupo1y ago

Since you're the original creator, can you open the site of your product, and find the link to your project that you open sourced?

- Front page links to docs and Discord.

- First page of docs only has a link to discord.

- Installation references a "get started" repo that is... somehow also the main repo, not just "get started"?

swat5351y ago

Assuming an average salary of 140k/year, you are dedicating 2 resources 3 times a month and this is already costing you ~38k/year on salaries alone, and that's assuming your engineers have somehow mastered _both_ devops and software (very unlikely) and that they won't screw anything up. I'm not even counting the time it took you to migrate away.

This also assumes your infra doesn't grow and require more maintenance, or that you don't have to deal with other issues.

Focusing on building features and generating revenue is much more valuable than wasting precious engineering time maintaining stacks.

This is hardly a "win" in my book.
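
A rough sketch of the arithmetic behind the ~38k/year figure in this comment. It assumes each of the 2 engineers spends 3 days per month on the platform and that a year has 260 working days; both are assumptions used to reproduce the estimate, not numbers confirmed by either commenter.

```python
WORKING_DAYS_PER_YEAR = 260  # assumption: 52 weeks x 5 days, holidays ignored


def yearly_maintenance_cost(
    salary: float,
    engineers: int,
    days_per_month: float,
    working_days: int = WORKING_DAYS_PER_YEAR,
) -> float:
    """Salary cost of the days spent on platform maintenance in one year."""
    daily_rate = salary / working_days
    return daily_rate * engineers * days_per_month * 12


cost = yearly_maintenance_cost(salary=140_000, engineers=2, days_per_month=3)
print(round(cost))  # roughly 38.8k under the assumptions above
```

If the 3 days per month are shared between the two engineers rather than spent by each, the same formula with `engineers=1` gives about half that.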

ozgune1y ago

Hey there, this is a comprehensive and informative reply!

I had two questions just to learn more.

* What has been your experience with using local NVMes with K8s? It feels like K8s has some assumptions around volume persistence, so I'm curious if these impacted you at all in production.

* How does 'Reclaim the Stack' compare to Kamal? Was migrating off of Heroku your primary motivation for building 'Reclaim the Stack'?

Again, asking just to understand. For context, I'm one of the founders at Ubicloud. We're looking to build a managed K8s service next and evaluating trade-offs related to storage, networking, and IAM. We're also looking at Kamal as a way to deploy web apps. This post is super interesting, so wanted to learn more.

cryptonector1y ago

Who says they reduced costs by cutting staff? They could instead have scaled their staff better.

johnnyanmac1y ago

>Who/What is going to be doing that on this new platform and how much does that cost?

If you're already a web platform with hired talent (and someone using Heroku for a SaaS probably already is), I'd be surprised if the marginal cost was 10x. That paid support is of course coming at a premium, and isn't too flexible on what level of support you need.

And yeah, it isn't apples to apples. Maybe you are in a low CoL area and can find a decent DevOps for 80-100k. Maybe you're in SF and any extra dev will be 250k. It'll vary immensely on cost.

Nextgrid1y ago

This is FUD unless you're running a stock exchange or payment processor where every minute of downtime will cost you hundreds of thousands. For most businesses this is fear-mongering to keep the DevOps & cloud industry going and ensure continued careers in this field.

The_Colonel1y ago

It's not just about downtime, but also about not getting your systems hacked, not losing your data if sh1t hits the fan, regulation compliance, flexibility (e.g. ability to quickly spin-out new test envs) etc.

My preferred solution to this problem is different, though. For most businesses, apps, a monolith (maybe with a few extra services) + 1 relational DB is all you need. In such a simple setup, many of the problems faced either disappear or get much smaller.

HolyLampshade1y ago

Speaking of the exchanges (at least the sanely operated ones), there’s a reason the stack is simplified compared to most of what is being described here.

When some component fails you absolutely do not want to spend time trying to figure out the underlying cause. Almost all the cases you hear in media of exchange outages are due to unnecessary complexity added to what is already a remarkably complex distributed (in most well designed cases) state machine.

You generally want things to be as simple and streamlined as possible so when something does pop (and it will) your mean time to resolution is inside of a minute.

almost1y ago

I run a business that is a long long way from a stock exchange or a payment processor. And while a few minutes of downtime is fine 30 minutes or a few hours at the wrong time will really make my customers quite sad. I've been woken in the small hours with technical problems maybe a couple of times over the last 8 years of running it and am quite willing to pay more for my hosting to avoid that happening again.

Not for Heroku, they're absolute garbage these days, but definitely for a better run PaaS.

Plenty of situations where running it yourself makes sense of course. If you have the people and the skills available (and the cost tradeoffs make sense) or if downtime really doesn't matter much at all to you then go ahead and consider things like this (or possibly simpler self hosting options, it depends). But no, "you gotta run Kubernetes yourself unless you're a stock exchange" is not a sensible position.

gspencley1y ago

It's not FUD, it's pointing out a very real fact that most problems are not engineering problems that you can fix by choosing the one "magical" engineering solution that will work for all (or even most) situations.

You need to understand your business and your requirements. We engineers love to think that we can solve everything with the right tools or right engineering solutions. That's not true. There is no "perfect framework." No one-size-fits-all solution that will magically solve everything. What "stack" you choose, what programming language, which frameworks, which hosting providers ... these are all as much business decisions as they are engineering decisions.

Good engineering isn't just about finding the simplest or cheapest solution. It is about understanding the business requirements and finding the right solution for the business.

matus_congrady1y ago

Since DHH has been promoting the 'do-it-yourself' approach, many people have fallen for it.

You're asking the right questions that only a few people know they need answers to.

In my opinion, the closest thing to "reclaiming the stack" while still being a PaaS is to use a "deploy to your cloud account" PaaS provider. These services offer the convenience of a PaaS provider, yet allow you to "eject" to using the cloud provider on your own should your use case evolve.

Example services include https://stacktape.com, https://flightcontrol.dev, and https://www.withcoherence.com.

I'm also working on a PaaS comparison site at https://paascout.io.

Disclosure: I am a founder of Stacktape.

rglover1y ago· 23 in thread

I made the mistake of falling for the k8s hype a few years back for running all of my indie hacker businesses.

Big mistake. Overnight, the cluster config files I used were no longer supported by the k8s version DigitalOcean auto upgraded my cluster to and _boom_. Every single business was offline.

Made the switch to some simple bash scripts for bootstrapping/monitoring/scaling and systemd for starting/restarting apps (nodejs). I'll never look back.

cedws1y ago

Weird how defensive people get about K8S when you say stuff like this. It’s like they’re desperately trying to convince you that you really do need all that complexity.

rollcat1y ago

I believe there's still a lot of potential for building niche / "human-scale" services/businesses that don't inherently require the scalability of the cloud or complexity of k8s. Scaling vertically is always easier, and modern server hardware has an insane perf ceiling. The overall reduction in complexity is a breath of fresh air.

My occasional moral dilemma is idle power usage of overprovisioned resources, but we've found some interesting things to throw at idle hardware to ease our conscience about it.

alex_lav1y ago

I think it's two types of defensiveness.

1. Shovel salesman insisting all "real" gold miners use their shovels

2. Those that have already acquired shovels not wanting their purchase to be mocked/have been made in vain.

Neither are grounded in reality. Why people believe their tiny applications require the same tech that Google invented to help manage their (massive) scale is beyond me.

0perator1y ago

Most do not, but they still want all the toys that developers are building for “the cloud”.

poincaredisk1y ago

I've used k8s for the last uhh 5 years and this never happened to me. In my case, because I self-host my cluster, I do no unexpected upgrades. But I agree that maintaining a k8s cluster takes some work.

theptip1y ago

In the 2015-2019 period there were quite a few API improvements involving deprecating old APIs, it’s much more stable/boring now. (Eg TPR -> CRD was the big one for many cluster plugins)

eddd-ddde1y ago

So either DigitalOcean auto-updates to breaking versions, or k8s doesn't do versioning correctly. Both very bad.

Which was it?

rglover1y ago

Technically both, but more so the former.

I had a heck of a time finding accurate docs on the correct apiVersion to use for things like my ingress and service files (they had a nasty habit of doing beta versions and changing config patterns w/ little backwards compatibility). This was a few years back when your options were a lot of Googling, SO, etc, so the info I found was mixed/spotty.

As a solo founder, I found what worked at the time and assumed (foolishly, in retrospect) that it would just continue to work as my needs were modest.

poincaredisk1y ago

I assume the first one, but it's more complicated. K8s used to have a lot of features (including very important ones) in the "beta" namespace. There are no stability guarantees there, but everyone used them anyway. Over time they graduated to the "stable" namespace, and after some transitory period they were removed from the beta namespace. This broke old deployments, when admins ignored warnings for two or three major releases.
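
To illustrate the beta-to-stable graduation described here, a small sketch that flags manifests still using API versions the target Kubernetes release no longer serves. The table lists only a handful of well-known removals (Deployment beta versions in 1.16, Ingress beta versions in 1.22) and is far from complete; the function names and the dict-based manifest format are assumptions for illustration, not how any particular tool works.

```python
# (apiVersion, kind) -> Kubernetes release in which it stopped being served.
# Deliberately incomplete: only a few well-known removals are listed.
REMOVED_APIS = {
    ("extensions/v1beta1", "Deployment"): "1.16",
    ("apps/v1beta1", "Deployment"): "1.16",
    ("apps/v1beta2", "Deployment"): "1.16",
    ("extensions/v1beta1", "Ingress"): "1.22",
    ("networking.k8s.io/v1beta1", "Ingress"): "1.22",
}


def parse_version(version: str) -> tuple:
    """Turn '1.22' into (1, 22) so versions compare numerically."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def find_removed(manifests: list, target_version: str) -> list:
    """Return (name, apiVersion, kind, removed_in) for manifests that would break."""
    target = parse_version(target_version)
    problems = []
    for manifest in manifests:
        key = (manifest.get("apiVersion"), manifest.get("kind"))
        removed_in = REMOVED_APIS.get(key)
        if removed_in and parse_version(removed_in) <= target:
            name = manifest.get("metadata", {}).get("name", "<unnamed>")
            problems.append((name, key[0], key[1], removed_in))
    return problems


# Manifests as already-parsed dicts (hypothetical examples).
manifests = [
    {"apiVersion": "extensions/v1beta1", "kind": "Ingress", "metadata": {"name": "web"}},
    {"apiVersion": "apps/v1", "kind": "Deployment", "metadata": {"name": "api"}},
]
for problem in find_removed(manifests, "1.22"):
    print(problem)
```

Running a check like this before an upgrade (or pinning the cluster version until it passes) is the kind of safeguard that would catch the scenario described at the top of this thread.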

nine_k1y ago

How does it compare to a simpler but not hand-crafted solution, such as dokku?

rglover1y ago

No Docker for starters. I played with Dokku a long time ago and remember it being decent at that time, but still too confusing for my skillset.

Now, I just build my app to an encrypted tarball, upload it to a secure bucket, and then create a short-lived signed URL for instances to curl the code from. From there, I just install deps on the machine and start up the app with systemd.

IMO, Docker is overkill for 99% of projects, perhaps all. One of those great ideas, poorly executed (and considering the complexity, I understand why).

_xiaz1y ago

> simple bash scripts for bootstrapping/monitoring/scaling

Damn, that's the dream right there

minkles1y ago

The first live k8s cluster upgrade anyone has to do is usually when they think "what the fuck did I get myself into?"

It's only good for very large scale stuff. And then a lot of the time that is usually well over provisioned and could be done considerably cheaper using almost any other methodology.

The only good part of Kubernetes I have found in the last 4 years of running it in production is that you can deploy any old limping crap to it and it does its best to keep it alive which means you can spend more time writing YAML and upgrading it every 2 minutes.

mythz1y ago

We're also ignoring Kubernetes and are just using GitHub Actions, Docker Compose and SSH for our CI Deployments [1]. After a one-time setup on the Deployment Server, we can deploy new Apps with just a few GitHub Action Secrets, which then gets redeployed on every commit, including running any DB Migrations. We're currently using this to deploy and run over 50 .NET Apps across 3 Hetzner VMs.

[1] https://servicestack.net/posts/kubernetes_not_required

oldprogrammer21y ago

The amount of complexity people are introducing into their infrastructure is insane. At the end of the day, we're still just building the same CRUD web apps we were building 20 years ago. We have 50x the computation power, much faster disk, much more RAM, and much faster internet.

A pair of load-balanced web servers and a managed database, with Cloudflare out front, will get you really, really far.

akvadrako1y ago

EKS has a tab in the dashboard that warns about all the deprecated configs in your cluster, making it pretty foolproof to avoid this by checking every couple years.

hhh1y ago

Yes, and there are many open source tools that you can point at clusters to do the same. We use Kubent (Kube No Troubles) to do the same.

w0m1y ago

yeouch. sorry man. I've been running in AKS for 3-4 years now and never had an auto-upgrade come in I wasn't expecting. I have been on top of alerts and security bulletins though, may have kept me ahead of the curve.

willvarfar1y ago

I was once on a nice family holiday and broke my resolve and did a 'quick' check of my email and found a nastygram billing reminder from a provider. On the one hand I was super-lucky I checked my mail when I did, and on the other I didn't get the holiday I needed and was lucky to not spill over and impact my family's happiness around me.

tucnak1y ago

So what is the alternative? Nomad?

llama0521y ago

So you had auto update enabled on your cluster and didn’t keep your apiversions up to date?

Sounds like user error.

rvense1y ago

One of my main criteria for evaluating a platform would be how easy it is to make user errors.

psini1y ago

To be honest the API versions have been a lot more stable recently, but back in ~2019 when I first used Kube in production, basic APIs were getting deprecated left and right, 4 times a year. In the end, yes, the problems are "on you", but it's so easy to miss and the results are so disastrous for a platform whose selling points are toughness, resilience and self-healing.

ksajadi1y ago· 16 in thread

I’ve been building and deploying thousands of stacks on first Docker, then Mesos, then Swarm and now k8s. If I have learned one thing from it, it’s this: it’s all about the second day.

There are so many tools that make it easy to build and deploy apps to your servers (with or without containers) and all of them showcase how easy it is to go from a cloud account to a fully deployed app.

While their claims are true, what they don’t talk about is how to maintain the stack, after “reclaiming” it. Version changes, breaking changes, dependency changes and missing dependencies, disaster recovery plans, backups and restores, major shifts in requirements all add up to a large portion of your time.

If you have that kind of team, budget or problem that deserves those, then more power to you.

AnAnonyCowherd1y ago

> If you have that kind of team, budget or problem that deserves those, then more power to you.

This is the operative issue, and it drives me crazy. Companies that can afford to deploy thousands of services in the cloud definitely have the resources to develop in-house talent for hosting all of that on-prem, and saving millions per year. However, middle management in the Fortune 500 has been indoctrinated by the religion that you take your advice from consultants and push everything to third parties so that 1) you build your "kingdom" with terribly wasteful budget, and 2) you can never be blamed if something goes wrong.

As a perfect example, in my Fortune 250, we have created a whole new department to figure out what we can do with AI. Rather than spend any effort to develop in-house expertise with a new technology that MANY of us recognize could revolutionize our engineering workflow... we're buying Palantir's GenAI product, and using it to... optimize plant safety. Whatever you know about AI, it's fundamentally based on statistics, and I simply can't imagine a worse application than trying to find patterns in data that BY DEFINITION is all outliers. I literally can't even.

You smack your forehead, and wonder why the people at the top, making millions in TC, can't understand such basic things, but after years of seeing these kinds of short-sighted, wasteful, foolish decisions, you begin to understand that improving the company's abilities, and making it competitive for the future is not the point. What is the point "is an exercise left to the reader."

tempodox1y ago

> we have created a whole new department to figure out what we can do with AI.

Wow, this is literally the solution in search of a problem.

wg01y ago

This is absolutely true. I can count easily some 20+ components already.

So this is not a walk in the park for two developers willing to learn k8s.

The underlying apps (Redis, ES) will have version upgrades.

Their respective operators themselves would have version upgrades.

Essential networking fabric (calico, funnel and such) would have upgrades.

The underlying kubernetes itself would have version upgrades.

The Talos Linux itself might need upgrades.

Of all the above, any single upgrade might lead to the infamous controller crash loop, where a pod starts and dies with little to no indication as to why. And not just any ordinary pod, but a crucial pod that is part of some operator supposed to do the housekeeping for you.

k8s was invented at Google and is more suitable in a ZIRP world where money is cheap and to change the logo, you have seven designers on payroll discussing for eight months how nine different tones of brand coloring might convey ten different subliminal messages.

imiric1y ago

> The underlying apps (Redis, ES) will have version upgrades.

You would have to deal with those with or without k8s. I would argue that without it is much more painful.

> Their respective operators themselves would have version upgrades.
>
> Essential networking fabric (calico, funnel and such) would have upgrades.
>
> The underlying kubernetes itself would have version upgrades.
>
> The Talos Linux itself might need upgrades.

How is this different from regular system upgrades you would have to do without k8s?

K8s does add layers on top that you also have to manage, but it solves a bunch of problems in return that you would have to solve by yourself one way or another.

That essential networking fabric gives you a service mesh for free, that allows you to easily deploy, scale, load balance and manage traffic across your entire infrastructure. Building that yourself would take many person-hours and large teams to maintain, whereas k8s allows you to run this with a fraction of the effort and much smaller teams in comparison.

Oh, you don't need any of that? Great. But I would wager you'll find that the hodge podge solution you build and have to maintain years from now will take much more of your time and effort than if you had chosen an industry standard. By that point just switching would be a monumental effort.

> Of all the above, any single upgrade might lead to the infamous controller crash loop, where a pod starts and dies with little to no indication as to why.

Failures and bugs are inevitable. Have you ever had to deal with a Linux kernel bug?

The modern stack is complex enough as it is, and while I'm not vouching for increasing it, if those additional components solve major problems for me, and they become an industry standard, then it would be foolish to go against the grain and reinvent each component once I have a need for it.

sgarland1y ago

Talos is an immutable OS; upgrades are painless and roll themselves back upon failure. Same thing for K8s under Talos (the only thing Talos does is run K8s).

benjaminwootton1y ago

The flip side of this is the cost. Managed cloud services make it faster to get live, but then you are left paying managed service providers for years.

I’ve always been a big cloud/managed service guy, but the costs are getting astronomical and I agree the buy vs build of the stack needs a re-evaluation.

Maxion1y ago

This is the balance, right? For the vast majority of web apps et al. the cloud costs are going to be cheaper than having full-time Ops people managing an OSS stack on VPS / Bare Metal.

szundi1y ago

And what is your take on all those things that you tried? Some experience/examples would benefit us probably.

bsenftner1y ago

The thing that strikes me is: okay, two "willing developers" - but they need to be actually capable, not just "willing" but "experienced and able", and that lands you at a minimum of $100k per year per engineer. That means this system has a maintenance cost of over $16K per month, if you have to dedicate two engineers full time to the maintenance, and of course to following the dynamic nature of K8s and all its tooling just to stay in front of all of that.

0perator1y ago

Also, for only two k8s devops engineers in a 24h-available world, you’re gonna be running them ragged with 12h solo shifts or taking the risk of not staffing overnight. Considering most update and backup jobs kick off at midnight, that’s a huge risk.

If I were putting together a minimum viable staffing plan for a 24x7 available cluster with SLAs on RPO and RTO, I’d be recommending many more than two engineers. I’d probably be recommending closer to five: one senior engineer and one junior for the 8-4 shift, an engineer for the 4-12 shift, another engineer for the 12-8 shift, and another junior who straddles the evening and night shifts. For major outages, this still requires on-call time from all of the engineers, and additional staffing may be necessary to offset overtime hours. Given your metric of roughly $8k an engineer, we’d be looking at a cool $40K/month in labour just to approach four or five 9s of availability.
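
For anyone who wants to plug in their own numbers, the staffing arithmetic in this subthread can be sketched roughly as below. The $100k salary is the assumption made upthread; the function name is made up for illustration, and benefits, taxes and overtime are ignored, so real figures would be higher.

```python
def monthly_labour_cost(engineers: int, annual_salary: float) -> float:
    """Monthly salary cost for a team (salary only, no benefits or overtime)."""
    return engineers * annual_salary / 12


# Assumption from the thread: $100k per engineer per year.
two_engineers = monthly_labour_cost(2, 100_000)
five_engineers = monthly_labour_cost(5, 100_000)

print(f"2 engineers: ${two_engineers:,.0f}/month")
print(f"5 engineers: ${five_engineers:,.0f}/month")
```

At exactly $100k each this comes out slightly above the rounded $8k-per-engineer figure used above, which is why five engineers land a bit over $40K/month.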

oldprogrammer21y ago

Even worse, this feels like the goal was actually about reclaiming their resumes, not the stack. I expect these two guys to jump ship within a year, leaving the rest of the team trying to take care of an entire ecosystem they didn't build.

Maxion1y ago

And you may still end up with longer downtime if SHTF than if you use a managed provider.

tomwojcik1y ago

Agreed. Forgive a minor digression, but what OP wrote is my problem now. I'm looking for something like heroku's or fly's release command. I have an idea how to implement it in docker using swarm, but I can't figure out how to do that on k8s. I googled it some time ago, but all the answers were hacks.

Would someone be able to recommend an approach that's not a hack, for implementing a custom release command on k8s? Downtime is fine, but this one off job needs to run before the user facing pods are available.

psini1y ago

Look at helm charts, they have become the de facto standard for packaging/distributing/deploying/updating whole apps on Kubernetes
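
One way Helm can cover the "release command" case is a chart hook: a Job annotated with `helm.sh/hook` runs before the rest of the release is installed or upgraded, and Helm waits for it to finish. A minimal sketch that builds such a Job manifest follows; the job name, image and migration command are placeholders, not anything from the project discussed here. The output is JSON, which is also valid YAML.

```python
import json


def release_job_manifest(name: str, image: str, command: list[str]) -> dict:
    """Build a Kubernetes Job that Helm runs as a pre-install/pre-upgrade hook."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {
            "name": name,
            "annotations": {
                # Run before the chart's other resources are installed or upgraded.
                "helm.sh/hook": "pre-install,pre-upgrade",
                # Remove the previous hook Job before creating a new one, and clean up on success.
                "helm.sh/hook-delete-policy": "before-hook-creation,hook-succeeded",
            },
        },
        "spec": {
            "backoffLimit": 0,  # do not retry a failed release command
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {"name": "release", "image": image, "command": command}
                    ],
                }
            },
        },
    }


# Placeholder values for illustration only.
manifest = release_job_manifest(
    "release-command",
    "registry.example.com/app:1.2.3",
    ["bin/rails", "db:migrate"],
)
print(json.dumps(manifest, indent=2))
```

If the hook Job fails, the install or upgrade fails as well, so the user-facing pods are not rolled to the new version.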

imiric1y ago

Agreed, but to be fair, those are general problems you would face with any architecture. At least with mainstream stacks you get the benefit of community support, and relying on approaches that someone else has figured out. Container-based stacks also have the benefit of homogenizing your infrastructure, and giving you a common set of APIs and workflows to interact with.

K8s et al are not a silver bullet, but at this point they're highly stable and understood pieces of infrastructure. It's much more painful to deviate from this and build things from scratch, deluding yourself that your approach can be simpler. For trivial and experimental workloads that may be the case, but for anything that requires a bit more sophistication these tools end up saving you resources in the long run.

sedatk1y ago

> it’s all about the second day

Tangentially, I think this applies to LLMs too.

thetopher1y ago· 10 in thread

“Our basic philosophy when it comes to security is that we can trust our developers and that we can trust the private network within the cluster.”

This is not my area of expertise. Does it add a significant amount of complexity to configure this kind of system in a way that doesn’t require trusting the network? Where are the pain points?

stouset1y ago

> Our basic philosophy when it comes to security is that we can trust our developers and that we can trust the private network within the cluster.

As an infosec guy, I hate to say it but this is IMO very misguided. Insider attacks and external attacks are often indistinguishable because attackers are happy to steal developer credentials or infect their laptops with malware.

Same with trusting the private network. That’s fine and dandy until attackers are in your network, and now they have free rein because you assumed you could keep the bad people outside the walls protecting your soft, squishy insides.

jonstewart1y ago

One of the best things you can do is restrict your VPCs from accessing the internet willy-nilly outbound. When an attacker breaches you, this can keep them from downloading payloads and exfiltrating data.

apitman1y ago

What's your opinion on EDR in general? I find it very distasteful from a privacy perspective, but obviously it could be beneficial at scale. I just wish there was a better middle ground.

bigfatkitten1y ago

It's a mindset that keeps people like you and me employed in well-paying jobs.

callalex1y ago

The top pain point is that it requires setting up SSL certificate infrastructure and having to store and distribute those certs around in a secure way.

The secondary effects are entirely dependent on how your microservices talk to their dependencies. Are they already talking to some local proxy that handles load balancing and service discovery? If so, then you can bolt on ssl termination at that layer. If not, and your microservice is using dns and making http requests directly to other services, it’s a game of whack-a-mole modifying all of your software to talk to a local “sidecar”; or you have to configure every service to start doing the SSL validation which can explode in complexity when you end up dealing with a bunch of different languages and libraries.

None of it is impossible by any means, and many companies/stacks do all of this successfully, but it’s all work that doesn’t add features, can lead to performance degradation, and is a hard sell to get funding/time for because your boss’s boss almost certainly trusts the cloud provider to handle such things at their network layer unless they have very specific security requirements and knowledge.

agf1y ago

Yes, it adds an additional level of complexity to do role-based access control within k8s.

In my experience, that access control becomes necessary for several reasons (mistakes due to inexperience, cowboys, compliance requirements, client security questions, etc.) at around 50-100 developers.

This isn't just "not zero trust", it's access to everything inside the cluster (and maybe the cluster components themselves) or access to nothing -- there is no way to grant partial access to what's running in the cluster.

jandrewrogers1y ago

This is just bad security practice. You cannot trust the internal network; so many companies have been abused following this principle. You have to allow for the possibility that your neighbors are hostile.

umvi1y ago

Implementing a "Zero Trust" architecture is definitely more onerous to deal with for everyone involved (both devs and customers, if on prem). Just Google "zero trust architecture" to find examples. A lot more work (and therefore $) to set up and maintain, but also better security, since breaching the network perimeter is no longer enough to pwn everything inside said network.

zymhan1y ago

It requires encrypting all network traffic, either with something like TLS, or IPSec VPN.
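
As a rough illustration of the TLS route, here is what a server-side context that refuses clients without a valid certificate (mutual TLS) might look like with Python's standard `ssl` module. The function name and the file-path parameters are hypothetical; a real setup would also need the certificate issuance and rotation infrastructure mentioned elsewhere in this thread.

```python
import ssl


def mtls_server_context(cert_file=None, key_file=None, ca_file=None) -> ssl.SSLContext:
    """Server-side TLS context that requires a client certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Reject any peer that does not present a certificate signed by a trusted CA.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if cert_file:
        # This service's own certificate and private key (hypothetical paths).
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if ca_file:
        # Trust only the internal CA that issues service certificates.
        ctx.load_verify_locations(cafile=ca_file)
    return ctx


ctx = mtls_server_context()
print(ctx.verify_mode)
```

Every service then needs its own certificate and a copy of the internal CA, which is where most of the operational cost comes from.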

nilsherzig1y ago

"SSL added and removed here :^)"

subarctic1y ago· 9 in thread

I wish _I_ had a business that was successful enough to justify multiple engineers working 7 months on porting our infrastructure from heroku to kubernetes

bastawhiz1y ago

Knowing the prices and performance of Heroku (as a former customer) the effort probably paid for itself. Heroku is great for getting started but becomes untenably expensive very fast, and it's neither easy nor straightforward to break the vendor lock in when you decide to leave.

danenania1y ago

I find AWS ECS with fargate to be a nice middle ground. You still have to deal with IAM, networking, etc. but once you get that sorted it’s quite easy to auto-scale a container and make it highly available.

I’ve used kubernetes as well in the past and it certainly can do the job, but ECS is my go-to currently for a new project. Kubernetes may be better for more complex scenarios, but for a new project or startup I think having a need for kubernetes vs. something simpler like ECS would tend to indicate questionable architecture choices.

internetter1y ago

From their presentation, they went from $7500/m to $500/m

cpursley1y ago

Moving from Heroku to Render or Fly.io is very straightforward; it’s just containers.

antimemetics1y ago

I mean this is what they recommend:

- Your current cloud / PaaS costs are north of $5,000/month
- You have at least two developers who are into the idea of running Kubernetes and their own infrastructure and are willing to spend some time learning how to do so

So you will spend 150k+/year (2 senior full-stack eng salaries in EU - can be much higher, esp for people up to the task) to save 60k+/y in infra costs?

Does not compute for me - is the lock-in that bad?

I understand it for very small/simple use cases - but then do you need k8s at all?

It feels like the ones who will benefit the most are orgs that spend much more on cloud costs - but they need SLAs, compliance and a dozen other enterprisey things.

So I struggle to understand who would benefit from this stack reclaim.
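
To make the trade-off concrete, here is a back-of-the-envelope sketch using the figures above (150k/year for two engineers, 60k/year saved on infra). The variable names are invented for illustration, and the model assumes the engineers' time can be split cleanly between platform and product work, which is a simplification.

```python
staffing_cost = 150_000  # two senior engineers per year (figure from this comment)
infra_saving = 60_000    # yearly infra saving (figure from this comment)

# If both engineers work on the platform full time, the move loses money.
net_full_time = infra_saving - staffing_cost

# Share of the two engineers' time the platform may consume before
# the infra saving is used up.
break_even_share = infra_saving / staffing_cost

print(f"net per year if full time: {net_full_time:+,}")
print(f"break-even share of engineer time: {break_even_share:.0%}")
```

Under these assumptions the move only pays off if the platform takes less than about 40% of the two engineers' time.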

efilife1y ago

Fyi, we use asterisks (*) for emphasis on HN

willvarfar1y ago

underscores around italics and asterisk around strong/bold was an informal convention on bbs, irc and forums way before atx/markdown.

Kiro1y ago

Different thing. Using visible _ is a conscious choice.

komali21y ago

Who's "we?"

sph1y ago· 9 in thread

"Join the Discord server"? Who's the audience of this project?

mre1y ago

Genuinely curious, what's wrong with that? Did you expect a different platform like Slack?

callalex1y ago

Locking knowledge behind something that isn’t publicly searchable or archivable works fine in the short term, but what happens when Discord/Slack/whatever gears up for an IPO and limits all chat history to 1 week unless you pay up? (Oh, and now you have a bunch of valuable knowledge stored up there with no migration tool, so your only options are “pay up” or lose the knowledge.)

Gormo1y ago

There's a whole FOSS ecosystem of chat/collaboration applications, like Mattermost and Zulip; there's Matrix for a federated solution, and tried-and-true options like IRC.

For something called "Reclaim the Stack" to lock discussion into someone else's proprietary walled garden is quite ironic.

fragmede1y ago

it would be better at the bottom of the first documentation page, after the reader has a better idea of what this is

tacker20001y ago

Also noticed this. Every time I see a project using Discord as its main communication tool it makes me think about the “fitness” of the project in the long run.

Discord is NOT a benefit. It's not publicly searchable and the chat format is just not suitable for a knowledge base or support-based format.

Forums are much better in that regard.

KronisLV1y ago

> Discord is NOT a benefit. It's not publicly searchable and the chat format is just not suitable for a knowledge base or support-based format.

I don't think people who choose Discord necessarily care about that. Discord is where the people are, so that's where they go. It also costs close to nothing to set up a server, and since it has a lower barrier of entry than hosting your own forum, it's deemed good enough.

That said, modern forum software like Discourse https://www.discourse.org/ or Flarum https://flarum.org/ can be pretty good, though I still miss phpBB.

mrits1y ago

People that don't like wasting money?

hobs1y ago

Not capturing the information and being able to use it in the future is a huge opportunity cost, and idling on discord pays no bills.

Gormo1y ago

Wasting money on... better solutions that are also free?

andrewstuart1y ago· 9 in thread

How can a NewsDesk application need kubernetes?

Wouldn't a single machine and a backup machine do the job?

dbackeus1y ago

Because it's a fully featured public relations platform, not just a "newsdesk" (though that's what it started as some 20 years ago).

We have a main monolithic application at the core. But there are plenty of ancillary applications used to run the various parts of our application (e.g. analytics, media monitoring, social media monitoring, journalist databases, media delivery, LLM-based content suggestion etc).

Then we have at least one staging deployment for each app (the monolith has multiple). All permutations of apps and environments reach about 50 applications deployed on the platform, all with their own highly available databases (Postgres, Redis, ElasticSearch and soon ClickHouse).

bluepizza1y ago

Most simple applications that use k8s are doing it for autoscaling or no downtime continuous deployment (or both).

wordofx1y ago

So basically 2 things you don’t need k8s to solve?

briandear1y ago

Is any business running a complete application stack on a single machine?

notpushkin1y ago

A lot of businesses don’t need more than a couple machines (and can get away with one, but it’s not good for redundancy).

mrweasel1y ago

Frequently yes, normally I'd say that the database server is on a separate machine, but otherwise yes.

I've seen companies run a MiniKube installation on a single server and run their applications that way.

liveoneggs1y ago

my vps reboots every 18 months or so..

andrewstuart1y ago

I just looked it up - it's because they run Ruby on Rails.

zymhan1y ago

and so what?

fourseventy1y ago· 8 in thread

In my experience you can get pretty far with just a handful of vms and some bash scripts. At least double digit million ARR. Less is more when it comes to devops tooling imo.

lolinder1y ago

> you can get pretty far with just a handful of vms and some bash scripts. At least double digit million ARR.

Using ARR as the measurement for how far you can scale devops practices is weird to me. Double-digit million ARR might be a few hundred accounts if you're doing B2B, and double-digit million MAUs if you're doing an ad-funded social platform. Depending on how much software is involved your product could be built by a team of anywhere from 1-50 developers.

If you're a one-developer B2B company handling 1-3 requests per second you wouldn't even need more than one VM except maybe as redundancy. But if you're the fifty-developer company that's building something beyond simple CRUD, there are a lot of perks that come with a full-fledged control plane that would almost certainly be worth the added cost and complexity.

davedx1y ago

> there are a lot of perks that come with a full-fledged control plane that would almost certainly be worth the added cost and complexity.

Such as?

Logging is more complicated with multi container microservice deployments. Deploying is more complicated. Debugging and error tracing is more difficult. What are the perks?

Maxion1y ago

I used to work at a Fintech company where we had around 1-20k concurrent active users, monthly around 2 million active users. I forget the RPS, but it was maybe around 200-1000 normally? We ran on bare metal, bash scripts, not a container in sight. It was spaghetti, granted, but it worked surprisingly well.

marcosdumay1y ago

> double-digit million MAUs

I was about to make a similar point, but you did the math, and it's holding up for the GP's side.

You can push vms and direct to ssh synchronization up to double-digit million MAU (unless you are using stuff like persistent web-sockets). It won't be pretty, but you can get that far.

sanderjd1y ago

Of course you can get away with that if your metric is revenue. (I think Blippi makes about that much with, I suspect, nary a VM in sight!)

The question is what you're doing with your infrastructure, not how much revenue you're making. Some things have higher return to "devops" and others have less.

kevin_nisbet1y ago

I agree, this is an incredibly valid approach for some companies and startups. If you benefit by being frugal and are doing something that doesn't need incredible availability, a rack of servers in a colo doesn't cost much and you can take it pretty far without a huge amount of effort.

cloudking1y ago

+1 or just use App Engine, deploy your app and scale

cglace1y ago

App engine deploys are soooo slow. I liked cloud run a lot more.

appplication1y ago· 7 in thread

This sounds great, I’ll be building our prod infra stack and deploying to cloud for the first time here in the next few weeks, so this is timely.

It’s nice seeing some OSS-based tooling around k8s. I know it’s a favorite refrain that “k8s is unnecessary/too complex, you don’t need it” for many folks getting started with their deployments, but I already know and use it in my day job, so it feels like a pretty natural choice.

notpushkin1y ago

I really hated Kubernetes at first because the tooling is so complicated. However, having worked with raw Docker API and looking into the k8s counterparts, I’m starting to appreciate it a lot more.

(But it still needs more accessible tooling! Kompose is a good start though: https://kompose.io/)

dbackeus1y ago

Feel free to join the RtS discord if you want to bounce ideas for your upcoming infra

briandear1y ago

The K8s is unnecessary meme is perpetuated by people that don’t understand it.

actionfromafar1y ago

True, but also, sometimes it’s not needed.

freeopinion1y ago

If they don't understand it but still get their jobs done...

Tractors are also unnecessary. Plenty of people grow tomatoes off their balcony without tractors.

If somebody insists on growing 40 acres of tomatoes without a tractor because tractors aren't necessary, why argue with them? If they try to force you to not use a tractor, that's different.

sph1y ago

k8s is relatively straightforward, it's the ecosystem around it that is total bullcrap, because you won't only run k8s, you will also run Helm, a templating language or an ad-hoc mess of scripts, a CNI, a CI/CD system, operators, sidecars, etc. and every one of these is an over-engineered buggy mess with half a dozen hyped alternatives that are in alpha state with their own set of bugs.

How Kubernetes works is pretty simple, but administering it is living a life of constant analysis paralysis and churn and hype cycles. It is a world built by companies that have something to sell you.

okasaki1y ago

Just had an incident call last week with 20+ engineers on zoom debugging a prod k8s cluster for 5 hours.

deisteve1y ago· 5 in thread

i got excited until i saw this was kubernetes. you most certainly do not need to add that layer of complexity.

If I can serve 3 million users / month on a $40/month VPS with just Coolify, Postgres, Nginx, Django Gunicorn without Redis, RabbitMQ why should I use Kubernetes?

dbackeus1y ago

Coolify does look nice.

But I don't believe it supports HA deployments of Postgres with automated failover / 0 downtime upgrades etc?

Do they even have built in backup support? (a doc exists but appears empty: https://coolify.io/docs/knowledge-base/database-backups)

What makes you feel that Coolify is significantly less complex than Kubernetes?

mrweasel1y ago

> why should I use Kubernetes

You shouldn't, but people have started to view Kubernetes as a deployment tool. Kubernetes makes sense when you start having bare metal workers, or a high number of services (micro-services). You need to have a pretty dynamic workload for Kubernetes to result in any cost saving on the operations side. There might be a cost saving if it's easier to deploy your services, but I don't see that being greater than the cost of maintaining and debugging a broken Kubernetes cluster in most cases.

The majority of uses do not require Kubernetes. The majority of users who think they NEED Kubernetes are wrong. That's not to say that you shouldn't use it if you believe you get some benefit; it's just not your cheapest option.

itsthecourier1y ago

Got a bill down from USD 10k to USD 0.5k a month by moving away from GCP to Kamal on OVH

And 30% less latency

deisteve1y ago

that's 95% in savings!!!! bet you can squeeze more with hetzner

to ppl who disagree,

what business justifies 20x'ing your operating costs?

9.5k USD can get you 3 senior engineers in Canada. 9 in India.

Kiro1y ago

Why do you need Coolify?

b_shulha1y ago· 5 in thread

Who is your target audience? There are so many components in this system that it would require a dev-ops team member just to keep it healthy.

What are the advantages over the (free) managed k8s provided by DigitalOcean?

---

Gosh, I'm so happy I was able to jump off the k8s hype train. This is not something SMBs should be using. Now I happily manage my fleet of services without large infra overhead via my own PaaS over Docker Swarm. :)

dbackeus1y ago

> Who is your target audience?

Anyone looking for a PaaS alternative matching or exceeding the UX of Heroku.

The "is it for you" section of our Introduction may give a better idea: https://reclaim-the-stack.com/docs/kubernetes-platform/intro...

> What are the advantages over the (free) managed k8s provided by DigitalOcean?

You can run the platform on top of any Kubernetes deployment. So you can run it on top of DigitalOcean kubernetes if you wish. But you'll get more bang for the buck using Hetzner dedicated servers.

b_shulha1y ago

I've read the Introduction, but still have no idea why I need to use this platform instead of a managed k8s provided by DO.

It probably makes sense to put a few words on the "components" as well, as it seems to be the main selling point and not the privacy/GDPR concerns.

b_shulha1y ago

Oh, thanks for asking. ;)

It is a fair source (future Apache 2.0 License) PaaS. I provide a cloud option if you want to manage less and get extra features (soon - included backup space, uptime monitoring from multiple locations, etc) and, of course, you are free to self-host it for free and without any limitations by using a single installation script. ;)

https://github.com/ptah-sh/ptah-server

But anyway, I'm really curious to know the answers to the questions I have posted above. Thanks!

KronisLV1y ago

> Gosh, I'm so happy I was able to jump off the k8s hype train. This is not something SMBs should be using. Now I happily manage my fleet of services without large infra overhead via my own PaaS over Docker Swarm. :)

I mean, I also use Docker Swarm and it's pretty good, especially with Portainer.

To me, the logical order of tools goes with scale a bit like this: Docker Compose --> Docker Swarm --> Hashicorp Nomad / Kubernetes

(with maybe Podman variety of tools where needed)

I've yet to see a company that really needs the latter group of options, but maybe that's because I work in a country that's on the smaller side of things.

All that being said, however, both Nomad and some K8s distributions like K3s https://k3s.io/ can be a fairly okay experience nowadays. It's just that it's also easy to end up with more complexity than you need. I wonder if it's going to be the meme about going full circle and me eventually just using shared hosting with PHP or something again, though so far containers feel like the "right" choice for shipping things reasonably quickly, while being in control of how resources are distributed.

b_shulha1y ago

While k3s make k8s easier for sure, it still comes with lots of complexity on board just because it is k8s. :)

Nowadays I prefer simple tooling over "flexible" for my needs.

Enterprises, however, should stick to k8s-like solutions, as there are just too many variables everywhere: starting with security, and ending with the software architecture itself.

strzibny1y ago· 4 in thread

It's good to see new projects. However most people shouldn't start with Kubernetes at all. If you don't need autoscaling, give Kamal[0] a go. It's the tool 37signals made to leave Kubernetes and cloud. Works super well with simple VMs. I also wrote a handbook[1] to get people started.

[0] https://kamal-deploy.org [1] https://kamalmanual.com/handbook/

dbackeus1y ago

(Reclaim the Stack creator here)

We don't do autoscaling.

The main reason for Kubernetes for us was automation of monitoring / logs / alerting and highly available database deployments.

37signals has a dedicated operations team with more than 10 people. We have 0 dedicated operations people. We would not have been able to run our product with Kamal given our four nines uptime target.

(that said, I do like Kamal, especially v2 seems to smooth out some edges, and I'm all for simple single server deployments)
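
For readers wondering what a "four nines" target allows in practice, the yearly downtime budget is easy to work out. This is a generic availability calculation, not something taken from the Reclaim the Stack docs, and the helper name is made up.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def downtime_budget_minutes(nines: int) -> float:
    """Allowed downtime per year for an availability of N nines (4 -> 99.99%)."""
    unavailability = 10 ** -nines
    return MINUTES_PER_YEAR * unavailability


for n in (3, 4, 5):
    print(f"{n} nines: {downtime_budget_minutes(n):.2f} minutes/year")
```

Four nines works out to roughly 52.6 minutes of downtime per year, so a single slow manual failover can consume most of the budget.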

leohonexus1y ago

Bought both your books, they are awesome :)

mplewis1y ago

I’m not going to trust a project like this – made by and for one company – with production workloads.

rcaught1y ago

hahaha, do you even realize what else this company makes?

Summerbud1y ago· 4 in thread

> The results were a 90% reduction in costs and a 30% improvement in performance.

I am in a company with a dedicated infra team, and my CEO is an infra enthusiast. He uses Terraform and k8s to build the company's infra. But the results are:

- Every deployment takes days. In my experience, I have had to work a 24-hour streak to make it work.
- The infra is so complicated that it is quite hard to adjust.

And benefits wise, I can't even think about it. We don't have many users so the claimed scalability is not even there.

I will strongly argue that startups should not touch k8s until they have a fair user base and retention.

It's a nightmare to work with.

raziel2p1y ago

sounds like your CEO just isn't very good at setting up infra.

Summerbud1y ago

Maybe, that is one of the possibilities in my mind too.

cultofmetatron1y ago

DAYS??? our infra takes 10 min usually with up to 45 min if we're doing some postgres maintenance stuff. People in a work context should stick to what they are good at.

tryauuum1y ago

...but why? How many services does the deployment require?

PaulHoule1y ago· 2 in thread

What about “the rest of us” who don’t have time for Kube?

notpushkin1y ago

If you know how to write a docker-compose.yml – Docker Swarm to the rescue! I’m making a nice PaaS-style thing on top of it: https://lunni.dev/

You can also use Kubernetes with compose files (e.g. with Kompose [1]; I plan to add support to Lunni, too).

[1]: https://kompose.io/

tacker20001y ago

I'm using docker compose on every project I have, and it works fine.

Of course, I don't have millions of users, but until then this is enough for me.

fragmede1y ago· 2 in thread

having your tool be a single letter, k, seems rather presumptuous.

rjbwork1y ago

Especially given K is already the name of an APL derivative.

dbackeus1y ago

I suppose it is. But no actual users of the platform have had any complaints about it.

AbuAssar1y ago· 1 in thread

How does this compare to dokku (https://dokku.com/)?

dbackeus1y ago

Main difference is that Dokku is a simple single server platform, geared mostly toward hobby projects.

Reclaim the Stack provides a fully highly available multi node platform to host large scale SaaS applications aiming for four nines of uptime.

evantahler1y ago· 1 in thread

Porter (https://www.porter.run/) is a great product in the same vein (e.g. turn K8s into a dev-friendly Heroku-like PaaS). How does this compare?

mikeortman1y ago

I think the very concept of this is to open source a common stack, instead of relying on a middleman like Porter, which also costs a TON of money at business tier

sciurus1y ago· 1 in thread

> Replicas are used for high availability only, not load balancing

(From https://reclaim-the-stack.com/docs/platform-components/ingre...)

Am I reading this right that they built a k8s-based platform where by default they can't horizontally scale applications?

This seems like a lot of complexity to develop and maintain if they're running applications that don't even need that.

dbackeus1y ago

This documentation only pertains to the Cloudflared ingress servers, which can handle orders of magnitude more traffic than we actually get. So we have not had any need to look into load balancing of this part of the infrastructure. Our actual application servers can of course be horizontally scaled.

That said, there is some kind of balancing across multiple cloudflared replicas. But when we measured the traffic Cloudflare sent ~80% of traffic to just one of the available replicas.

We haven't looked into what the actual algorithm is. It may well be that load starts getting better distributed if we were to start hitting the upper limits of a single replica.

Or it may be by design that the load balancing is crappy to provide incentive for Cloudflare customers to buy their dedicated Load Balancing product (https://developers.cloudflare.com/load-balancing/).

pton_xd1y ago· 1 in thread

"The results were a 90% reduction in costs and a 30% improvement in performance."

What's the scale of this service? How many machines are we talking here?

internetter1y ago

Went from ~$7500/m to $520/m, iirc, from the presentation
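
As a quick sanity check of the headline "90% reduction" against these recalled numbers (which are approximate, from memory of the presentation), the arithmetic looks like this:

```python
before = 7500  # approximate monthly cost before the move, in USD
after = 520    # approximate monthly cost after the move, in USD

reduction_pct = (before - after) / before * 100
cost_ratio = before / after

print(f"reduction: {reduction_pct:.1f}%")
print(f"old cost is {cost_ratio:.1f}x the new cost")
```

That gives a reduction of about 93%, in line with the roughly 90% quoted on the site.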

pwmtr1y ago· 1 in thread

Definitely interesting material. I have noticed, especially in the last few years, an increased interest in moving away from proprietary clouds/PaaS to K8s or even to bare metal, primarily driven by high prices and also by an interest in having more control.

At Ubicloud, we are attacking the same problem, though from a different angle. We are building an open-source alternative to AWS. You can host it yourself or use our managed services (which are 3x-10x more affordable than comparable services). We already built some primitives such as VMs, PostgreSQL, private networking, load balancers and also working on K8s.

I have a question for the HN crowd: which primitives are required to run your workloads? It seems the OP's list consists of Postgres, Redis, Elasticsearch, Secret Manager, Logging/Monitoring, Ingress and Service Mesh. I wonder if this is representative of the typical requirements for running the HN crowd's workloads.

evertheylen1y ago

Quite simple, I want to submit a Docker image, and have it accept HTTP requests at a certain domain, with easy horizontal/vertical scaling. I'm sure your Elastic Compute product is nice but I don't want to set it up myself (let alone run k8s on it). Quite like fly.io.

PS: I like what you guys are doing, I'd subscribe to your mailing list if you had one! :)

thih91y ago· 1 in thread

> fully open source stack*. *) Except for Cloudflare

Are there plans to address that too long term?

dbackeus1y ago

Not from our point of view, since Cloudflare's DDoS protection and CDN are a crucial part of our architecture.

That said, switching out cloudflared for a more traditional ingress like nginx etc. would be straightforward. No parts of the RtS tooling are actually dependent on using Cloudflare for ingress in particular.

notpushkin1y ago

It looks like a nice Kubernetes setup! But I don’t see how this is comparable to something like Heroku – the complexity is way higher from what I see.

If you’re looking for something simpler, try https://dokku.com/ (the OG self-hosted Heroku) or https://lunni.dev/ (which I’ve been working on for a while, with a docker-compose based workflow instead). (I've also heard good things about coolify.io!)

aliasxneo1y ago

Since there are so many mixed comments here, I'll share my experience. Our startup started on day one with Kubernetes. It took me about six weeks to write the respective Terraform and manifests and combine them into a homogenous system. It's been smooth sailing for almost two years now.

I'm starting to suspect the wide range of experiences has to do with engineering decisions. Nowadays, it's almost trivial to over-engineer a Kubernetes setup. In fact, with platform engineering becoming all the rage these days, I can't help but notice how over-engineered most reference architectures are for your average mid-sized company. Of course, that's probably by design (Humanitec sure enjoys the money), but it's all completely optional. I intentionally started with a dead-simple EKS setup: flat VPC with no crazy networking, simple EBS volumes for persistence, an ALB on the edge to cover ingress, and External Secrets to sync from AWS Secrets Manager. No service mesh, no fancy BPF shenanigans, just a cluster so simple that replicating to multiple environments was trivial.

The great part is that because we've had such excellent stability, I've been able to slowly build out a custom platform that abstracts what little complexity there was (mostly around writing manifests). I'm not suggesting Kubernetes is for everyone, but the hate it tends to get on HN still continues to make me scratch my head to this day.

airstrike1y ago

> We spent 7 months building a Kubernetes based platform to replace Heroku for our SaaS product at mynewsdesk.com. The results were a 90% reduction in costs and a 30% improvement in performance.

I don't mean to sound dismissive, but maybe the problem is just that Heroku is/was slow and expensive? Meaning this isn't necessarily the right or quote-unquote "best" approach to reclaiming the stack

mikeortman1y ago

I'm glad we are starting to lean into cloud-agnostic or building back the on-prem/dedicated systems again.

zug_zug1y ago

Seems like a cool premise. Though I guess people building things always want to convince you they are worth-it (sort of a conflict-of-interest), would like to read an unbiased 7-day migration to this.

noop_joe1y ago

Heroku and Reclaim are far from the only two options available. The appropriate choice depends entirely on the team's available expertise and the demands of the applications under development.

There's a lot of disagreement pitting one solution against another. Even if one hosting solution were better than another, the problem is that there are SO MANY solutions that exist on so many axes of tradeoffs that it's impossible to determine an appropriate solution (heroku, reclaim, etc) without considering its application and context of use.

Heroku has all sorts of issues: super expensive, limited functionality, but if it happens to be what a developer team knows and works for their needs, heroku could save them lots of money even considering the high cost.

The same is true for reclaim. _If_ you're familiar with all of the tooling, you could host an application with more functionality for less money than heroku.

chrisweekly1y ago

This looks great! Thank you for sharing, @dustedcodes. I might set up a playground to gain more hands-on experience w/ the relevant significant parts (k8s, argocd, talos) all of which have been on my radar for some time... Also, the docs look great. I love the Architecture Decision Records (bullet-point pros/cons/context)...

Retr0id1y ago

Based on the title alone, I thought this was going to be people up in arms about -fomit-frame-pointer being used by distros

hintymad1y ago

A trajectory question: Is there an acceptable solution to federate k8s clusters, or is there even such a need? One thing that made EC2 really powerful is that a company can create practically as many clusters (ASGs) of as many nodes as needed, while k8s by default has a scale limit of 5000 nodes or so. I guess 5000 nodes will be far from enough for a large company that offers a single compute platform to its employees.

jonstewart1y ago

> Having started with Heroku, we have maintained a similar level of security

Remember 2022? https://www.bleepingcomputer.com/news/security/heroku-admits...

Havoc1y ago

Toying with self hosted k8s at home has taught me that it’s the infra equivalent of happy path coding.

Works grand until it blows up in your face for non obvious reasons

That’s definitely mostly a skill issue on my end but still would make me very wary betting a startup on it

kh_hk1y ago

> We spent 7 months building a Kubernetes based platform to replace Heroku for our SaaS product at mynewsdesk.com.

I thought this was either a joke I was missing, or a rant about Kubernetes. It turned out it was neither, and now I am confused.

thesurlydev1y ago

I was excited about this title until I read it's just another thing on top of Kubernetes. To me, Kubernetes is part of the problem. Can we reduce the complexity that Kubernetes brings and still have nice things?

est1y ago

> We spent 7 months building a Kubernetes based platform to replace Heroku for our SaaS product

And heroku is based on LXC containers. I'd say it's almost the same thing.

mvkel1y ago

> 90% reduction in costs

Curious what accounts are being attributed to said costs.

Many new maintenance-related lines will be added, with only one (subscription) removed.

seungwoolee5181y ago

Most of the software should work Out-Of-The-Box, but the real problem is coming from hardware.

GaryNumanVevo1y ago

Potential irony, this site isn't loading for me


jusomg1y ago· 24 in thread

Of course you reduced 90% of the cost. Most of these costs don't come from the software, but from the people and automation maintaining it.

mlinhares1y ago

Anything you don't know about managing these systems can be learned asking chatgpt :P

re-thc1y ago

> and that time would have been better invested in the product instead of wasting time figuring out how these tools work

Is it really time wasted? People often go into resume building mode and do all kinds of wacky things regardless. Perhaps this just helps scratch that itch.


ljm1y ago

All by design, really, because at that point you're not part of an engineering team you're a code monkey operating in service of growth metrics.

Diederich1y ago

> ... I remember I did the same when I was in 10 people startups and it required A LOT of work to keep all these things running...

Honest question: how long ago was that? I stepped away from that ecosystem four or so years ago. Perhaps ease of use has substantially improved?

ugh1231y ago

> you also removed monitoring of the platform

You don't think they have any monitoring within Kubernetes?

I imagine they have more monitoring capabilities now than they did with Heroku.

almost1y ago

dzikimarian1y ago

I assume you reference my comment.

The reason I think parent comment is FUD isn't because I don't acknowledge tradeoffs (they are very real).

It's because parent comment implies that people behind "reclaim the stack" didn't account for the monitoring, people's cost etc.

Obviously any reasonable person making that decision includes it into calculation. Obviously nobody sane throws entire monitoring out of the window for savings.

Accounting for all of these it can be still viable and significantly cheaper to run own infra. Especially if you operate outside of the US and you're able to eat an initial investment.


kmacdough1y ago

Perhaps this stack will open that opportunity to less equipped startups, but I've found few open source "drop-in replacements" to be truly drop-in. And I've never found k3 to be dead simple.

dzikimarian1y ago

Also AWS is also, complex, also requires configuration and also generates alerts in the middle of the night.

It's still a lot cheaper than managed service.

jusomg1y ago

> Of course you have more maintenance on on-prem, but typical k8s update is maybe a few hours of work, when you know what you are doing.

You just mentioned one dimension of what I described, and "when you know what you are doing" is doing a lot of the heavy lifting in your argument.

> Also AWS is also, complex, also requires configuration and also generates alerts in the middle of the night.

I'm confused. So we are in agreement there?

I feel you might be confusing my point with an on-prem vs AWS discussion, and that's not it.


filleokus1y ago

You can also trade operational complexity for cash via support contracts and/or enterprise solutions (like just throwing money at Hitachi for storage rather than trying to keep Ceph alive).


tinco1y ago


dbackeus1y ago

Original creator and maintainer of Reclaim the Stack here.

> you also removed monitoring of the platform

No we did not: Monitoring: https://reclaim-the-stack.com/docs/platform-components/monit...

Log aggregation: https://reclaim-the-stack.com/docs/platform-components/log-a...

> Who/What is going to be doing that on this new platform and how much does that cost?

> Now you need to maintain k8s, postgresql, elasticsearch, redis, secret managements, OSs, storage... These are complex systems that require people understanding how they internally work

Thanks to Talos Linux (https://www.talos.dev/), maintaining K8s has been a non issue.

Running databases via operators has been a non issue, apart from Redis.

Secret management via sealed secrets + CLI tooling has been a non issue (https://reclaim-the-stack.com/docs/platform-components/secre...)

> Who is going to upgrade kubernetes when they release a new version that has breaking changes?

> What happens when ElasticSearch decides to splitbrain and your search stops working?

It's not like using ElasticCloud protects against these things in any meaningfully different way. However, the feedback loop of contacting support would be slower.

> When the DB goes down or you need to set up replication?

Operators handle failovers. If we were to lose all replicas in a major disaster event, we would have to recover from off-site backups. The same rules would apply for managed databases.

> What is monitoring replication lag?

For Postgres, which is our only critical data source, replication lag monitoring + alerting is built into the operator.

It should be straightforward to add this for Redis and ElasticSearch as well.

> Or even simply things like disks being close to full?

Disk space monitoring and alerting is built into our monitoring stack.
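For readers wondering what a "disk close to full" check amounts to, here is a minimal sketch. It is not the RtS monitoring stack (which is described in the linked docs); the function names and the 85% threshold are assumptions made up for illustration.

```python
import shutil


def disk_usage_percent(path: str = "/") -> float:
    """Return how full the filesystem containing `path` is, in percent."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100.0


def should_alert(used_percent: float, threshold_percent: float = 85.0) -> bool:
    """Decide whether a 'disk almost full' alert should fire.

    The default threshold is an arbitrary example value.
    """
    return used_percent >= threshold_percent


if __name__ == "__main__":
    pct = disk_usage_percent("/")
    status = "ALERT" if should_alert(pct) else "ok"
    print(f"/ is {pct:.1f}% full: {status}")
```

A real setup would export this as a metric and let the alerting system apply the threshold, rather than printing from a script.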

That said, we're not trying to make the point that EVERYONE should Reclaim the Stack. We documented our thoughts about it here: https://reclaim-the-stack.com/docs/kubernetes-platform/intro...

troupo1y ago

Since you're the original creator, can you open the site of your product, and find the link to your project that you open sourced?

- Front page links to docs and Discord.

- First page of docs only has a link to discord.

- Installation references a "get started" repo that is... somehow also the main repo, not just "get started"?


swat5351y ago

This also assumes your infra doesn't grow and requires more maintenance or you have to deal with other issues.

Focusing on building features and generating revenue is much more valuable than wasting precious engineering time maintaining stacks.

This is hardly a "win" in my book.


ozgune1y ago

Hey there, this is a comprehensive and informative reply!

I had two questions just to learn more.

* What has been your experience with using local NVMes with K8s? It feels like K8s has some assumptions around volume persistence, so I'm curious if these impacted you at all in production.

* How does 'Reclaim the Stack' compare to Kamal? Was migrating off of Heroku your primary motivation for building 'Reclaim the Stack'?


cryptonector1y ago

Who says they reduced costs by cutting staff? They could instead have scaled their staff better.

johnnyanmac1y ago

>Who/What is going to be doing that on this new platform and how much does that cost?

And yeah, it isn't apples to apples. Maybe you are in a low CoL area and can find a decent DevOps for 80-100k. Maybe you're in SF and any extra dev will be 250k. It'll vary immensely on cost.

Nextgrid1y ago

The_Colonel1y ago


HolyLampshade1y ago

Speaking of the exchanges (at least the sanely operated ones), there’s a reason the stack is simplified compared to most of what is being described here.

You generally want things to be as simple and streamlined as possible so when something does pop (and it will) your mean time to resolution is inside of a minute.

almost1y ago

Not for Heroku, they're absolute garbage these days, but definitely for a better run PaaS.


gspencley1y ago

Good engineering isn't just about finding the simplest or cheapest solution. It is about understanding the business requirements and finding the right solution for the business.


matus_congrady1y ago

Since DHH has been promoting the 'do-it-yourself' approach, many people have fallen for it.

You're asking the right questions that only a few people know they need answers to.

Example services include https://stacktape.com, https://flightcontrol.dev, and https://www.withcoherence.com.

I'm also working on a PaaS comparison site at https://paascout.io.

Disclosure: I am a founder of Stacktape.

rglover1y ago· 23 in thread

I made the mistake of falling for the k8s hype a few years back for running all of my indie hacker businesses.

Big mistake. Overnight, the cluster config files I used were no longer supported by the k8s version DigitalOcean auto upgraded my cluster to and _boom_. Every single business was offline.

Made the switch to some simple bash scripts for bootstrapping/monitoring/scaling and systemd for starting/restarting apps (nodejs). I'll never look back.

cedws1y ago

Weird how defensive people get about K8S when you say stuff like this. It’s like they’re desperately trying to convince you that you really do need all that complexity.

rollcat1y ago

My occasional moral dilemma is idle power usage of overprovisioned resources, but we've found some interesting things to throw at idle hardware to ease our conscience about it.


alex_lav1y ago

I think it's two types of defensiveness.

1. Shovel salesman insisting all "real" gold miners use their shovels

2. Those that have already acquired shovels not wanting their purchase to be mocked/have been made in vain.

Neither are grounded in reality. Why people believe their tiny applications require the same tech that Google invented to help manage their (massive) scale is beyond me.

0perator1y ago

Most do not, but they still want all the toys that developers are building for “the cloud”.

poincaredisk1y ago

I've used k8s for the last, uhh, 5 years and this never happened to me. In my case, because I self-host my cluster, there are no unexpected upgrades. But I agree that maintaining a k8s cluster takes some work.

theptip1y ago

In the 2015-2019 period there were quite a few API improvements involving deprecating old APIs, it’s much more stable/boring now. (Eg TPR -> CRD was the big one for many cluster plugins)

eddd-ddde1y ago

So either digital ocean auto updates breaking versions. Or k8s doesn't do versioning correctly. Both very bad.

Which was it?

rglover1y ago

Technically both, but more so the former.

As a solo founder, I found what worked at the time and assumed (foolishly, in retrospect) that it would just continue to work as my needs were modest.

poincaredisk1y ago


nine_k1y ago

How does it compare to a simpler but not hand-crafted solution, such as dokku?

rglover1y ago

No Docker for starters. I played with Dokku a long time ago and remember it being decent at that time, but still too confusing for my skillset.

IMO, Docker is overkill for 99% of projects, perhaps all. One of those great ideas, poorly executed (and considering the complexity, I understand why).

_xiaz1y ago

> simple bash scripts for bootstrapping/monitoring/scaling

Damn, that's the dream right there

minkles1y ago

The first live k8s cluster upgrade anyone has to do is usually when they think "what the fuck did I get myself in to?"

It's only good for very large scale stuff. And then a lot of the time that is usually well over provisioned and could be done considerably cheaper using almost any other methodology.

mythz1y ago

[1] https://servicestack.net/posts/kubernetes_not_required

oldprogrammer21y ago

A pair of load-balanced web servers and a managed database, with Cloudflare out front, will get you really, really far.

akvadrako1y ago

EKS has a tab in the dashboard that warns about all the deprecated configs in your cluster, making it pretty foolproof to avoid this by checking every couple years.

hhh1y ago

Yes, and there are many open source tools that you can point at clusters to do the same. We use Kubent (Kube No Troubles) to do the same.
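To give a feel for what such a check does, here is a small sketch that scans manifest text for a few API versions that have been removed from Kubernetes. This is not how Kubent or the EKS dashboard is implemented; the lookup table is deliberately tiny, the function name is made up, and the regex-based parsing is a simplification (a real tool would use a YAML parser and query the live cluster).

```python
import re

# (apiVersion, kind) -> Kubernetes release in which that API stopped being
# served. Only a handful of well-known removals; deliberately incomplete.
REMOVED_APIS = {
    ("extensions/v1beta1", "Deployment"): "1.16",
    ("extensions/v1beta1", "Ingress"): "1.22",
    ("networking.k8s.io/v1beta1", "Ingress"): "1.22",
    ("batch/v1beta1", "CronJob"): "1.25",
}

# Match only top-level (unindented, unquoted) keys.
_API_VERSION = re.compile(r"^apiVersion:\s*(\S+)", re.MULTILINE)
_KIND = re.compile(r"^kind:\s*(\S+)", re.MULTILINE)


def find_removed_apis(manifest_text: str) -> list[tuple[str, str, str]]:
    """Return (apiVersion, kind, removed_in) for each flagged YAML document."""
    findings = []
    # Multi-document YAML files separate documents with a line of '---'.
    for doc in re.split(r"^---\s*$", manifest_text, flags=re.MULTILINE):
        api = _API_VERSION.search(doc)
        kind = _KIND.search(doc)
        if not api or not kind:
            continue
        key = (api.group(1), kind.group(1))
        if key in REMOVED_APIS:
            findings.append((key[0], key[1], REMOVED_APIS[key]))
    return findings


example = """\
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: nightly
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
"""

for api, kind, removed_in in find_removed_apis(example):
    print(f"{kind} uses {api}, which was removed in Kubernetes {removed_in}")
```

Running a check like this before an upgrade (or before a provider auto-upgrades for you) is the cheap way to avoid the "cluster config no longer supported" surprise described upthread.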

w0m1y ago

willvarfar1y ago

tucnak1y ago

So what is the alternative? Nomad?

llama0521y ago

So you had auto update enabled on your cluster and didn’t keep your apiversions up to date?

Sounds like user error.

rvense1y ago

One of my main criteria for evaluating a platform would be how easy it is to make user errors.

psini1y ago

ksajadi1y ago· 16 in thread

I’ve been building and deploying thousands of stacks on first Docker, then Mesos, then Swarm and now k8s. If I have learned one thing from it, it’s this: it’s all about the second day.

If you have that kind of team, budget or problem that deserves those, then more power to you.

AnAnonyCowherd1y ago

> If you have that kind of team, budget or problem that deserves those, then more power to you.

tempodox1y ago

> we have created a whole new department to figure out what we can do with AI.

Wow, this is literally the solution in search of a problem.

wg01y ago

This is absolutely true. I can count easily some 20+ components already.

So this is no walk in the park for two developers willing to learn k8s.

The underlying apps (Redis, ES) will have version upgrades.

Their respective operators themselves would have version upgrades.

Essential networking fabric (calico, flannel and such) would have upgrades.

The underlying kubernetes itself would have version upgrades.

The Talos Linux itself might need upgrades.

imiric1y ago

> The underlying apps (Redis, ES) will have version upgrades.

You would have to deal with those with or without k8s. I would argue that without it is much more painful.

How is this different from regular system upgrades you would have to do without k8s?

K8s does add layers on top that you also have to manage, but it solves a bunch of problems in return that you would have to solve by yourself one way or another.

> Of all the above, any single upgrade might lead to infamous controller crash loop where pod starts and dies with little to no indication as to why?

Failures and bugs are inevitable. Have you ever had to deal with a Linux kernel bug?


sgarland1y ago

Talos is an immutable OS; upgrades are painless and roll themselves back upon failure. Same thing for K8s under Talos (the only thing Talos does is run K8s).


benjaminwootton1y ago

The flip side of this is the cost. Managed cloud services make it faster to get live, but then you are left paying managed service providers for years.

I’ve always been a big cloud/managed service guy, but the costs are getting astronomical and I agree the buy vs build of the stack needs a re-evaluation.

Maxion1y ago

This is the balance, right? For the vast majority of web apps et al., the cloud costs are going to be cheaper than having full-time Ops people managing an OSS stack on VPS / Bare Metal.

szundi1y ago

And what is your take on all those things that you tried? Some experience/examples would benefit us probably.

bsenftner1y ago

0perator1y ago

oldprogrammer21y ago

Maxion1y ago

And you may still end up with longer downtime if SHTF than if you use a managed provider.

tomwojcik1y ago

psini1y ago

Look at helm charts, they have become the de facto standard for packaging/distributing/deploying/updating whole apps on Kubernetes


imiric1y ago

sedatk1y ago

> it’s all about the second day

Tangentially, I think this applies to LLMs too.

thetopher1y ago· 10 in thread

“Our basic philosophy when it comes to security is that we can trust our developers and that we can trust the private network within the cluster.”

This is not my area of expertise. Does it add a significant amount of complexity to configure this kind of system in a way that doesn’t require trusting the network? Where are the pain points?

stouset1y ago

> Our basic philosophy when it comes to security is that we can trust our developers and that we can trust the private network within the cluster.

jonstewart1y ago


apitman1y ago

What's your opinion on EDR in general? I find it very distasteful from a privacy perspective, but obviously it could be beneficial at scale. I just wish there was a better middle ground.


bigfatkitten1y ago

It's a mindset that keeps people like you and me employed in well-paying jobs.

callalex1y ago

The top pain point is that it requires setting up SSL certificate infrastructure and having to store and distribute those certs around in a secure way.

agf1y ago

Yes, it adds an additional level of complexity to do role-based access control within k8s.

In my experience, that access control is necessary for several reasons (mistakes due to inexperience, cowboys, compliance requirements, client security questions, etc.) around 50-100 developers.

jandrewrogers1y ago

umvi1y ago

zymhan1y ago

It requires encrypting all network traffic, either with something like TLS, or IPSec VPN.
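As a rough illustration of the TLS option, the sketch below builds a server-side TLS context that also demands a client certificate (mutual TLS), using Python's standard `ssl` module. The function name and the idea of passing file paths are assumptions for the example; in a cluster this is usually handled by a service mesh or sidecar rather than in application code.

```python
import ssl
from typing import Optional


def make_mtls_server_context(
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
    ca_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Build a server-side context that rejects clients without a valid cert.

    All paths are optional so the sketch can be constructed without any
    certificate files on disk; a real server needs all three.
    """
    # Purpose.CLIENT_AUTH yields a context meant for the server side.
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Require every connecting peer to present a certificate we can verify.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if cert_file and key_file:
        # The server's own identity.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if ca_file:
        # The internal CA that signs client (peer service) certificates.
        ctx.load_verify_locations(cafile=ca_file)
    return ctx


ctx = make_mtls_server_context()
print(ctx.verify_mode)
```

The code itself is short; the operational cost is in issuing, rotating and distributing the certificates, which is the pain point raised elsewhere in this thread.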

nilsherzig1y ago

"SSL added and removed here :^)"

subarctic1y ago· 9 in thread

I wish _I_ had a business that was successful enough to justify multiple engineers working 7 months on porting our infrastructure from heroku to kubernetes

bastawhiz1y ago

danenania1y ago


internetter1y ago

From their presentation, they went from $7500/m to $500/m


cpursley1y ago

Moving from Heroku to Render or Fly.io is very straight forward; it’s just containers.


antimemetics1y ago

I mean this is what they recommend:

So you will spend 150k+/year (2 senior full-stack eng salaries in EU - can be much higher, esp for people up to the task) to save 60k+/y in infra costs?

Does not compute for me - is the lock-in that bad?

I understand it for very small/simple use cases - but then do you need k8s at all?

It feels like the ones who will benefit the most are orgs who spend much more on cloud costs - but they need SLAs, compliance and a dozen other enterprisy things.

So I struggle to understand who would benefit from this stack reclaim.
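The back-of-the-envelope math behind this objection can be sketched as follows. The 150k and 60k figures are the rough lower bounds quoted above; the break-even calculation assumes, purely for illustration, that staffing cost scales linearly with the share of time spent on the platform.

```python
def net_annual_effect(infra_savings: float, extra_staff_cost: float) -> float:
    """Positive means the migration saves money overall, negative means it costs."""
    return infra_savings - extra_staff_cost


def breakeven_time_share(infra_savings: float, staff_cost: float) -> float:
    """Share of the engineers' time the platform may consume before savings vanish.

    Assumes cost scales linearly with time spent (an illustrative simplification).
    """
    return infra_savings / staff_cost


infra_savings = 60_000   # per year, figure quoted above
staff_cost = 150_000     # per year for two senior engineers, figure quoted above

print(net_annual_effect(infra_savings, staff_cost))                 # -90000
print(f"{breakeven_time_share(infra_savings, staff_cost):.0%}")     # 40%
```

Under these numbers the move only pays off if the two engineers spend less than about 40% of their time on the platform, which is the crux of the disagreement in this thread.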


efilife1y ago

Fyi, we use asterisks (*) for emphasis on HN

willvarfar1y ago

underscores around italics and asterisk around strong/bold was an informal convention on bbs, irc and forums way before atx/markdown.


Kiro1y ago

Different thing. Using visible _ is a conscious choice.


komali21y ago

Who's "we?"


sph1y ago· 9 in thread

"Join the Discord server"? Who's the audience of this project?

mre1y ago

Genuinely curious, what's wrong with that? Did you expect a different platform like Slack?

callalex1y ago


Gormo1y ago

There's a whole FOSS ecosystem of chat/collaboration applications, like Mattermost and Zulip; there's Matrix for a federated solution, and tried-and-true options like IRC.

For something called "Reclaim the Stack" to lock discussion into someone else's proprietary walled garden is quite ironic.

fragmede1y ago

it would be better at the bottom of the first documentation page, after the reader has a better idea of what this is

tacker20001y ago

Also noticed this. Every time I see a project using Discord as its main communication tool, it makes me think about the “fitness” of the project in the long run.

Discord is NOT a benefit. It's not publicly searchable and the chat format is just not suitable to a knowledge base or support based format.

Forums are much better in that regard.

KronisLV1y ago

> Discord is NOT a benefit. It's not publicly searchable and the chat format is just not suitable to a knowledge base or support based format.

That said, modern forum software like Discourse https://www.discourse.org/ or Flarum https://flarum.org/ can be pretty good, though I still miss phpBB.


mrits1y ago

People that don't like wasting money?

hobs1y ago

Not capturing the information and being able to use it in the future is a huge opportunity cost, and idling on discord pays no bills.

Gormo1y ago

Wasting money on... better solutions that are also free?


andrewstuart1y ago· 9 in thread

How can a NewsDesk application need kubernetes?

Wouldn't a single machine and a backup machine do the job?

dbackeus1y ago

Because it's a fully featured public relations platform, not just a "newsdesk" (though that's what it started as some 20 years ago).

bluepizza1y ago

Most simple applications that use k8s are doing it for autoscaling or no downtime continuous deployment (or both).

wordofx1y ago

So basically 2 things you don’t need k8s to solve?


briandear1y ago

Are businesses running a complete application stack on a single machine?

notpushkin1y ago

A lot of businesses don’t need more than a couple machines (and can get away with one, but it’s not good for redundancy).

mrweasel1y ago

Frequently yes, normally I'd say that the database server is on a separate machine, but otherwise yes.

I've seen companies run a MiniKube installation on a single server and run their applications that way.

liveoneggs1y ago

my vps reboots every 18 months or so..

andrewstuart1y ago

I just looked it up - it's because they run Ruby on Rails.

zymhan1y ago

and so what?


fourseventy1y ago· 8 in thread

In my experience you can get pretty far with just a handful of vms and some bash scripts. At least double digit million ARR. Less is more when it comes to devops tooling imo.

lolinder1y ago

> you can get pretty far with just a handful of vms and some bash scripts. At least double digit million ARR.

davedx1y ago

> there are a lot of perks that come with a full-fledged control plane that would almost certainly be worth the added cost and complexity.

Such as?

Logging is more complicated with multi container microservice deployments. Deploying is more complicated. Debugging and error tracing is more difficult. What are the perks?


Maxion1y ago

marcosdumay1y ago

> double-digit million MAUs

I was about to make a similar point, but you did the math, and it's holding up for the GP's side.

You can push vms and direct to ssh synchronization up to double-digit million MAU (unless you are using stuff like persistent web-sockets). It won't be pretty, but you can get that far.


sanderjd1y ago

Of course you can get away with that if your metric is revenue. (I think Blippi makes about that much with, I suspect, nary a VM in sight!)

The question is what you're doing with your infrastructure, not how much revenue you're making. Some things have higher return to "devops" and others have less.

kevin_nisbet1y ago

cloudking1y ago

+1 or just use App Engine, deploy your app and scale

cglace1y ago

App engine deploys are soooo slow. I liked cloud run a lot more.

appplication1y ago· 7 in thread

This sounds great, I’ll be building our prod infra stack and deploying to cloud for the first time here in the next few weeks, so this is timely.

notpushkin1y ago

I really hated Kubernetes at first because the tooling is so complicated. However, having worked with raw Docker API and looking into the k8s counterparts, I’m starting to appreciate it a lot more.

(But it still needs more accessible tooling! Kompose is a good start though: https://kompose.io/)

dbackeus1y ago

Feel free to join the RtS discord if you want to bounce ideas for your upcoming infra

briandear1y ago

The "K8s is unnecessary" meme is perpetuated by people that don’t understand it.

actionfromafar1y ago

True, but also, sometimes it’s not needed.


freeopinion1y ago

If they don't understand it but still get their jobs done...

Tractors are also unnecessary. Plenty of people grow tomatoes off their balcony without tractors.

If somebody insists on growing 40 acres of tomatoes without a tractor because tractors aren't necessary, why argue with them? If they try to force you to not use a tractor, that's different.

sph1y ago

okasaki1y ago

Just had an incident call last week with 20+ engineers on zoom debugging a prod k8s cluster for 5 hours.

deisteve1y ago· 5 in thread

i got excited until i saw this was kubernetes. you most certainly do not need to add that layer of complexity.

If I can serve 3 million users / month on a $40/month VPS with just Coolify, Postgres, Nginx, Django Gunicorn without Redis, RabbitMQ why should I use Kubernetes?

dbackeus1y ago

Coolify does look nice.

But I don't believe it supports HA deployments of Postgres with automated failover / 0 downtime upgrades etc?

Do they even have built in backup support? (a doc exists but appears empty: https://coolify.io/docs/knowledge-base/database-backups)

What makes you feel that Coolify is significantly less complex than Kubernetes?

mrweasel1y ago

> why should I use Kubernetes

itsthecourier1y ago

Took our bill from USD 10k to USD 0.5k a month by moving away from GCP to Kamal on OVH

And 30% less latency

deisteve1y ago

that's 95% in savings!!!! bet you can squeeze more with hetzner

to ppl who disagree,

what business justifies 18x'ing your operating costs?

9.5k USD can get you 3 senior engineers in Canada. 9 in India.


Kiro1y ago

Why do you need Coolify?

b_shulha1y ago· 5 in thread

Who are your target audience? There are so many components in this system, so it would require a dev-ops team member just to keep it healthy.

What are the advantages over the (free) managed k8s provided by DigitalOcean?

---

dbackeus1y ago

> Who are your target audience?

Anyone looking for a PaaS alternative matching or exceeding the UX of Heroku.

The "is it for you" section of our Introduction may give a better idea: https://reclaim-the-stack.com/docs/kubernetes-platform/intro...

> What are the advantages over the (free) managed k8s provided by DigitalOcean?

You can run the platform on top of any Kubernetes deployment. So you can run it on top of DigitalOcean kubernetes if you wish. But you'll get more bang for the buck using Hetzner dedicated servers.

b_shulha1y ago

I've read the Introduction, but still have no idea why I need to use this platform instead of a managed k8s provided by DO.

It probably makes sense to put a few words on the "components" as well, as it seems to be the main selling point and not the privacy/GDPR concerns.

b_shulha1y ago

Oh, thanks for asking. ;)

https://github.com/ptah-sh/ptah-server

But anyway, I'm really curious to know the answers to the questions I have posted above. Thanks!

KronisLV1y ago

I mean, I also use Docker Swarm and it's pretty good, especially with Portainer.

To me, the logical order of tools goes with scale a bit like this: Docker Compose --> Docker Swarm --> Hashicorp Nomad / Kubernetes

(with maybe Podman variety of tools where needed)

I've yet to see a company that really needs the latter group of options, but maybe that's because I work in a country that's on the smaller side of things.

b_shulha1y ago

While k3s makes k8s easier for sure, it still comes with lots of complexity on board just because it is k8s. :)

Nowadays I prefer simple tooling over "flexible" for my needs.

Enterprises, however, should stick to k8s-like solutions, as there are just too many variables everywhere: starting from security, and ending with the software architecture itself.

strzibny1y ago· 4 in thread

[0] https://kamal-deploy.org [1] https://kamalmanual.com/handbook/

dbackeus1y ago

(Reclaim the Stack creator here)

We don't do autoscaling.

The main reason for Kubernetes for us was automation of monitoring / logs / alerting and highly available database deployments.

(that said, I do like Kamal, especially v2 seems to smooth out some edges, and I'm all for simple single server deployments)

leohonexus1y ago

Bought both your books, they are awesome :)

mplewis1y ago

I’m not going to trust a project like this – made by and for one company – with production workloads.

rcaught1y ago

hahaha, do you even realize what else this company makes?

Summerbud1y ago· 4 in thread

> The results were a 90% reduction in costs and a 30% improvement in performance.

I am in a company with a dedicated infra team, and my CEO is an infra enthusiast. He uses Terraform and k8s to build the company's infra. But the results are:

- Every deployment takes days; in my experience, I need to work a 24-hour streak to make it work.
- The infra is complicated to a level that is quite hard to adjust.

And benefits-wise, I can't even think of any. We don't have many users, so the claimed scalability is not even there.

I would strongly argue that startups should not touch k8s until they have a fair user base and retention.

It's a nightmare to work with.

raziel2p1y ago

sounds like your CEO just isn't very good at setting up infra.

Summerbud1y ago

Maybe, that is one of the possibilities in my mind too.

cultofmetatron1y ago

DAYS??? our infra takes 10 min usually with up to 45 min if we're doing some postgres maintenance stuff. People in a work context should stick to what they are good at.

tryauuum1y ago

...but why? How many services does the deployment require?

PaulHoule1y ago· 2 in thread

What about “the rest of us” who don’t have time for Kube?

notpushkin1y ago

If you know how to write a docker-compose.yml – Docker Swarm to the rescue! I’m making a nice PaaS-style thing on top of it: https://lunni.dev/

You can also use Kubernetes with compose files (e.g. with Kompose [1]; I plan to add support to Lunni, too).

[1]: https://kompose.io/

tacker20001y ago

I'm using Docker Compose on every project I have, and it works fine.

Of course, I don't have millions of users, but until then this is enough for me.

fragmede1y ago· 2 in thread

having your tool be a single letter, k, seems rather presumptuous.

rjbwork1y ago

Especially given K is already the name of an APL derivative.

dbackeus1y ago

I suppose it is. But no actual users of the platform have had any complaints about it.

AbuAssar1y ago· 1 in thread

How does this compare to dokku (https://dokku.com/)?

dbackeus1y ago

Main difference is that Dokku is a simple single server platform, geared mostly toward hobby projects.

Reclaim the Stack provides a fully highly available multi node platform to host large scale SaaS applications aiming for four nines of uptime.

evantahler1y ago· 1 in thread

Porter (https://www.porter.run/) is a great product in the same vein (e.g. turn K8s into a dev-friendly Heroku-like PaaS). How does this compare?

mikeortman1y ago

I think the very concept of this is to open source a common stack, instead of relying on a middleman like Porter, which also costs a TON of money at business tier

sciurus1y ago· 1 in thread

> Replicas are used for high availability only, not load balancing

(From https://reclaim-the-stack.com/docs/platform-components/ingre...)

Am I reading this right that they built a k8s-based platform where by default they can't horizontally scale applications?

This seems like a lot of complexity to develop and maintain if they're running applications that don't even need that.

dbackeus1y ago

That said, there is some kind of balancing across multiple cloudflared replicas. But when we measured the traffic Cloudflare sent ~80% of traffic to just one of the available replicas.

We haven't looked into what the actual algorithm is. It may well be that load starts getting better distributed if we were to start hitting the upper limits of a single replica.

pton_xd1y ago· 1 in thread

"The results were a 90% reduction in costs and a 30% improvement in performance."

What's the scale of this service? How many machines are we talking here?

internetter1y ago

Went from ~$7500 to $520/month, iirc from the presentation

pwmtr1y ago· 1 in thread

evertheylen1y ago

PS: I like what you guys are doing, I'd subscribe to your mailing list if you had one! :)

thih91y ago· 1 in thread

> fully open source stack*. *) Except for Cloudflare

Are there plans to address that too long term?

dbackeus1y ago

Not from our point of view, since Cloudflare's DDoS protection and CDN are a crucial part of our architecture.

notpushkin1y ago

It looks like a nice Kubernetes setup! But I don’t see how this is comparable to something like Heroku – the complexity is way higher from what I see.

aliasxneo1y ago

airstrike1y ago

> We spent 7 months building a Kubernetes based platform to replace Heroku for our SaaS product at mynewsdesk.com. The results were a 90% reduction in costs and a 30% improvement in performance.

mikeortman1y ago

I'm glad we are starting to lean into cloud-agnostic or building back the on-prem/dedicated systems again.

zug_zug1y ago

noop_joe1y ago

Heroku and Reclaim are far from the only two options available. The appropriate choice depends entirely on the team's available expertise and the demands of the applications under development.

The same is true for Reclaim. _If_ you're familiar with all of the tooling, you could host an application with more functionality for less money than Heroku.

chrisweekly1y ago

Retr0id1y ago

Based on the title alone, I thought this was going to be people up in arms about -fomit-frame-pointer being used by distros

hintymad1y ago

jonstewart1y ago

> Having started with Heroku, we have maintained a similar level of security

Remember 2022? https://www.bleepingcomputer.com/news/security/heroku-admits...

Havoc1y ago

Toying with self-hosted k8s at home has taught me that it’s the infra equivalent of happy path coding.

Works grand until it blows up in your face for non obvious reasons

That’s definitely mostly a skill issue on my end but still would make me very wary betting a startup on it

kh_hk1y ago

> We spent 7 months building a Kubernetes based platform to replace Heroku for our SaaS product at mynewsdesk.com.

I thought this was either a joke I was missing, or a rant about Kubernetes. It turned out it was neither, and now I am confused.

thesurlydev1y ago

est1y ago

> We spent 7 months building a Kubernetes based platform to replace Heroku for our SaaS product

And Heroku is based on LXC containers. I'd say it's almost the same thing.

mvkel1y ago

> 90% reduction in costs

Curious what accounts are being attributed to said costs.

Many new maintenance-related lines will be added, with only one (subscription) removed.

seungwoolee5181y ago

Most of the software should work Out-Of-The-Box, but the real problem is coming from hardware.

GaryNumanVevo1y ago

Potential irony, this site isn't loading for me
