Why we moved from AWS RDS to Postgres in Kubernetes

(nhost.io)

205 points · elitan · 3y ago · 145 comments

83 comments · 19 top-level

radimm3y ago· 19 in thread

Having recently heard a lot about PostgreSQL in Kubernetes (CloudNativePG for example), it always makes me wonder about the actual load and the complexity of the cluster in question.

> This is the reason why we were able to easily cope with 2M+ requests in less than 24h when Midnight Society launched

This gives the answer: while it's probably not evenly distributed, that works out to 23 req/sec (guess peak 60 - 100 might be already stretching it). I always wonder about use cases around 3 - 5k req/sec as a minimum.

[edit] PS: I'm not really dismissing k8s PG, AWS RDS, or similar solutions. Just being curious.
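
For anyone who wants to check the arithmetic behind the 23 req/sec figure, here is a minimal sketch. It assumes the 2M requests are spread evenly over a full 24 hours, which real traffic will not be; the function name is just for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_rate(requests: int, seconds: int = SECONDS_PER_DAY) -> float:
    """Average requests per second, assuming a perfectly even spread."""
    return requests / seconds


avg = average_rate(2_000_000)
print(f"average: {avg:.1f} req/sec")  # average: 23.1 req/sec
```

A guessed peak of 60 - 100 req/sec would then be roughly 2.5x to 4.5x the average.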

Nextgrid3y ago

> 23 req/sec (guess peak 60 - 100 might be already stretching it)

That kind of load is something a decent developer laptop with an NVME drive can serve, nothing to write home about.

It is sad that the "cloud" and all these supposedly "modern" DevOps systems managed to redefine the concept of "performance" for a large chunk of the industry.

jerf3y ago

I can't blame it on "cloud", though it's not helping that there are an awful lot of cloud services that claim to be "high performance" and are often mediumish at best. But in general I see a lot of ignorance in the developer community as to how fast things should be able to run, even in terms of reading local files and doing local manipulations with no "cloud" in sight.

Honestly, if I had to pin it on just one thing, I'd blame networking everything. Cloud would fit as a subset of that. Networking slows things down at the best of times, and the latency distribution can be a nightmare at the worst. Few developers think about the cost of using the network, and even fewer can think about it holistically (e.g., to avoid making 50 network transactions spread throughout the system when you could do it all in one transaction if you rearranged things).

mhuffman3y ago

It does depend on the architecture and framework they are using imo. I have a single Hetzner machine with spinning-platter HDDs that serves between 1-2 million requests per day hitting DB and ML models and rarely ever gets over 1% CPU usage. I have pressure-tested it to around 3k reqs/sec. On the other hand I have seen WP and CodeIgniter setups that, even with 5 copies running on the largest AWS instances available, "optimized" to the hilt, caching everywhere possible, etc., absolutely crumble under the load of 3k req per min. (not sec ... min).

Many frameworks that make early development easy fuck you later during growth with ORM calls, tons of unnecessary text in the DB, etc.

rrampage3y ago

It depends a lot on the backend architecture. Number of DB requests per web request can also be high due to the pathological cases in some ORMs which can result in N+1 query problems or eagerly fetching entire object hierarchies. Such problems in application code can get brushed under the carpet due to "magical" autoscaling (be it RDS or K8s). There can also be fanout to async services/job queues which will in turn run even more DB queries.
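
To make the N+1 problem mentioned above concrete, here is a small sketch using Python's built-in `sqlite3` as a stand-in for Postgres. The `authors`/`posts` schema and the helper names are hypothetical; the point is only how the number of queries per web request differs between the two approaches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
""")
conn.executemany("INSERT INTO authors VALUES (?, ?)",
                 [(i, f"author{i}") for i in range(1, 51)])
conn.executemany("INSERT INTO posts VALUES (?, ?, ?)",
                 [(i, i, f"post{i}") for i in range(1, 51)])

query_count = 0


def run(sql, params=()):
    """Execute one query and count it."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def titles_n_plus_one():
    # One query for the list, then one extra query per row (the "+N").
    rows = []
    for _post_id, author_id, title in run(
            "SELECT id, author_id, title FROM posts ORDER BY id"):
        (name,) = run("SELECT name FROM authors WHERE id = ?", (author_id,))[0]
        rows.append((title, name))
    return rows


def titles_joined():
    # Same result in a single round trip.
    return run("SELECT p.title, a.name FROM posts p "
               "JOIN authors a ON a.id = p.author_id ORDER BY p.id")


def count_queries(fn):
    """Run fn and report how many queries it issued."""
    global query_count
    query_count = 0
    result = fn()
    return result, query_count


slow, n_slow = count_queries(titles_n_plus_one)
fast, n_fast = count_queries(titles_joined)
print(n_slow, n_fast, slow == fast)  # 51 1 True
```

With 50 posts the naive loop issues 51 queries for the same data the join returns in one, so "requests per second" at the web tier can understate database load by a large factor.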

singron3y ago

RDS tops out at about 18000 IOPS since it uses a single ebs volume. Any decent ssd will do much better. E.g. a 970 evo will easily do >100K IOPS and can do more like 400K in ideal conditions.

You can get that many IOPS with aurora, but the cost is exorbitant.

ayende3y ago

You are off by a couple of orders of magnitude

I have run 500+ req/sec on a raspberry pi using 4 TB dataset with 2 GB of RAM, with under 100ms for the 99.99 percentile

A few hundreds req a second is basically nothing.

derefr3y ago

Depends on the queries. Point queries that take 1ms each? Sure. Analytical queries that take 1000ms+ each? Not so much.
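
A rough way to see why query latency matters so much: with a fixed number of queries in flight, the best-case throughput is concurrency divided by latency. The sketch below assumes a hypothetical 20 concurrent connections and no contention between queries, so it is an upper bound rather than a prediction.

```python
def max_throughput(concurrency: int, latency_ms: float) -> float:
    """Upper bound on queries per second for a given number of in-flight queries."""
    return concurrency * 1000.0 / latency_ms


print(max_throughput(20, 1))     # 20000.0  (1 ms point queries)
print(max_throughput(20, 1000))  # 20.0     (1000 ms analytical queries)
```

The same hardware and connection pool differ by three orders of magnitude depending on the query mix.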

StreamBright3y ago

I see this problem as a bit more nuanced. Why does everybody start with the assumption that the solution is SQL? You can get very far with a k:v store like S3, for example. On top of that, if you really need SQL you can use a lot of different systems (without k8s).

c2h5oh3y ago

That kind of a load you can handle on spinning rust without breaking a sweat.

eptcyka3y ago

NVME? You can serve this from a raspberry pi.

xani_3y ago

It's essentially just a process running in a cgroup so performance shouldn't be all that different than bare metal/VM postgresql.

Main difference would be storage speed and how it exactly is attached to a container.

kccqzy3y ago

> This is the reason why we were able to easily cope with 2M+ requests in less than 24h

I thought this was referring to 2M+ requests per second over a ramp period of 24h, not 2M requests per 24h?

XCSme3y ago

2M+ requests per day can be handled on a pretty cheap VPS even by MySQL, but it depends on the request complexity and, more importantly, the database size.

brand3y ago

I’ve personally deployed O(TBs) and O(10^4 TPS) Postgres clusters on Kubernetes with a CNPG-style operator based deployment. There are some subtleties to it but it’s not exceedingly complicated, and a good project like CNPG goes a long way to shaving off those sharp edges. As other commenters have suggested it’s good to really understand Kubernetes if you want to do it, though.

radimm3y ago

Thanks for the confirmation. As mentioned, I'm not saying no to it. It is really that "really understand" part which holds me back for now - mainly the observability and dealing with edge cases in a high-throughput environment.

remram3y ago

> O(TBs) and O(10^4 TPS)

What does this syntax mean? Surely you wouldn't use big-o notation with a constant in it, especially to convey the same meaning as the thing without the O?

MuffinFlavored3y ago

> Having recently heard a lot about PostgreSQL in Kubernetes

I could never get a straight answer on whether running a database in a container (and mounting the storage volume through a bind mount/network drive or whatever) came with a performance hit compared to running it as a systemd service for example.

speedgoose3y ago

It does but it’s minimal. Especially compared to the high latency and low throughput that network volumes provide (which are the defaults on cloud VMs).

ahachete3y ago

In case you are interested, I blogged about it last year: https://thenewstack.io/kubernetes-will-revolutionize-enterpr...

TL;DR performance impact should be negligible, could be even slightly negative compared to a VM (when running K8s on bare metal).

qeternity3y ago· 11 in thread

These threads are always full of people who have always used an AWS/GCP/Azure service, or have never actually run the service themselves.

Running HA Postgres is not easy...but at any sort of scale where this stuff matters, nothing is easy. It's not as if AWS has 100% uptime, nor is it super cheap/performant. There are tradeoffs for everyone's use-case but every thread is full of people at one end of the cloud / roll-your-own spectrum.

ftufek3y ago

Honestly, that's what I initially thought when trying to run HA Postgres on k8s, but Zalando's postgres operator made things so much easier (maybe even easier than RDS). It's very easy to roll out as many Postgres clusters as you want, at whatever size you want. We've been running our production db on it for the last 6 months or so, no outage yet. Though I guess if you have to have a very custom setup, it might be more difficult.

manfre3y ago

Have you tested the backup/recovery for any of the DBs yet? I'm curious to hear how that went.

5Qn8mNbc2FNCiVV3y ago

What is the underlying storage used for this? I kind of still struggle wrapping my head around if I should use hostpath and let the operator manage replication or do I actually want a distributed storageclass? I'm always searching but I can't seem to find an answer to this question

qeternity3y ago

Yes as I mention in an earlier comment we use Patroni and love it.

api3y ago

I wonder how many people use things like CockroachDB, Yugabyte, or TiDB? They're at least in theory far easier to run in HA configurations at the cost of some additional overhead and in some cases more limited SQL functionality.

They seem like a huge step up from the arcane "1980s Unix" nightmare of Postgres clustering but I don't hear about them that much. Are they not used much or are their users just happy and quiet?

(These are all "NewSQL" databases.)

sgtfrankieboy3y ago

We have multiple CockroachDB clusters, and have had them for 4+ years now. They range from 2TB to 14TB in used size; the largest does about 3k req/sec.

We run them on dedicated hosts or on Hetzner cloud instances. We tested out RDS Postgres, but that would've literally tripled our cost for worse performance.

Only had a few hiccups with the big cluster but they were resolved quickly with their support.

We're very happy with the product, and have learned quite a few optimization tricks to get the best out of it. Easy to use as well, join the nodes and it just works.

It's not perfect though, we've had quite a few issues with deleting lots of data at once, it doesn't like that. So we have to do deletes in smaller chunks.
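
The "delete in smaller chunks" approach could look roughly like the sketch below. It uses Python's built-in `sqlite3` purely as a stand-in so the example runs anywhere; the `events` table, the `created_at` column, and the batch size are hypothetical, and the exact SQL for a batched delete differs between databases.

```python
import sqlite3


def delete_in_chunks(conn, cutoff, batch_size=1000):
    """Delete rows older than `cutoff` in small batches, committing after each.

    Returns the number of batches that actually deleted rows.
    """
    batches = 0
    while True:
        cur = conn.execute(
            "DELETE FROM events WHERE id IN ("
            "  SELECT id FROM events WHERE created_at < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()  # keep each transaction short
        if cur.rowcount == 0:
            break
        batches += 1
    return batches


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, created_at INTEGER)")
conn.executemany("INSERT INTO events (created_at) VALUES (?)",
                 [(t,) for t in range(10_000)])
conn.commit()

batches = delete_in_chunks(conn, cutoff=7_500, batch_size=1000)
remaining = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(batches, remaining)  # 8 2500
```

Each batch is its own short transaction, so no single statement has to touch millions of rows at once.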

KronisLV3y ago

> I wonder how many people use things like CockroachDB, Yugabyte, or TiDB?

TiDB is a pretty interesting project, but there are a few limitations that should be taken into account when trying to use it: https://docs.pingcap.com/tidb/stable/mysql-compatibility

A lot of these are tradeoffs that will affect how a database can be architected, such as having no access to foreign keys and thus needing to think about any sort of consistency and not leaving orphaned data at the application level.

tluyben23y ago

We are testing all our software on yugabyte now to see how well it works. The cockroach license makes it not a fit for us, so we decided to try Yuga. So far works very well for our workloads.

belmont_sup3y ago

New user of cockroach. We’ll find out! If this startup ever makes it to any meaningful user size.

9887473y ago

I've been successfully running Postgres in Kubernetes with the Operator from Crunchy Data. It makes HA setup really easy with a tool called Patroni, which basically takes care of all the hard stuff. Running 1 primary and 2 replicas is really no harder than running single-node Postgres.

qeternity3y ago

Yes as I mention in an earlier comment we use Patroni and love it.

xwowsersx3y ago· 8 in thread

I've recently been spending a fair amount of time trying to improve query performance on RDS. This includes reviewing and optimizing particularly nasty queries, tuning PG configuration (min_wal_size, random_page_cost, work_mem, etc). I am using a db.t3.xlarge with general purpose SSD (gp2) for a web server that sees moderate writes and a lot of reads. I know there's no real way to know other than through testing, but I'm not clear on which instance type best serves our needs — I think it may very well be the case that the t3 family isn't fit for our purposes. I'm also unclear on whether we ought to switch to provisioned IOPS SSD. Does anyone have any general pointers here? I know the question is pretty open-ended, but would be great if anyone has general advice from personal experience?

notac3y ago

I'd recommend hopping off of t3 asap if you're searching for performance gains - performance can be extremely variable (by design). M class will even you out.

General storage IOPS is governed by your provisioned storage size. You can again get much more consistent performance by using provisioned IOPS.

Feel free to email me if you want to chat through things specific to your env - email is in my about:

aeyes3y ago

I would advise that you try to fit the working set into memory before spending on provisioned IOPS. Reading a lot of data from network storage constantly should be avoided as much as possible, having more IOPS doesn't necessarily improve read latency.

brazzledazzle3y ago

Provisioned IOPS is much more expensive though, so make sure you really need it. If you use general IOPS you can monitor your burst balance. You can always start with general and then move to provisioned when you need it too.

xwowsersx3y ago

Thank you so much, will definitely take you up on the offer.

Nextgrid3y ago

It's hard to say without metrics; what does your CPU load look like? In general, unless your CPU is often maxing out, changing the CPU is unlikely to help, so you're left with either memory or IO.

Unused memory on Linux will be automatically used to cache IO operations, and you can also tweak PG itself to use more memory during queries (search for "work_mem", though there are others).

If your workload is read-heavy, just giving it more memory so that the majority of your dataset is always in the kernel IO cache will give you an immediate performance boost, without even having to tweak PG's config (though that might help even further). This won't transfer to writes - those still require an actual, uncached IO operation to complete (unless you want to put your data at risk, in which case there are parameters that can be used to override that).

For write-heavy workloads, you will need to upgrade IO; there's no way around the "provisioned IOPS" disks.

xwowsersx3y ago

Thanks very much for the reply. CPU is not often maxing out. Here's a graph of max CPU utilization from the last week https://ibb.co/tzw5p3L

paulryanrogers3y ago

General storage IOPS scales with disk size, roughly and to a point. It's often cheaper and faster to increase the instance storage than move to EBS, prioritized or not.

Of course if you need to recover quickly in a disaster you'll want a hot standby or replica. Still may be cheaper than PIOPs. (Especially if you need HA anyway.)
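
The "IOPS scales with disk size, roughly and to a point" remark can be sketched as below. The constants are an assumption taken from AWS's published gp2 figures for EBS (3 IOPS per GiB, a floor of 100, a ceiling of 16,000), not from this thread; check them against the current AWS documentation, and note that RDS may behave differently at larger volume sizes.

```python
GP2_IOPS_PER_GIB = 3      # assumed from AWS's published gp2 figures
GP2_MIN_IOPS = 100        # assumed floor
GP2_MAX_IOPS = 16_000     # assumed ceiling


def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS for a gp2 volume of the given size, under the assumptions above."""
    return max(GP2_MIN_IOPS, min(GP2_MAX_IOPS, GP2_IOPS_PER_GIB * size_gib))


for size in (20, 500, 1000, 6000):
    print(size, gp2_baseline_iops(size))
# 20 100
# 500 1500
# 1000 3000
# 6000 16000
```

Under these assumptions, doubling a 500 GiB volume doubles baseline IOPS, which is why growing the disk can be the cheaper lever before paying for provisioned IOPS.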

xwowsersx3y ago

Thanks for the reply.

> It's often cheaper and faster to increase the instance storage than move to EBS, prioritized or not.

You're saying it may well be cheaper to increase storage in order to get more IOPS than moving to an EBS-optimized instance type?

Regarding HA, it's not relevant for us at this point (assuming I understood you correctly). We've only got a single primary and one replica, the latter being used primarily for analytics.

nunopato3y ago· 7 in thread

(Nhost)

Sorry for not answering everyone individually, but I see some confusion due to the lack of context about what we do as a company.

First things first, Nhost falls into the category of backend-as-a-service. We provision and operate infrastructure at scale, and we also provide and run the necessary services for features such as user authentication and file storage, for users creating applications and businesses. A project/backend is comprised of a Postgres Database and the aforementioned services, none of it is shared. You get your own GraphQL engine, your own auth service, etc. We also provide the means to interface with the backend through our official SDKs.

Some points I see mentioned below that are worth exploring:

- One RDS instance per tenant is prohibitive from a cost perspective, obviously. RDS is expensive and we have a very generous free tier.

- We run the infrastructure for thousands of projects/backends which we have absolutely no control over what they are used for. Users might be building a simple job board, or the next Facebook (please don't). This means we have no idea what the workloads and access patterns will look like.

- RDS is mature and a great product, AWS is a billion dollar company, etc - that is all true. But it is also true that we do not control whether a user's project is missing an index, and that RDS does not provide any means to limit CPU/memory usage per database/tenant.

- We had a couple of discussions with folks at AWS and for the reasons already mentioned, there was no obvious solution to our problem. Let me reiterate this, the folks that own the service didn't have a solution to our problem given our constraints.

- Yes, this is a DIY scenario, but this is part of our core business.

I hope this clarifies some of the doubts. And I expect to have a more detailed and technical blog post about our experience soon.

By the way, we are hiring. If you think what we're doing is interesting and you have experience operating Postgres at scale, please write me an email at nuno@nhost.io. And don't forget to star us at https://github.com/nhost/nhost.

akrymski3y ago

Indeed RDS was never designed to be "re-sold", and assuming that a single PG instance will handle lots of different users is naive. Turns out if you're aiming to be an infra provider, building your own infra is the way to go. Who would have thought?

If I was launching a BaaS I wouldn't touch AWS. Grab a few Hetzner bare metal servers and set up your infra. You're leaving a massive profit margin to AWS when you don't have to.

fmajid3y ago

Are you using a Kubernetes PostgreSQL operator like pgo or CloudNativePG?

https://proopensource.it/blog/postgresql-on-k8s-experiences

SomaticPirate3y ago

Also would like to know this. This post is a bit light on content. It sounds like they just moved to K8s from RDS. In my experience, Postgres works decently but there are sharp edges running it containerized (OOMs in subprocesses might not be caught by the container runtime, shared memory is pitifully low in Docker at 64 MB by default).

cloudbee3y ago

And what are your cost savings compared to RDS? I had a similar problem where we had to provision like 5 databases for 5 different teams. RDS is really expensive. And is your solution open source? I would like to try it.

SOLAR_FIELDS3y ago

RDS and similar managed databases are over half of our total cloud bill at my place of work. Managed databases in general are really expensive.

nunopato3y ago

I hope to have a more detailed analysis to share when we have more accurate data. We launched individual instances recently and although I don't have exact numbers, the price difference will be significant. Just imagine how much it would cost to have 1 RDS instance per tenant (we have thousands).

We haven't open-sourced any of this work yet but we hope to do it soon. Join us on discord if you want to follow along (https://nhost.io/discord).

jrockway3y ago

I'm guessing that they're betting that they can put X idle customers on one machine, and so pay X/machine cost for their free tier.

A while ago, I worked for a company that offered a hosted version of their application that required Postgres, etcd, Kubernetes, etc. It was set up so that every customer got their own GCP project, containing a K8s cluster, Cloud Storage, and a Postgres instance. The k8s cluster ("workspace") then contained dedicated nodes (4vCPU x 16G RAM at a minimum, autoscaling up according to their workload including GPU compute), SSDs, a public-facing LoadBalancer, etc. This is good for per-customer isolation, but quite costly at idle, on the order of several hundred dollars a month. Users expect this kind of isolation (but need the SOC2 and similar checkmark for sure), but they don't expect to be charged when they're not running anything, which was a problem for us.

If I was doing this again, I would do it this way, at least for the MVP. One option is to make the application multi-tenant aware, and isolate at the application level instead of at the GCP project level. This might be more difficult to get certified and might not meet everyone's HIPAA-like compliance goals, but is a good starting point, especially for free trials.

The other option that was very appealing to me is to give each user a VM that just gets de-scheduled when no requests are being made. Instead of k8s managing nodes, nodes would manage k8s. The downside there is that cluster size is limited to whatever the largest node you can buy is, but honestly, 448vCPUs is a ton (AWS's max instance size at the moment), so it's a very workable solution. When users sign up, create a VM image that runs K8s, Minio, Postgres, etc. and route traffic to it with a shared L7 router/front proxy. If their workloads autoscale up, freeze and migrate the VM to a machine with more resources. If they're not using it for a while, freeze it completely, and reprogram your front proxy to point at a program that waits for an RPC / web request and starts up the VM when one comes in. Now your idle cost is the cost of your block storage, modulo deduplication, instead of dedicated CPU cores and RAM. You also get a lot of knobs to control your actual compute cost; you aren't reliant on your users provisioning spot instances from their cloud provider, you can just tell cron jobs to run when CPU load is lowest, or set your own rate to incentivize off-peak usage. And, you can pretty much get away with charging nothing for idle instances, limit free trials in aggregate to X CPU cores, etc. I think it would have been good, though complex.
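
The freeze-when-idle, wake-on-request idea above could be modelled roughly as follows. This is only a toy, in-memory sketch: `TenantVM`, `FrontProxy`, and the 300-second timeout are hypothetical names and values, and the real work (snapshotting, migrating, and restoring VMs) is reduced to flipping a flag.

```python
from dataclasses import dataclass


@dataclass
class TenantVM:
    """Stand-in for one tenant's VM; start/freeze only flip a flag here."""
    tenant: str
    running: bool = False
    last_request_at: float = 0.0
    wakeups: int = 0

    def start(self) -> None:
        self.running = True
        self.wakeups += 1

    def freeze(self) -> None:
        self.running = False


class FrontProxy:
    """Shared front proxy: wakes a frozen VM on demand, freezes idle ones."""

    def __init__(self, idle_timeout_s: float):
        self.idle_timeout_s = idle_timeout_s
        self.vms = {}  # tenant name -> TenantVM

    def handle_request(self, tenant: str, now: float) -> str:
        vm = self.vms.setdefault(tenant, TenantVM(tenant))
        if not vm.running:
            vm.start()  # the cold-start cost is paid by this request
        vm.last_request_at = now
        return f"routed to {tenant}"

    def freeze_idle(self, now: float) -> list:
        """Freeze every running VM that has been idle for the timeout or longer."""
        frozen = []
        for vm in self.vms.values():
            if vm.running and now - vm.last_request_at >= self.idle_timeout_s:
                vm.freeze()
                frozen.append(vm.tenant)
        return frozen


proxy = FrontProxy(idle_timeout_s=300)
proxy.handle_request("acme", now=0)
print(proxy.freeze_idle(now=100))   # []
print(proxy.freeze_idle(now=400))   # ['acme']
proxy.handle_request("acme", now=500)
print(proxy.vms["acme"].wakeups)    # 2
```

Time is passed in explicitly so the behaviour is deterministic; a real version would use a clock and a periodic reaper, and would have to decide how long a first request may wait for a cold start.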

TL;DR: RDS is a highly-available always-on service. But customers might not want HA or always-on. By being able to turn off the database at the right moment, you can save a lot of money on compute, which makes things like good free trials more economically viable. I think OP is on the right track to a successful k8s-based business and wish them great luck!

0xbadcafebee3y ago· 7 in thread

Ah, the ol' sunk cost fallacy of infrastructure. We are already investing in supporting K8s, so let's throw the databases in there too. Couldn't possibly be that much work.

Sure, a decade-old dedicated team at a billion-dollar multinational corporation has honed a solution designed to support hundreds of thousands of customers with high availability, and we could pay a little bit extra money to spin up a new database per tenant that's a little bit less flexible, ..... or we could reinvent everything they do on our own software platform and expect the same results. All it'll cost us is extra expertise, extra staff, extra time, extra money, extra planning, and extra operations. But surely it will improve our product dramatically.

suggala3y ago

AWS RDS is 10x slower than BareMetal MySQL (both reads and writes). The slowness is mainly because RDS storage is accessed over the network.

Not bad to invest some extra time to get better performance.

You are falling for the “appeal to antiquity” fallacy if you think something old is better.

Nextgrid3y ago

It's unlikely running it on K8S (which is itself going to run on underpowered VMs with networked storage) is going to help.

If you're gonna spend effort in running Postgres manually, do it on bare-metal and at least get some reward out of it (performance and reduced cost).

0xbadcafebee3y ago

What you describe is still a fallacy because it's assuming that just because you can get better performance with BareMetal, that somehow this is a cheaper or better option. In fact it will be either more error-prone, or more expensive, or both, because you are trying to reproduce from scratch what the whole RDS team has been doing for 10 years.

gw993y ago

I'm not so sure. All you have is another layer of abstraction between you and the problem that you are facing. And that level of abstraction may violate your SLAs unless you pitch $15k for the enterprise support option. And that may not even be fruitful because it relies on an uncertain network of folk at the other end who may or may not even be able to interpret and/or solve your problem. Also you are at the whim of their changes which may or may not break your shit.

Source: AWS user on very very large scale stuff for about 10 years now. It's not magic or perfection. It's just someone else's pile of problems that are lurking. The only consolation is they appear to try slightly harder than the datacentres that we replaced.

dijit3y ago

And when it all goes bottoms up it will be much more difficult to resolve.

throwawaymaths3y ago

Depends. A lot of postgres usage is often "things that might as well be redis", like session tokens (but the library we imported came configured for postgres) so if the postgres goes down, as long as it can be restarted it won't be the end of the world even if all the data were wiped.

Probably there is also an 80/20 for most users where it's not awful if you can restore from a cold storage, say 12h, backup.

baq3y ago

Fortunately Postgres doesn’t do that often by itself. It usually needs some creative developer’s assistance.

jmarbach3y ago· 3 in thread

$0.50 per extra GB seems high, especially for a storage-intensive app. Given the cost of cloud Object Storage services it doesn't seem to make much sense.

Examples of alternatives for managed Postgres:

* Supabase is $0.125 per GB

* DigitalOcean managed Postgres is ~$0.35 per GB
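
To put the per-GB prices above side by side, here is a quick sketch. It assumes all three prices are billed per GB per month (the comment does not say) and uses a hypothetical 100 GB of storage; the labels are just for illustration.

```python
# Prices as quoted in the comment above; the billing period is an assumption.
PRICE_PER_GB = {
    "quoted extra-GB price": 0.50,
    "Supabase": 0.125,
    "DigitalOcean managed Postgres": 0.35,
}


def storage_cost(gb: float, price_per_gb: float) -> float:
    """Cost of `gb` of storage at a flat per-GB price."""
    return gb * price_per_gb


for name, price in PRICE_PER_GB.items():
    print(f"{name}: ${storage_cost(100, price):.2f} for 100 GB")
```

At 100 GB the gap between the cheapest and the most expensive option is a factor of four, which is what makes the difference noticeable for storage-heavy apps.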

claytongulick3y ago

I really wish I could use DO, but unless something has changed recently, they don't support delta backups, which is a deal killer for me.

For small startups, my DR/HA plan is hourly delta snapshots of the whole volume.

GCP, AWS and Azure all make this possible.

makestuff3y ago

Supabase runs on AWS so they are either losing a ton of money, have some amazing deal with AWS, or the $0.50 is inaccurate.

kiwicopple3y ago

(supabase ceo)

EBS pricing is here: https://aws.amazon.com/ebs/pricing/

I'd have to check with the team but I'm 80% sure we're on gp3 ($0.08/GB-month).

That said, we have a very generous free tier. With AWS we have an enterprise plan + savings plan + reserved instances. Not all of these affect EBS pricing, but we end up paying a lot less than the average AWS user due to our high-usage.

MBCook3y ago· 2 in thread

So they switched from one giant RDS instance with all tenants per AZ to per-tenant PG in Kubernetes.

So really we don’t know how much RDS was a problem compared to the tenant distribution.

For the purposes of an article like this it would be nice if the two steps were separate or they had synthetic benchmarks of the various options.

But I understand why they just moved forward. They said they consulted experts; it would also be nice to discuss some of what they looked at or asked about.

raffraffraff3y ago

Yeah. I mean, if you're going to use an AWS database service for this use case, something that automatically scales based on load makes more sense, like Aurora Serverless. But that's also expensive. Regardless of cost, plain RDS isn't the right solution here at all.

eptcyka3y ago

Yes, that's basically the whole point of the article - an assumption they may not have made all too consciously to use RDS turned into a bad decision they sought to rectify.

xyzzy_plugh3y ago· 2 in thread

If the cost of operating a postgres database is eating into your margins so much (and you can't simply adjust your prices to eat the difference) then I would suspect the wrong technology is in place.

Sure, RDS is expensive, but it's also quite well done. Almost every cloud platform service is more expensive than doing it yourself. No surprise here.

In the past I've deployed SQLite over Postgres for cost cutting reasons. It's not too difficult to swap out unless you're heavily bought into database features.

movedx3y ago

> Almost every cloud platform service is more expensive than doing it yourself. No surprise here.

In a business environment, this is actually not true unless you consider the extreme long term.

A Multi-AZ MySQL RDS instance of size db.m1.large (2x vCPU, 7.5GB of RAM), a 500GB standard disk, and an on-demand pricing model with 100% monthly utilization, will cost you approx. US$7,000 per year (rounding up.) That price gets you almost everything you can imagine from that service.

US$7,000 wouldn't get you my services for the time needed to set up a service that came even 30% as close in terms of reliability, feature parity and support.

RDS is not expensive (in the right environment.)

xyzzy_plugh3y ago

Agreed. RDS is a no-brainer, generally. The issue here appears to be with the unit economics per tenant. My argument is if the unit economics matter so much, the technology choice is likely a poor fit.

neilv3y ago· 1 in thread

I didn't see "backups" mentioned in that, though I'm sure they have them. Depending on your needs, it's a big thing to keep in mind while weighing options.

For a small startup or operation, a managed service having credible snapshots, PITR backups, failover, etc. is going to save a business a lot of ops cost, compared to DIY designing, implementing, testing, and drilling, to the same level of credibility.

One recent early startup, I looked at the amount of work for me or a contractor/consultant/hire to upgrade our Postgres recovery capability (including testing and drills) with confidence. I soon decided to move from self-hosted Postgres to RDS Postgres.

RDS was a significant chunk of our modest AWS bill (otherwise, almost entirely plain EC2, S3, and traffic), but easy to justify to the founders, just by mentioning the costs it saved us for business existential protection we needed.

nunopato3y ago

Thanks for bringing this up. We do have backups running daily, and we will have "backups on demand" soon as well.

qubit233y ago· 1 in thread

I was hoping to see a bit more of an explanation of how this was implemented.

elitanOP3y ago

We need a follow up: *How* we're running thousands of Postgres databases in Kubernetes.

HL33tibCe73y ago· 1 in thread

Couldn’t you just spin up an RDS instance for each project (so, single-tenant RDS instances) to avoid the noisy neighbour problem? Or is that too expensive?

elitanOP3y ago

We could, yes. But way too expensive compared to our current setup.

We're offering free projects (Postgres, GraphQL (Hasura), Auth, Storage, Serverless Functions) so we need to optimize costs internally.

techn003y ago· 1 in thread

So what solution did you end up using? Crunchy operator?

nesmanrique3y ago

We evaluated several operators but in the end decided it would be best to deploy our own setup for the Postgres workloads using Helm instead.

e-clinton3y ago· 1 in thread

Congrats on the launch. Curious to see what else is in store for this week.

Do I have to manually upgrade my old instances?

elitanOP3y ago

Thank you. It's going to be a fun week!

We're working on a one-click migration from RDS to a dedicated Postgres instance for older projects. Should be live in the next week or so.

ransom15383y ago

I operate a large fleet of mysql db instances. We cannot use Cloudsql (RDS competitor) due mainly to cost. BUT, one thing left out was the ability to have complex topologies. E.g. MasterA <- SlaveA[1..n] <- MasterB <- SlaveB[1..n]. With extremely high writes, being able to cut and shard where you want is very powerful. In this example you could write to MasterB with different data. If I need to filter a table in replication: done. We don't need to beg AWS RDS team for the option to change a db variable (I have done this). Warning: Doing this stuff at scale with massive bills is very stressful. It took about a year to get everything ironed out [snapshots, autoscaling, sharding, custom monitoring, etc].

KaiserPro3y ago

In this instance I can see the point, being able to give raw access to customer's own psql instance is a good feature.

But it sounds bloody expensive to develop and maintain a reliable psql service on k8s.

geggam3y ago

I would love to see the monitoring on this.

Are network IOPS and NAT nastiness the bigger issue, or disk IO?

mp3tricord3y ago

In a production database, why are people executing long-running queries on the primary? They should be using a DB replica.

stunt3y ago

What's the benefit of running Postgres in Kubernetes vs VMs (with replication obviously)?

maxyurk3y ago

did you consider https://www.pgbouncer.org/ ?

j / k navigate · click thread line to collapse

145 comments

83 comments · 19 top-level

radimm3y ago· 19 in thread

Having recently heard a lot of about PostgreSQL in Kubernetes (cloudNativePG for example) it always makes me wonder about the actual load and the complexity of the cluster in the question.

> This is the reason why we were able to easily cope with 2M+ requests in less than 24h when Midnight Society launched

[edit] PS: not really ditching either k8s pg or AWS RDS or similar solutions. Just being curious.

Nextgrid3y ago

> 23 req/sec (guess peak 60 - 100 might be already stretching it)

That kind of load is something a decent developer laptop with an NVME drive can serve, nothing to write home about.

It is sad that the "cloud" and all these supposedly "modern" DevOps systems managed to redefine the concept of "performance" for a large chunk of the industry.

jerf3y ago

mhuffman3y ago

Many frameworks that make early development easy fuck you later during growth with ORM calls, tons of unnecessary text in the DB, etc.

rrampage3y ago

singron3y ago

RDS tops out at about 18,000 IOPS since it uses a single EBS volume. Any decent SSD will do much better. E.g. a 970 EVO will easily do >100K IOPS and can do more like 400K in ideal conditions.

You can get that many IOPS with aurora, but the cost is exorbitant.

ayende3y ago

You are off by a couple of orders of magnitude

I have run 500+ req/sec on a raspberry pi using 4 TB dataset with 2 GB of RAM, with under 100ms for the 99.99 percentile

A few hundred req a second is basically nothing.

derefr3y ago

Depends on the queries. Point queries that take 1ms each? Sure. Analytical queries that take 1000ms+ each? Not so much.

StreamBright3y ago

c2h5oh3y ago

That kind of a load you can handle on spinning rust without breaking a sweat.

eptcyka3y ago

NVME? You can serve this from a raspberry pi.

xani_3y ago

It's essentially just a process running in a cgroup so performance shouldn't be all that different than bare metal/VM postgresql.

Main difference would be storage speed and how it exactly is attached to a container.

kccqzy3y ago

> This is the reason why we were able to easily cope with 2M+ requests in less than 24h

I thought this was referring to 2M+ requests per second over a ramp period of 24h, not 2M requests per 24h?

XCSme3y ago

2M+ requests per day can be handled on a pretty cheap VPS even by MySQL, but it depends on the request complexity and, more importantly, the database size.
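For anyone checking the numbers being argued over in this thread, here is a minimal sketch of the arithmetic behind the "23 req/sec" figure. It assumes the 2M requests were spread evenly over a full 24 hours, which the article does not claim; the function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_rate(requests: int, window_seconds: int = SECONDS_PER_DAY) -> float:
    """Average requests per second, assuming an even spread over the window."""
    return requests / window_seconds


avg = average_rate(2_000_000)
print(f"{avg:.1f} req/sec")  # prints "23.1 req/sec"
```

Real traffic is bursty, so the peak rate will be some multiple of this average; the thread's guess of 60 to 100 req/sec at peak is in that spirit.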

brand3y ago

radimm3y ago

remram3y ago

> O(TBs) and O(10^4 TPS)

What does this syntax mean? Surely you wouldn't use big-o notation with a constant in it, especially to convey the same meaning as the thing without the O?

MuffinFlavored3y ago

> Having recently heard a lot of about PostgreSQL in Kubernetes

speedgoose3y ago

It does, but it’s minimal. Especially compared to the high latency and low throughput that network volumes provide (which are the defaults on cloud VMs).

ahachete3y ago

In case you are interested, I blogged about it last year: https://thenewstack.io/kubernetes-will-revolutionize-enterpr...

TL;DR performance impact should be negligible, could be even slightly negative compared to a VM (when running K8s on bare metal).

qeternity3y ago· 11 in thread

These threads are always full of people who have always used an AWS/GCP/Azure service, or have never actually run the service themselves.

ftufek3y ago

manfre3y ago

Have you tested the backup/recovery for any of the DBs yet? I'm curious to hear how that went.

5Qn8mNbc2FNCiVV3y ago

qeternity3y ago

Yes as I mention in an earlier comment we use Patroni and love it.

api3y ago

They seem like a huge step up from the arcane "1980s Unix" nightmare of Postgres clustering but I don't hear about them that much. Are they not used much or are their users just happy and quiet?

(These are all "NewSQL" databases.)

sgtfrankieboy3y ago

We have multiple CockroachDB clusters, and have had them for 4+ years now. They range from 2TB to 14TB in used size; the largest does about 3k req/sec.

We run them on dedicated hosts or on Hetzner cloud instances. We tested out RDS Postgres, but that would've literally tripled our cost for worse performance.

Only had a few hiccups with the big cluster but they were resolved quickly with their support.

We're very happy with the product, and have learned quite a few optimization tricks to get the best out of it. Easy to use as well: join the nodes and it just works.

It's not perfect though. We've had quite a few issues with deleting lots of data at once; it doesn't like that. So we have to do deletes in smaller chunks.
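The "delete in smaller chunks" approach mentioned above can be sketched roughly as follows. This is only an illustration, not the commenter's actual setup: it uses Python's built-in `sqlite3` as a stand-in database so it runs anywhere, and the `events` table, `created_at` column, and `delete_in_chunks` helper are hypothetical names.

```python
import sqlite3


def delete_in_chunks(conn: sqlite3.Connection, cutoff: int, chunk_size: int = 1000) -> int:
    """Delete rows older than `cutoff` in small batches, committing after each one."""
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM events WHERE id IN ("
            "SELECT id FROM events WHERE created_at < ? LIMIT ?)",
            (cutoff, chunk_size),
        )
        conn.commit()  # keep each transaction short
        if cur.rowcount == 0:
            break  # nothing left to delete
        total += cur.rowcount
    return total


# Demo on an in-memory database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, created_at INTEGER)")
conn.executemany("INSERT INTO events (created_at) VALUES (?)", [(i,) for i in range(5000)])
conn.commit()

deleted = delete_in_chunks(conn, cutoff=4000, chunk_size=500)
remaining = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(deleted, remaining)
```

The point of the loop is that each batch runs in its own short transaction, so no single statement has to touch millions of rows at once. On a distributed database the batch size would need tuning against the real workload.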

KronisLV3y ago

> I wonder how many people use things like CockroachDB, Yugabyte, or TiDB?

TiDB is a pretty interesting project, but there are a few limitations that should be taken into account when trying to use it: https://docs.pingcap.com/tidb/stable/mysql-compatibility

tluyben23y ago

We are testing all our software on yugabyte now to see how well it works. The cockroach license makes it not a fit for us, so we decided to try Yuga. So far works very well for our workloads.

belmont_sup3y ago

New user of cockroach. We’ll find out! If this startup ever makes it to any meaningful user size.

9887473y ago

qeternity3y ago

Yes as I mention in an earlier comment we use Patroni and love it.

xwowsersx3y ago· 8 in thread

notac3y ago

I'd recommend hopping off of t3 asap if you're searching for performance gains - performance can be extremely variable (by design). M class will even you out.

General storage IOPS is governed by your provisioned storage size. You can again get much more consistent performance by using provisioned IOPS.

Feel free to email me if you want to chat through things specific to your env - email is in my about:

aeyes3y ago

brazzledazzle3y ago

xwowsersx3y ago

Thank you so much, will definitely take you up on the offer.

Nextgrid3y ago

It's hard to say without metrics; what does your CPU load look like? In general, unless your CPU is often maxing out, changing the CPU is unlikely to help, so you're left with either memory or IO.

Unused memory on Linux will be automatically used to cache IO operations, and you can also tweak PG itself to use more memory during queries (search for "work_mem", though there are others).

For write-heavy workloads, you will need to upgrade IO; there's no way around the "provisioned IOPS" disks.

xwowsersx3y ago

Thanks very much for the reply. CPU is not often maxing out. Here's a graph of max CPU utilization from the last week https://ibb.co/tzw5p3L

paulryanrogers3y ago

General storage IOPS scales with disk size, roughly and to a point. It's often cheaper and faster to increase the instance storage than move to EBS, prioritized or not.

Of course if you need to recover quickly in a disaster you'll want a hot standby or replica. Still may be cheaper than PIOPs. (Especially if you need HA anyway.)
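To make the "IOPS scales with disk size, roughly and to a point" remark concrete, here is a small sketch. It assumes gp2-style general purpose storage, where the published baseline is 3 IOPS per GiB with a floor of 100 and a cap of 16,000; other volume types (gp3, io1/io2) and RDS-specific limits behave differently, so treat the numbers as illustrative and check the current AWS docs. The function name is invented.

```python
def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS for a gp2-style volume: 3 per GiB, floor 100, cap 16,000 (assumed figures)."""
    return min(max(3 * size_gib, 100), 16_000)


for size in (20, 1_000, 10_000):
    print(size, "GiB ->", gp2_baseline_iops(size), "IOPS")
```

Under these assumptions, growing a 20 GiB volume to 1,000 GiB raises the baseline from 100 to 3,000 IOPS, which is why over-provisioning storage can be a cheaper way to buy IOPS than provisioned-IOPS disks.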

xwowsersx3y ago

Thanks for the reply.

> It's often cheaper and faster to increase the instance storage than move to EBS, prioritized or not.

You're saying it may well be cheaper to increase storage in order to get more IOPS than moving to an EBS-optimized instance type?

Regarding HA, it's not relevant for us at this point (assuming I understood you correctly). We've only got a single primary and one replica, the latter being used primarily for analytics.

nunopato3y ago· 7 in thread

(Nhost)

Sorry for not answering everyone individually, but I see some confusion due to the lack of context about what we do as a company.

Some points I see mentioned below that are worth exploring:

- One RDS instance per tenant is prohibitive from a cost perspective, obviously. RDS is expensive and we have a very generous free tier.

- Yes, this is a DIY scenario, but this is part of our core business.

I hope this clarifies some of the doubts. And I expect to have a more detailed and technical blog post about our experience soon.

akrymski3y ago

If I was launching a BaaS I wouldn't touch AWS. Grab a few Hetzner bare metal servers and set up your infra. You're leaving a massive profit margin to AWS when you don't have to.

fmajid3y ago

Are you using a Kubernetes PostgreSQL operator like pgo or CloudNativePG?

https://proopensource.it/blog/postgresql-on-k8s-experiences

SomaticPirate3y ago

cloudbee3y ago

SOLAR_FIELDS3y ago

RDS and similar managed databases are over half of our total cloud bill at my place of work. Managed databases in general are really expensive.

nunopato3y ago

We haven't open-sourced any of this work yet but we hope to do it soon. Join us on discord if you want to follow along (https://nhost.io/discord).

jrockway3y ago

I'm guessing that they're betting that they can put X idle customers on one machine, and so pay (machine cost)/X per customer for their free tier.

0xbadcafebee3y ago· 7 in thread

Ah, the 'ol sunk cost fallacy of infrastructure. We are already investing in supporting K8s, so let's throw the databases in there too. Couldn't possibly be that much work.

suggala3y ago

AWS RDS is 10x slower than bare-metal MySQL (both reads and writes). The slowness is mainly because storage is over the network for RDS.

Not bad to invest some extra time to get better performance.

You are falling for the “appeal to antiquity” fallacy if you think something old is better.

Nextgrid3y ago

It's unlikely running it on K8S (which is itself going to run on underpowered VMs with networked storage) is going to help.

If you're gonna spend effort in running Postgres manually, do it on bare-metal and at least get some reward out of it (performance and reduced cost).

0xbadcafebee3y ago

gw993y ago

dijit3y ago

And when it all goes bottoms up it will be much more difficult to resolve.

throwawaymaths3y ago

Probably there is also an 80/20 for most users where it's not awful if you can restore from a cold storage, say 12h, backup.

baq3y ago

Fortunately Postgres doesn’t do that often by itself. It usually needs some creative developer’s assistance.

jmarbach3y ago· 3 in thread

$0.50 per extra GB seems high, especially for a storage-intensive app. Given the cost of cloud Object Storage services it doesn't seem to make much sense.

Examples of alternatives for managed Postgres:

* Supabase is $0.125 per GB

* DigitalOcean managed Postgres is ~$0.35 per GB

claytongulick3y ago

I really wish I could use DO, but unless something has changed recently, they don't support delta backups, which is a deal killer for me.

For small startups, my DR/HA plan is hourly delta snapshots of the whole volume.

GCP, AWS and Azure all make this possible.

makestuff3y ago

Supabase runs on AWS so they are either losing a ton of money, have some amazing deal with AWS, or the $0.50 is inaccurate.

kiwicopple3y ago

(supabase ceo)

EBS pricing is here: https://aws.amazon.com/ebs/pricing/

I'd have to check with the team but I'm 80% sure we're on gp3 ($0.08/GB-month).

MBCook3y ago· 2 in thread

So they switched from one giant RDS instance with all tenants per AZ to per-tenant PG in Kubernetes.

So really we don’t know how much RDS was a problem compared to the tenant distribution.

For the purposes of an article like this it would be nice if the two steps were separate or they had synthetic benchmarks of the various options.

But I understand why they just moved forward. They said they consulted experts; it would also be nice to discuss some of what they looked at or asked about.

raffraffraff3y ago

eptcyka3y ago

Yes, that's basically the whole point of the article - an assumption they may not have made all too consciously to use RDS turned into a bad decision they sought to rectify.

xyzzy_plugh3y ago· 2 in thread

If the cost of operating a postgres database is eating into your margins so much (and you can't simply adjust your prices to eat the difference) then I would suspect the wrong technology is in place.

Sure, RDS is expensive, but it's also quite well done. Almost every cloud platform service is more expensive than doing it yourself. No surprise here.

In the past I've deployed SQLite over Postgres for cost cutting reasons. It's not too difficult to swap out unless you're heavily bought into database features.

movedx3y ago

> Almost every cloud platform service is more expensive than doing it yourself. No surprise here.

In a business environment, this is actually not true unless you consider the extreme long term.

US$7,000 wouldn't get you my services for the time needed to set up a service that came even 30% as close in terms of reliability, feature parity and support.

RDS is not expensive (in the right environment.)

xyzzy_plugh3y ago

neilv3y ago· 1 in thread

I didn't see "backups" mentioned in that, though I'm sure they have them. Depending on your needs, it's a big thing to keep in mind while weighing options.

nunopato3y ago

Thanks for bringing this up. We do have backups running daily, and we will have "backups on demand" soon as well.

qubit233y ago· 1 in thread

I was hoping to see a bit more of an explanation of how this was implemented.

elitanOP3y ago

We need a follow-up: *How* we're running thousands of Postgres databases in Kubernetes.

HL33tibCe73y ago· 1 in thread

Couldn’t you just spin up an RDS instance for each project (so, single-tenant RDS instances) to avoid the noisy neighbour problem? Or is that too expensive?

elitanOP3y ago

We could, yes. But it's way too expensive compared to our current setup.

We're offering free projects (Postgres, GraphQL (Hasura), Auth, Storage, Serverless Functions) so we need to optimize costs internally.

techn003y ago· 1 in thread

So what solution did you end up using? Crunchy operator?

nesmanrique3y ago

We evaluated several operators but in the end decided it would be best to deploy our own setup for the Postgres workloads using Helm instead.
