I would confidently state that most large-scale applications have run into the situation where they scale up their application worker fleet and then crash their database with too many connections. Coordinating the size of the worker connection pool in a world of elastic application workers is enough of a task that deploying a proxy is really the simplest and best solution.
Unlike Postgres, Oracle and SQL Server can support thousands of connections, but they see performance degradation past a certain point. So I have never seen them crash from too many connections (although they definitely get slower).
I pool ~10,000 connections down to under 500. I can't do application-level pooling because the application is just thousands of individual processes; you often see this with Python, PHP, or Node.js applications.
500 open connections is far less overhead on the Postgres server than 10,000. I'm very happy to "pass the buck" of connection pooling to separate machines running pgBouncer.
Additionally, the "transaction" mode of pgBouncer can increase connection utilization further: a server connection is released between transactions, so it can serve a different client while the first one holds its client-side connection but is busy doing something else (e.g. waiting on a call to an external service).
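For illustration, a minimal pgbouncer.ini along those lines (the host, database name, and pool sizes here are made up, just to show the shape of the config):

```ini
[databases]
; clients connect to pgBouncer as if it were the database itself
mydb = host=10.0.0.5 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt

; transaction mode: a server connection is borrowed only for the
; duration of a transaction, then handed to the next waiting client
pool_mode = transaction

; accept thousands of client connections...
max_client_conn = 10000
; ...while opening at most this many real connections per database/user pair
default_pool_size = 100
```

One caveat worth knowing: transaction mode breaks session-level features (prepared statements, session-level `SET`, advisory locks held across transactions), so it isn't a drop-in for every application.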
So the question is: how many applications have enough CPU load for 50 pods but still cannot saturate a single database, so that sharding isn't on the table yet? My gut feeling is that they are few, more like exceptions.
From my personal experience, pgBouncer is used a lot when the framework/library does not support application-level pooling in the first place, and not so much when it does.
If you have a single "real" connection pool, the idle pool is effectively shared among all application instances. You can have 100 active connections and 10 (or maybe 20) idle.
I have run into this problem, but I solved it with careful tuning of the automatic instance-scaling parameters, connection pool size, and timeouts. But this only gets you so far; a single real connection pool would be more effective (at the cost of added complexity).
But yeah, for less distributed applications, just have N worker threads and don't close the DB connection after each job.
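A minimal sketch of that pattern in Python, with `sqlite3` standing in for a real database driver (the worker count and queries are purely illustrative): each worker thread opens one connection at startup and reuses it for every job, instead of opening and closing a connection per job.

```python
import queue
import sqlite3
import threading

NUM_WORKERS = 4          # N long-lived worker threads
jobs: queue.Queue = queue.Queue()
results: list[int] = []
lock = threading.Lock()

def worker() -> None:
    # One connection per worker, opened once and kept for the thread's
    # whole lifetime -- NOT reopened for each job.
    conn = sqlite3.connect(":memory:")
    while True:
        job = jobs.get()
        if job is None:          # sentinel: shut this worker down
            break
        (doubled,) = conn.execute("SELECT ? * 2", (job,)).fetchone()
        with lock:
            results.append(doubled)
    conn.close()

threads = [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
for t in threads:
    t.start()
for job in range(10):
    jobs.put(job)
for _ in threads:
    jobs.put(None)               # one sentinel per worker
for t in threads:
    t.join()

print(sorted(results))  # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```

With this shape the number of database connections is exactly the number of workers, which you control directly, and there's no per-job connection churn.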
That's a big assumption. If your app is just an HTTP wrapper for SQL queries against a poorly-optimized database, sure.
But there are plenty of applications that spend the majority of their time doing other stuff (HTTP requests, FFI calls, GPU/CPU-intensive processing, etc.) where the database interaction is a small part of the overall performance profile - certainly not the bottleneck, even with hundreds of concurrent clients.
In those cases, rather than artificially throttling concurrency to stay within your global max_connections, you would ideally run as many workers as you can (i.e. fully utilize your infrastructure) and grab database connections from an external pool only when you need them.
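A rough sketch of that check-out/check-in pattern, assuming a hypothetical homegrown pool class and using `sqlite3` in place of a real driver (a production setup would use pgBouncer or a driver's built-in pool instead): twenty concurrent workers share only three real connections, each borrowed just for the query itself.

```python
import queue
import sqlite3
import threading

class ConnectionPool:
    """Minimal bounded pool: many workers, few real connections."""
    def __init__(self, size: int) -> None:
        self._free: queue.Queue = queue.Queue()
        for _ in range(size):
            # check_same_thread=False lets any worker borrow the handle;
            # the pool ensures only one thread uses it at a time
            self._free.put(sqlite3.connect(":memory:", check_same_thread=False))

    def acquire(self) -> sqlite3.Connection:
        return self._free.get()      # blocks while all connections are busy

    def release(self, conn: sqlite3.Connection) -> None:
        self._free.put(conn)

pool = ConnectionPool(size=3)        # far fewer connections than workers
results: list[int] = []
lock = threading.Lock()

def handle_request(n: int) -> None:
    # ... non-database work (HTTP calls, CPU-heavy processing) happens
    # here, WITHOUT holding a database connection ...
    conn = pool.acquire()            # borrow only around the query itself
    try:
        (value,) = conn.execute("SELECT ? + 1", (n,)).fetchone()
    finally:
        pool.release(conn)
    with lock:
        results.append(value)

threads = [threading.Thread(target=handle_request, args=(i,))
           for i in range(20)]      # 20 concurrent "requests", 3 connections
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))  # [1, 2, ..., 20]
```

The key point is that concurrency (threads) and database connections are sized independently: the pool size caps load on the database while the rest of the work runs as wide as the hardware allows.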