Ask HN: What are the cons of PostgreSQL?

37 pointsminionslave10y ago42 comments

Everywhere I go, people are singing the praises of PGSQL. I never hear about the disadvantages. Maybe it's just that good and there's nothing to complain about.

42 comments

41 comments · 21 top-level

pjungwir10y ago· 4 in thread

I've worked with Postgres, MySQL, Oracle, and SQL Server, and Postgres is my go-to choice. I can think of a few cons though:

- Replication is harder to set up than MySQL. And there is no master-master replication like in MySQL. (Of course it also doesn't lose your data.... :-)

- I wish there were support for SQL:2011's bitemporal stuff. (There are some tools/extensions/published patterns to add "transaction-time" aka "system-time" support, but none I know of for "valid-time" aka "application-time", let alone both at once.) But this is very niche and not really fair to expect so quickly from a free product.

- I wish I could share a database between separate users, give them each their own schema, and not let them do `\dn` to see what other schemas exist. Again, very niche.

. . . Okay I thought I would have a longer list but I've run out of ideas. :-)

- EDIT: One more. You can't do `UPDATE ... ORDER BY ...`. This mattered to me once when I had multiple multi-row UPDATEs running at the same time, and if they touched the same rows but in different order, they would deadlock. You could solve that with `ORDER BY id` so that everyone obtained locks in the same order. Apparently this is not a big enough problem for anyone else in the world to care though.

I think the lesson here is that there are no real gotchas, just small annoyances that are very unique to your own project. Unless you have quite unusual requirements, I doubt there is going to be a deal-breaker issue.

jfrisby10y ago

PgCluster, BDR, rubyrep, and Postgres-XL all provide master-master (or analogous) replication schemes. I cannot attest to the robustness / production-readiness of any of them. And yeah, it really is a PITA to set it up with Postgres. I will note that I once[1] lost a whole cluster's data due to a combination of Slony's sharp corners and operational difficulty of Postgres.

The schema visibility thing seems like it might be solvable with row-level security in 9.5? (I.E. apply constraints to the system catalogs, perhaps?)

The UPDATE one is good to know. Did you work around it by doing `SELECT ... ORDER BY id FOR UPDATE`?

[1] - Slony had a problem where global transaction IDs rolling over caused it to do Very Bad Things and we had had to turn vacuuming off because performance during a peak period -- but we forgot to turn it back ON and... So yeah, a combination of sharp corners in Slony and Postgres + user error == our cluster systematically ate itself.

pjungwir10y ago

Ha, I thought about RLS on system catalogs, but it's not supported. Here is a long thread about it: http://postgresql.nabble.com/Multi-tenancy-with-RLS-td586211...

It is also possible to revoke privileges on pg_namespace, but that breaks too many things for my taste (\dt e.g.).

I think I worked around the deadlock issue by detecting the failure in application code and trying again. It was for background workers in a side-project SaaS I abandoned after a few months, so a sloppy fix was very tolerable. I like your SELECT FOR UPDATE idea though!

dragonwriter10y ago

> I wish there were support for SQL:2011's bitemporal stuff.

While there isn't direct support, PostgreSQL's support for range types seems to provide a more-general solution where much of what SQL:2011 temporal support does is a fairly direct application of the more general features (it would be nice to have standards-compliant syntax, though.)

pjungwir10y ago

Have you read Developing Time-Oriented Database Applications in SQL? I am very fond of ranges (actually it's exclusion constraints that are the key addition IMO), but they are not even close to what I'd like to have. I want `PRIMARY KEY` to work on temporal tables. And `REFERENCES` to a temporal table. And `ON DELETE CASCADE` or `ON DELETE SET NULL`. And joins. And be able to do an `INSERT`/`UPDATE`/`DELETE` and have it do the right thing, without writing all the triggers myself.

I don't know how well other databases support all this stuff. I'm not complaining, I'm just describing my vision for what temporal support would look like. There is a huge amount of stuff the database could be doing for you automatically.

ngrilly10y ago· 4 in thread

In my experience, PostgreSQL misses just a few things compared to MySQL:

- Tables cannot be organized as a clustered index (this can be useful to control rows ordering at the storage layer).

- No builtin query cache (but I'm not sure it's very useful).

- It's easier to find hosted MySQL services like Amazon RDS Aurora and Google Cloud SQL.

- YouTube Vitess for MySQL is really great. I don't know if there is something similar for PostgreSQL.

I won't write about the endless list of PostgreSQL's advantages since it wasn't the question ;-)

NatW10y ago

FYI Amazon RDS now supports Postgresql: https://aws.amazon.com/rds/postgresql/

ngrilly10y ago

You're right. My comment was misleading. Amazon RDS supports PostgreSQL. What I meant is that it would be great to have Aurora' scalability and performance with PostgreSQL instead of MySQL.

bsg7510y ago

> - Tables cannot be organized as a clustered index (this can be useful to control rows ordering at the storage layer).

Although not actively maintained by the engine (you have to recluster on a schedule): https://www.postgresql.org/docs/current/static/sql-cluster.h...

ngrilly10y ago

Yes, but the table is locked during reclustering.

fiedzia10y ago· 4 in thread

First, postgres really is great. The few things I'll list here do not make me willing to replace it with something else, however:

1. Tooling is not on par with other databases. If you want nice GUI that holds your hand and does everything with one click - you may find offer for postgres lacking here. I don't care much about that, but some people do.

2. There is no "in memory" storage (but you can use in memory fs to get the same effect).

3. Extensions often define their own operators (that's good), which are made of some combination of +-*/>=~!@#%^&|`?. Yes. Your code may look like its perl. You can make "select foo @@>&!|~ bar from baz" to be valid query.

4. Finding hosting for postgres in not a problem, but its not something as universal as mysql.

Overall, it really is solid database. I've never been disappointed by postgres, while mysql is always a walk through a forest of wtf for me.

stray10y ago

2. There is no "in memory" storage (but you can use in memory fs to get the same effect).

    # Wouldn't this work?
    fsync=off
    synchronous_commit=off
    full_page_writes=off
    bgwriter_lru_maxpages=0

fiedzia10y ago

Hm, I've never tried that, might work. I've needed this for running tests, where I really don't care for reliability, so I've just set tablespace on ram fs. Creating new files would still cause fs writes, but it might not be an issue.

lastofus10y ago

I believe data would still be written out to disk (unreliably) during checkpoints with this config.

1 more reply

smt8810y ago

> If you want nice GUI that holds your hand and does everything with one click - you may find offer for postgres lacking here

DataGrip, DBeaver (free), HeidiSQL (free)

tmaly10y ago· 2 in thread

For me the only con is just upgrading when there are patches etc.

There are so many upsides to PostgreSQL. I build my food app on redis as I did not know what my schema was going to be. That worked great, but simulating the power of what you get with PostgreSQL is that much harder.

I am in the process of converting things back to PostgreSQL now that my schema has settled. The one thing I will probably keep in Redis is the geo location code as this feature works super well.

1515510y ago

Have you seen PostGIS?

ramtatatam10y ago

If I was to vote then updates are definitely on top of my ranking.

emailgregn10y ago· 2 in thread

Select count(*) takes a surprisingly long time, even on an indexed table. Apparently thus is because of the MVCC model and row count estimates are available to mitigate.

lastofus10y ago

My understanding is that this problem was alleviated with index-only scans in Postgres 9.2. Using these, each tuple no longer needs to be visited to check for visibility.

https://wiki.postgresql.org/wiki/What's_new_in_PostgreSQL_9....

malisper10y ago

Index only scans work in some cases but not all. In order for an index only scan to be more efficient than a regular index scan, the visibility map needs to be up to date. This is a problem because (currently) the only thing that updates the visibility map is the vacuum and the autovacuum only runs after a large number of updates/deletes happen on a table. If your table is largely read/insert only, the only way to keep the visibility map up to date is to manually call VACUUM on your tables periodically which is a huge pain.

collyw10y ago· 2 in thread

Doesn't have a bitcount function unlike MySQL, which I ended up needing for one problem.

jfrisby10y ago

Not built in, but you can do this:

  CREATE OR REPLACE FUNCTION bitcount(i integer) RETURNS integer AS $$
  DECLARE n integer;
  DECLARE amount integer;
    BEGIN
      amount := 0;
      FOR n IN 1..16 LOOP
        amount := amount + ((i >> (n-1)) & 1);
      END LOOP;
      RETURN amount;
    END
  $$ LANGUAGE plpgsql;

collyw10y ago

The reason for a bitcount was to check DNA sequences very quickly. Using a loop rather than a single CUP operation slowed it down a fair bit.

eliaspro10y ago· 1 in thread

No support for temporal data as in SQL:2011.

There were some attempts but none made it so far. See also: https://wiki.postgresql.org/images/6/64/Fosdem20150130Postgr...

dragonwriter10y ago

> No support for temporal data as in SQL:2011.

Clearly, it doesn't have support for SQL:2011 temporal syntax, but with range types and exclusion constraints, it seems to have fairly good support for doing the things you can do with SQL:2011's temporal features.

(And, having dealt with business data where multiple business [application] date ranges of relevance applied, I think that SQL:2011's "bitemporal" approach which provides special syntax for a single each system of system and application date/time range is in some respects creating an overly-specialized solution to a too-narrow subset of the temporal data problem.)

spacemanmatt10y ago· 1 in thread

Sometimes I wish PostgreSQL supported a shortcut syntax like Oracle's CONNECT BY. It's so much more compact than the equivalent CTE.

Oracle's installed base is the main reason my current gig is highly dependent on it. In so many cases, we have to integrate using Oracle "datapump" configuration. We'd look silly not running Oracle.

dragonwriter10y ago

> Sometimes I wish PostgreSQL supported a shortcut syntax like Oracle's CONNECT BY.

The commercial PostrgreSQL derivative from EnterpriseDB does support CONNECT BY, but since Oracle compatibility seems to be a big part of what they sell and SQL standards compliance is a big focus of the core Postgres product, I don't see it likely that that would get contributed back to the core. (And CONNECT BY is, IIRC, substantially more limited than standard recursive CTEs, so not having it, while it does hurt migration from Oracle's proprietary dialect, avoids people heading down the CONNECT BY path and then being forced to burn it down and start over with standard CTE syntax when they run into a wall. So there's a case that its better just to have standard CTEs.)

joshuak10y ago

I have found it unnecessarily distracting to evaluate reputable databases on incidental features. If an important feature is missing it will be added eventually, if you need a very specific feature then you already have your answer as to which DB to use. Unless you have a very specific reason not to use PostgreSQL as your SQL database then you should just use it.

I simplify the selection to the more general architectural value. Do I need SQL, NoSQL, or graph? What about deep JSON queries, embeddability, specific performance requirements, scalability?

Given that, I use PostgreSQL as the default choice for SQL/JSON db in all cases that don't require imbedding. SQLite for imbedding.

There is one big disadvantage, the same for all fixed schema index based databases. Scalability. Engineering effort rapidly goes to infinity as we asymptotically approach the throughput event horizon.

JelteF10y ago

The only thing I missed in Postgres after switching from MySQL was that text was not searched case insensitively, i.e. searching for "alex", would not match "Alex". However, this was easily solved by using the CITEXT type, which is exactly for that purpose, but you have to enable it as an extension [1].

Except for that the only thing that was a problem after switching is that Postgres is a bit more strict about some stuff, such as dividing by zore, which throws an error in Postgres but results in NULL in MySQL.

The main reason I love Postgres, and which is also why I switched, are the Database Definition language statements that support transactions (ALTER TABLE). This allows for much better migrations, if for some reason something fails you will not be in a state that is undefined.

[1]: https://www.postgresql.org/docs/9.1/static/citext.html

ForHackernews10y ago

It's not as widely supported as MySQL (though that is changing over time). Until very recently it lacked UPSERT functionality.

combatentropy10y ago

I've used it at work for 11 years. No real complaints. Love it.

Some people making just one app might look at SQLite, simply because it's simpler to set up. But those are my only two suggestions: PostgreSQL or SQLite.

x0x010y ago

Lack of query plan locking. I've had the query heuristics go bad and queries that used to take under 5 ms balloon to several seconds. It would be nice to be able to lock query plans that deliver satisfactory performance for very high volume queries.

It's a great db though, with very few bugs, high quality code, responsive devs, and relatively cheap support plans (like maybe $25k/year to get 24x7 support? I don't remember the numbers exactly) available from enterprisedb.com who have a bunch of core devs. In my one experience they knew their stuff.

crypto510y ago

No built in compression. Really big deal for reducing IO and DB size.

mrbuttons45410y ago

Something relatively minor I've noticed coming from MySQL, you can't change the column order of a table without recreating it.

bsg7510y ago

Compared to the big commercial engines, the one thing I miss is query parallelisim, but the initual stages are coming in v9.6: http://rhaas.blogspot.com/2015/11/parallel-sequential-scan-i...

bpineau10y ago

The stored data format isn't compatible between major versions.

Because of that, upgrades are either stressfull and cumbersome (ie. slony + switchover to a promoted and upgraded slave), or imply a large downtime (ie. pg_upgrade).

Also, because of that (WAL format), you can't use native replication between different major versions of PostgreSQL.

bjourne10y ago

It is not easy to bundle it. You can statically link to SQLite (afair, but you can ship the .so/.dll file to get the same result) and put the database in a "mydb.db" file which makes your program easy to distribute. Postgres requires a server to be running.

Other than that, Postgres is (IMHO, as always) the perfect RDBMS.

YuriNiyazov10y ago

Can't upgrade across major releases without downtime. (Supposedly is on the roadmap for next release or two)

Doesn't allow you to force the use of an index in querying.

Materialized views always do a full rebuild, instead of applying deltas.

No estimates for when a long-running query will complete.

joeclark7710y ago

It annoys me that I can't use uppercase letters in table or column names without wrapping them in nasssty quotation marks.

1515510y ago

Support for distribution (multi-master, etc.)

But this is for all of the right reasons.

j / k navigate · click thread line to collapse

42 comments

41 comments · 21 top-level

pjungwir10y ago· 4 in thread

I've worked with Postgres, MySQL, Oracle, and SQL Server, and Postgres is my go-to choice. I can think of a few cons though:

- Replication is harder to set up than MySQL. And there is no master-master replication like in MySQL. (Of course it also doesn't lose your data.... :-)

- I wish I could share a database between separate users, give them each their own schema, and not let them do `\dn` to see what other schemas exist. Again, very niche.

. . . Okay I thought I would have a longer list but I've run out of ideas. :-)

jfrisby10y ago

The schema visibility thing seems like it might be solvable with row-level security in 9.5? (I.E. apply constraints to the system catalogs, perhaps?)

The UPDATE one is good to know. Did you work around it by doing `SELECT ... ORDER BY id FOR UPDATE`?

pjungwir10y ago

Ha, I thought about RLS on system catalogs, but it's not supported. Here is a long thread about it: http://postgresql.nabble.com/Multi-tenancy-with-RLS-td586211...

It is also possible to revoke privileges on pg_namespace, but that breaks too many things for my taste (\dt e.g.).

dragonwriter10y ago

> I wish there were support for SQL:2011's bitemporal stuff.

pjungwir10y ago

ngrilly10y ago· 4 in thread

In my experience, PostgreSQL misses just a few things compared to MySQL:

- Tables cannot be organized as a clustered index (this can be useful to control rows ordering at the storage layer).

- No builtin query cache (but I'm not sure it's very useful).

- It's easier to find hosted MySQL services like Amazon RDS Aurora and Google Cloud SQL.

- YouTube Vitess for MySQL is really great. I don't know if there is something similar for PostgreSQL.

I won't write about the endless list of PostgreSQL's advantages since it wasn't the question ;-)

NatW10y ago

FYI Amazon RDS now supports Postgresql: https://aws.amazon.com/rds/postgresql/

ngrilly10y ago

You're right. My comment was misleading. Amazon RDS supports PostgreSQL. What I meant is that it would be great to have Aurora' scalability and performance with PostgreSQL instead of MySQL.

bsg7510y ago

> - Tables cannot be organized as a clustered index (this can be useful to control rows ordering at the storage layer).

Although not actively maintained by the engine (you have to recluster on a schedule): https://www.postgresql.org/docs/current/static/sql-cluster.h...

ngrilly10y ago

Yes, but the table is locked during reclustering.

fiedzia10y ago· 4 in thread

First, postgres really is great. The few things I'll list here do not make me willing to replace it with something else, however:

2. There is no "in memory" storage (but you can use in memory fs to get the same effect).

4. Finding hosting for postgres in not a problem, but its not something as universal as mysql.

Overall, it really is solid database. I've never been disappointed by postgres, while mysql is always a walk through a forest of wtf for me.

stray10y ago

2. There is no "in memory" storage (but you can use in memory fs to get the same effect).

    # Wouldn't this work?
    fsync=off
    synchronous_commit=off
    full_page_writes=off
    bgwriter_lru_maxpages=0

fiedzia10y ago

lastofus10y ago

I believe data would still be written out to disk (unreliably) during checkpoints with this config.

1 more reply

smt8810y ago

> If you want nice GUI that holds your hand and does everything with one click - you may find offer for postgres lacking here

DataGrip, DBeaver (free), HeidiSQL (free)

tmaly10y ago· 2 in thread

For me the only con is just upgrading when there are patches etc.

I am in the process of converting things back to PostgreSQL now that my schema has settled. The one thing I will probably keep in Redis is the geo location code as this feature works super well.

1515510y ago

Have you seen PostGIS?

ramtatatam10y ago

If I was to vote then updates are definitely on top of my ranking.

emailgregn10y ago· 2 in thread

Select count(*) takes a surprisingly long time, even on an indexed table. Apparently thus is because of the MVCC model and row count estimates are available to mitigate.

lastofus10y ago

My understanding is that this problem was alleviated with index-only scans in Postgres 9.2. Using these, each tuple no longer needs to be visited to check for visibility.

https://wiki.postgresql.org/wiki/What's_new_in_PostgreSQL_9....

malisper10y ago

collyw10y ago· 2 in thread

Doesn't have a bitcount function unlike MySQL, which I ended up needing for one problem.

jfrisby10y ago

Not built in, but you can do this:

  CREATE OR REPLACE FUNCTION bitcount(i integer) RETURNS integer AS $$
  DECLARE n integer;
  DECLARE amount integer;
    BEGIN
      amount := 0;
      FOR n IN 1..16 LOOP
        amount := amount + ((i >> (n-1)) & 1);
      END LOOP;
      RETURN amount;
    END
  $$ LANGUAGE plpgsql;

collyw10y ago

The reason for a bitcount was to check DNA sequences very quickly. Using a loop rather than a single CUP operation slowed it down a fair bit.

eliaspro10y ago· 1 in thread

No support for temporal data as in SQL:2011.

There were some attempts but none made it so far. See also: https://wiki.postgresql.org/images/6/64/Fosdem20150130Postgr...

dragonwriter10y ago

> No support for temporal data as in SQL:2011.

spacemanmatt10y ago· 1 in thread

Sometimes I wish PostgreSQL supported a shortcut syntax like Oracle's CONNECT BY. It's so much more compact than the equivalent CTE.

Oracle's installed base is the main reason my current gig is highly dependent on it. In so many cases, we have to integrate using Oracle "datapump" configuration. We'd look silly not running Oracle.

dragonwriter10y ago

> Sometimes I wish PostgreSQL supported a shortcut syntax like Oracle's CONNECT BY.

joshuak10y ago

I simplify the selection to the more general architectural value. Do I need SQL, NoSQL, or graph? What about deep JSON queries, embeddability, specific performance requirements, scalability?

Given that, I use PostgreSQL as the default choice for SQL/JSON db in all cases that don't require imbedding. SQLite for imbedding.

JelteF10y ago

[1]: https://www.postgresql.org/docs/9.1/static/citext.html

ForHackernews10y ago

It's not as widely supported as MySQL (though that is changing over time). Until very recently it lacked UPSERT functionality.

combatentropy10y ago

I've used it at work for 11 years. No real complaints. Love it.

Some people making just one app might look at SQLite, simply because it's simpler to set up. But those are my only two suggestions: PostgreSQL or SQLite.

x0x010y ago

crypto510y ago

No built in compression. Really big deal for reducing IO and DB size.

mrbuttons45410y ago

Something relatively minor I've noticed coming from MySQL, you can't change the column order of a table without recreating it.

bsg7510y ago

Compared to the big commercial engines, the one thing I miss is query parallelisim, but the initual stages are coming in v9.6: http://rhaas.blogspot.com/2015/11/parallel-sequential-scan-i...

bpineau10y ago

The stored data format isn't compatible between major versions.

Because of that, upgrades are either stressfull and cumbersome (ie. slony + switchover to a promoted and upgraded slave), or imply a large downtime (ie. pg_upgrade).

Also, because of that (WAL format), you can't use native replication between different major versions of PostgreSQL.

bjourne10y ago

Other than that, Postgres is (IMHO, as always) the perfect RDBMS.

YuriNiyazov10y ago

Can't upgrade across major releases without downtime. (Supposedly is on the roadmap for next release or two)

Doesn't allow you to force the use of an index in querying.

Materialized views always do a full rebuild, instead of applying deltas.

No estimates for when a long-running query will complete.

joeclark7710y ago

It annoys me that I can't use uppercase letters in table or column names without wrapping them in nasssty quotation marks.

1515510y ago

Support for distribution (multi-master, etc.)

But this is for all of the right reasons.

j / k navigate · click thread line to collapse