That's frightening they don't know that. So, burn this into your minds: no ordering is guaranteed in SQL unless an ORDER BY is applied to the outermost query (and any ordering in a subquery is lost). Even if it seems to work today, it will eventually fail. No guarantees on order unless... scratch that onto your eyeballs so you never forget it.
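A minimal sketch of the pitfall, using a hypothetical `events` table (the names are illustrative, not from the thread):

```sql
-- The inner ORDER BY is NOT preserved by the outer query;
-- the planner is free to return rows in any order it likes.
SELECT id, created_at
FROM (
    SELECT id, created_at
    FROM events
    ORDER BY created_at DESC  -- lost once the subquery is wrapped
) AS recent
LIMIT 10;

-- Correct: the ORDER BY belongs on the outermost query.
SELECT id, created_at
FROM events
ORDER BY created_at DESC
LIMIT 10;
```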
Also, will people please stop posting rainbow-on-black screenshots, especially when the screenshotted text is smaller than the surrounding text.
Or even better: have it as a connection/session parameter.
interesting! I like it.
> or appends a "random()" clause to an existing "order by" clause
That one I don't understand. If the ORDER BY is there, why randomise it? I guess
order by p, q, r, random()
then it sort of allows you to find out if it's been ordered sufficiently for your needs - is that it?

It's a common problem when moving from MySQL. We personally ran into it shortly after switching to Postgres, but it seems like something they should've discovered a lot earlier for sure.
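One reading of the trick, sketched with hypothetical column names: `random()` as the last sort key only shuffles rows that tie on the explicit keys, so it randomises exactly the part of the order the query never guaranteed.

```sql
-- If (p, q, r) does not uniquely order the rows, tied rows come back
-- in arbitrary, unstable order anyway. Appending random() reshuffles
-- exactly those ties on every run, so any hidden dependence on their
-- order breaks immediately in testing instead of sporadically later.
SELECT p, q, r, payload
FROM t
ORDER BY p, q, r, random();
```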
It would never have occurred to me to even try this, I would have assumed postgres would do as well as could be done at what looks like a very standard join. So I guess that's a lesson not to make assumptions?
Not sure, without even having a theory of why pg would behave this way, I'd be worried that as the data changes it might regress again. Or if the phase of the moon changes or something -- since it seems like magic!
In this specific case, I’d bet that having the FROM and JOIN tables reversed would be enough to get even better performance than 2 queries: `SELECT * FROM os JOIN o …`.
Frankly, it's such a simple case that I'm hesitant to pin the problem on Postgres. I'm now inclined to believe that the ORM/query builder they're using (which I'm not familiar with) is generating a query that does something unusual. I'm also having a hard time understanding the code, so it may just be a bad query in the first place. If the generated SQL was shown, it would be much easier to look into.
Generally speaking, if you're just joining some data onto the results of an expensive operation, it's exactly the same as doing a second query, just without the round trip and overhead.
The lesson here is to always look at EXPLAIN ANALYZE any time you're doing Postgres optimization work. There's a wealth of useful information that will often point you to your mistake. Databases are quite often extremely good at doing common-sense operations well.
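A sketch of what that looks like in practice, with made-up table names loosely echoing the snippet upthread:

```sql
-- Prefix the real query with EXPLAIN (ANALYZE, ...) to see the plan
-- the planner actually chose, plus per-node actual row counts,
-- timings, and buffer usage -- mismatches between estimated and
-- actual rows usually point straight at the problem.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM orders o
JOIN order_statuses os ON os.order_id = o.id
WHERE o.created_at > now() - interval '7 days';
```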
I can certainly see it when running into memory limits per table, because doing the full join might require more memory at once, but it frequently happens far below where I'd expect that to occur (in the low thousands of rows). Dunno why. Maybe it uses caches more efficiently? Many simple operations are often more cache-able than a few complex ones.
That and the automatic stemming and lemmatisation of search words even in phrase searches makes Postgres awful for any software where accurate search is critical.
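A quick illustration of the stemming point: with the built-in `english` configuration, both the document and the query are reduced to stems before matching, so there is no way to ask for the literal surface form.

```sql
-- 'running' is stemmed to 'run' at index time, so a search for the
-- exact word 'run' matches it anyway -- this should return true.
SELECT to_tsvector('english', 'she was running')
       @@ phraseto_tsquery('english', 'run');
```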
Oh well. The vast majority of forum posts don't hit that limit, so I just excluded the exceptionally long posts from the index. I never revisited it.
But Asian languages such as Thai, Japanese, Korean, etc. are not going to work at all.
PostgreSQL is weird about joins. Joining on certain columns can be super fast while others are dog slow. And this can flip depending on table size and index cardinality.
That’s why it’s important on databases that grow quickly to check the performance of even simple queries as those can balloon in execution time as the profile of the data changes.
It's, however, a database within a database. PostgreSQL does not natively support text search for these kinds of languages.
I understand that there are likely to be many significant differences in graphemes in these languages from Roman text but I'm not familiar enough with any Asian language to construct an example. Can you give an illustrative example that explains why the search doesn't work as well (or at all) in those cases?
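One illustrative sketch, under the assumption that the default parser is in play: many CJK languages write text without spaces between words, and Postgres's built-in parser splits tokens on whitespace and punctuation, so a whole run of characters comes out as a single lexeme.

```sql
-- Japanese for roughly "this is a test", written without spaces as
-- is normal. The default parser has no dictionary-based word
-- segmentation, so the entire sentence is treated as one token --
-- a search for any individual word inside it can never match.
SELECT to_tsvector('simple', 'これはテストです');
```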
Yes we join on indexed integers too.
The thing is, we believe that as the amount of data changes, the planner will choose different plans. Even after lots of analyzing and vacuuming.
It's happened so many times, and so randomly, that because of growth we really have to re-examine joins that were very fast 6 months ago and are now really slow.
You say that like it’s a bad thing. If the hunch works does that cheapen the effect? The why can come after. It often does.
Mastery isn’t pondering things faster, it’s using system 1 thinking for most of the process with some system 2 sprinkled on top. Or to use different terminology, intuition with a bit of high level executive function guiding things.
Or a lot of hunches and a little thinking.
Postgres should be able to do that join in a single query faster than two queries with the latency in between.
So why isn't it?
I use postgres FTS heavily on large datasets, have done a lot of performance tuning, so I was excited by the title. Unfortunately the article failed to deliver any useful information.
“From a technical perspective, ZomboDB is a 100% native Postgres extension that implements Postgres' Index Access Method API. As a native Postgres index type, ZomboDB allows you to CREATE INDEX ... USING zombodb on your existing Postgres tables. At that point, ZomboDB takes over and fully manages the remote Elasticsearch index and guarantees transactionally-correct text-search query results.”
https://docs.amplify.aws/cli-legacy/graphql-transformer/sear...
iirc pgbouncer saved the day there.
Otherwise it worked fine.
select * from query1 as q
where exists (
    select * from query2 as q2
    where q.col = q2.col
)