Hydra open source (https://github.com/HydrasDB/hydra)
Starting today, we are offering 14-day free trials of our cloud-managed Hydra databases. Click the "Get Started for Free" button on https://hydra.so/ to get one.
Power to the Postgres people!
Does that mean every column (line) is auto-indexed? Can you use that strategy for storing things like users?
Columnar is not ideal for a `users` table where you want to select and update specific rows, often in very small, quick transactions (OLTP). You would want to continue to use a traditional (heap) table in that case. You can certainly still do that with Hydra. Combining both kinds of tables is considered HTAP, which is a unique use case of our product.
To contrast, columnar is best for "fact" tables -- data about something that happened (thus it does not change) that will be analyzed in an aggregate way. Those might be logs, events, transactions, etc.
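To make the heap vs. columnar distinction concrete, here is a toy sketch in Python. It is not how Hydra or Postgres store data; plain lists stand in for storage, and the `events` table, its columns, and the function names are all made up for illustration. It only shows why single-row updates suit a row layout while aggregates suit a column layout.

```python
# Toy illustration only: Python lists stand in for on-disk storage.
# Hypothetical "events" fact table with columns (event_id, user_id, amount).
rows = [
    (1, 10, 5.0),
    (2, 11, 7.5),
    (3, 10, 2.5),
]

# Row (heap-like) layout: every record is stored together,
# so changing one record touches exactly one tuple.
def update_amount_row_store(table, event_id, new_amount):
    for i, (eid, uid, _amount) in enumerate(table):
        if eid == event_id:
            table[i] = (eid, uid, new_amount)
            return True
    return False

# Columnar layout: every column is stored together.
columns = {
    "event_id": [r[0] for r in rows],
    "user_id": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}

def total_amount_column_store(cols):
    # The aggregate reads only the single column it needs.
    return sum(cols["amount"])

def total_amount_row_store(table):
    # The same aggregate over rows has to walk every full record.
    return sum(r[2] for r in table)
```

In a real system the columnar win comes from reading far fewer bytes (and compressing each column well), which this sketch only hints at.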
One important recommendation: please do not run benchmarks on variable-performance storage (gp2 in this case). Its performance depends on the volume's burst-credit balance, so different benchmark runs or scenarios may get more or less performance.
Specifically, for 500GB you get the max throughput (250MB/s, which BTW is pretty low for an OLAP-style bench, in my opinion) but only 1.5K IOPS. See [1] for more information.
An easy alternative would have been gp3 volumes, which do not burst performance, and can be set to provide 16K IOPS and 1,000 MiB/s.
[1] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/general-...
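The 1.5K IOPS figure for a 500GB gp2 volume can be reproduced with a few lines of arithmetic. This is a rough sketch: the constants below (3 IOPS per GiB, a 100 IOPS floor, a 16,000 IOPS cap, bursting to 3,000 IOPS from a bucket of 5.4 million I/O credits) are my recollection of the gp2 rules in the AWS documentation, so treat them as assumptions and check them against [1] before relying on them. The function names are made up.

```python
# Assumed gp2 parameters (recalled from the AWS EBS docs; verify before use).
GP2_IOPS_PER_GIB = 3
GP2_MIN_IOPS = 100
GP2_MAX_IOPS = 16_000
GP2_BURST_IOPS = 3_000
GP2_CREDIT_BUCKET = 5_400_000  # I/O credits in a full bucket

def gp2_baseline_iops(size_gib):
    """Baseline IOPS scale with volume size, clamped to a floor and a cap."""
    return max(GP2_MIN_IOPS, min(GP2_IOPS_PER_GIB * size_gib, GP2_MAX_IOPS))

def gp2_burst_seconds(size_gib):
    """Seconds a full credit bucket lasts at the burst rate, or None
    if the baseline already meets or exceeds the burst rate."""
    baseline = gp2_baseline_iops(size_gib)
    if baseline >= GP2_BURST_IOPS:
        return None
    # Credits drain at the burst rate while refilling at the baseline rate.
    return GP2_CREDIT_BUCKET / (GP2_BURST_IOPS - baseline)

print(gp2_baseline_iops(500))   # baseline IOPS for a 500 GiB volume
print(gp2_burst_seconds(500))   # seconds of burst from a full bucket
```

Under these assumptions a 500 GiB volume has a 1,500 IOPS baseline and can sustain the burst rate for about an hour from a full bucket, so two benchmark runs can see different I/O performance depending on how many credits are left when each one starts.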
Great to know this is known. My recommendation still holds: publish results with gp3. Whatever others do (potentially wrong) shouldn't prevent you from doing it right.
I'll be taking a deeper look at the project.
On a related topic: an OLAP benchmark with a small dataset that fits in memory caters only to what I'd consider a small set of OLAP use cases. I'd love to see one with a large dataset much bigger than memory.
Also omits ClickHouse, despite using its benchmarking tooling?
Edit: Oh, because for some reason Redshift isn't considered Postgres-compatible in ClickBench. Still, 'the fastest' is 'greenplum' according to the full thing: https://benchmark.clickhouse.com/#eyJzeXN0ZW0iOnsiQXRoZW5hIC...
Redshift is multi-node, which puts it in a different category -- with considerably higher costs.
[1]: https://github.com/ClickHouse/ClickBench/blob/main/greenplum...
Despite what Amazon's marketing says, Redshift is not really a "fork" of Postgres.
To my knowledge they only used the SQL parser and the wire protocol from Postgres.
The optimizer, query executor and storage engine are totally different. The whole "Redshift is Postgres" is complete marketing BS in my opinion.
Anyway, does it really matter? What is someone looking for a fast 'postgres' for analytics actually interested in?
(I didn't realise this was just an extension, in which case I'm amazed it's possible, but that obviously makes it an easy sell if you're already running pg. But if you're shopping about for managed solutions (which is obviously what Hydra wants to sell) with 'postgres' as a criterion, you're interested in the query language and maybe the wire protocol, surely?)
How does it work for writing? I saw somewhere that the columnar tables are append-only. Are there any plans for storage that can be merged or updated into?
It is transactional, right? That part of Postgres doesn't change with the plugin.
Would you support a business model of "install in customer cloud and we provide support", or is your own cloud the only option?
The fact that you guys are supporting DBT natively is a HUGE plus for a data professional like myself.
Will definitely be kicking the tires soon & testing against my own workloads.
As I anticipated, ClickBench will become the de facto analytics benchmark, which is great.
Something to improve, IMHO, is the documentation regarding Cloud and Open Source. It is somewhat mixed together (the same happens in the ClickHouse docs).
As a Postgres user I would like to see: how can I install this extension in my current database? How do I migrate to the new tables?
Timescale does a really great job in this case.
Anyway, as I said: congrats and good luck!