When I first saw PRQL on Hacker News a few months ago, I was immediately captivated by the idea, yet equally disappointed that there was no integration for PostgreSQL. Having previous experience with writing PostgreSQL extensions in C, I thought this was a great opportunity to try out the pgrx framework and decided to integrate PRQL with PostgreSQL myself.
The maintainers of both PRQL and pgrx were very nice to work with. Thanks guys.
Short answer: DX
Slightly longer answer: Developer productivity and experience, especially for EDA and interactively writing complex analytical queries.
Most people that have tried PRQL just find it more convenient to write their analytical queries in it. PRQL compiles to SQL so it can't express anything you can't already do in SQL, but you can probably express yourself much faster in PRQL.
Just try the following query in the online PRQL Playground (https://prql-lang.org/playground/) to find the longest track per album:
```prql
from tracks
group album_id (
sort {-milliseconds}
take 1
)
```How long would it take you to write the SQL for that?
Disclaimer: I'm a PRQL contributor.
from tracks
group album_id (
sort {-milliseconds}
take 1
)
Editorializing:Data query specification is all about getting the details right. This does not look simpler than the corresponding SQL to me though. All components must be present -- scope, group, limit, order.
SQL, for all its faults, is generally succinct at incorporating the required details. The PRQL sample here is succinct as well, but to me at least, not differentiating.
I don't want to appear rude, but unless I'm missing something, this is a pretty simple SQL query, of the kind anyone with mimimal SQL experience could write off the top of their head in seconds.
I like the idea of PRQL, but I think a better example is needed to sell it.
This idea seems like it'd be better as an editor plugin that lets you write shorthand and have it automatically expanded into correct SQL rather than as a build time thing.
select distinct on (album_id) *
from tracks
order by album_id, milliseconds desc;That should work
Look I’m a diehard SQL just use it guy but open to improvements. But I’m loathe to use abstractions for things when the underlying thing is so expressive.
Autocomplete of fields in a good editor, schema help, etc go a long way to making SQL being written raw very nice.
select
album_id
,first_value(id order by milliseconds desc) as longest_track_id
from tracks
group by album_id
I agree the PRQL's pretty nice here, but I think such a generalised example (chances you actually want to `select *`?) overstates the advantage.Currently much of my complicated SQL is generated by a LLM.
In most languages it's easy to pull out functions.
In SQL you end up with a giant hard to comprehend mess.
I think the underlying relational concepts in SQL are sound but I'd love to see ideas like PRQL that aim make SQL easier to write and maintain.
Stored procedures and functions are nice but don't allow the basic idea of breaking a large query apart into smaller logical components.
The thing that's lacking right now is the tooling for managing/testing/deploying database code. There are solutions out there and the supabase folks have been working to make things better but database first development still has some hurdles in terms of DX.
It's like complaining that you have to know the variable name you're going to assign something to before you start writing the expression that will set the value.
int y = x * 2;
The idea is to evaluate the expression and store it, but the expression doesn't actually read that way left-to-right. Wouldn't it make more sense for it to be: x * 2 assign to new int y;
Technically, that's written more in execution order. In practice it just isn't that big of a deal. It only trips up beginners.SQL is also quite verbose in places (JOINs are the most trivial example), and lack a decent amount of abstraction (CTEs are relatively low level).
Updating a large set of FK'd tables can be a nightmare (this is what ORMs shine at).
Finally, some modern additions are quite unreadable, Postgres' JSON syntax, for example.
I'm not saying that PRQL solves any of the above, but these are all legit problems with "plain" SQL.
CTEs are a first step in structuring queries to make them decomposable. You can extract CTEs to functions and mark them stable and it's logically equivalent to the original query.
Whenever I am writing SQL I am not thinking in SQL, but I am thinking in what I consider to be the mathematical sound way, which I translate into SQL while writing. I consider thinking in SQL a much greater mental handicap than having to translate mentally into it.
I would prefer to write directly in what I would consider as a good query language and have it translated automatically into SQL, for compatibility with what is, for unfortunate historical reasons, the standard.
I have not attempted previously to do or use something like this, but work like that discussed here seems like a step in the right direction.
I don't think it's suggested that this replaces SQL. Use the right tool (and abstraction) for the job?
Actually why limit yourself with SQL...?
PRQL is a language compiled into SQL and makes certain hard-to-do things in SQL easy purely because it allows to streamline operations which SQL needs CTE joins or whatever hoop jumping to solve.
My favorite example which sounds easy but isn't - select the row which is MAX(...).
Because it's everywhere, has extensive documentation and tutorials, all database tools support it, all relational engines support it, some non-relational engines support it, all programming languages have library support for it, it can be accessed through command line tools as well as graphical interfaces, etc.
You think an industry is going to give up 50 years of infrastructure because some (typically junior) devs think the syntax is "kinda icky"?
> My favorite example which sound easy but isn't - select the row which is MAX(...).
If you look earlier in the comments you will see queries that have "DISTINCT ON" in them. It solves the problem that sounds easy, but actually is pretty easy if you know SQL.
2. Propose newer, simpler language to take care of these
3. Newer, simpler language lacks features of original language
4. Newer language adds features, making it more complicated
5. GOTO 1
I'm working on a new language that compiles directly to Postgres' post-analysis structs. It's working out pretty well so far, but my chosen "universal set" (aggregation/array/subquery/... as one thing) semantics are sometimes a pain to encode.
For example, I want to have universal broadcasting of operators on subquery results, array values, and aggregated columns. To do this, I need to know which of these the operand expressions represent, which is slow or impossible with transpilation.
PRQL and EdgeQL (EdgeDB) are the most interesting ones to watch how they evolve, though.
I've also written a PG extension to make jq available in Postgres [0]
I believe Postgres, in general, will flourish as a host for DSL languages [1].
0: https://github.com/Florents-Tselai/pgJQ 1: https://tselai.com/pgjq-dsl-database.html
We'll soon be announcing some interesting developments on that front, stay tuned :)
"A jaw-dropping amount of effort has been spent attempting to bridge the gap between the relational paradigm of SQL and the object-oriented nature of modern programming languages. EdgeDB sidesteps this problem by modeling data in an object-relational way."
All the best to the team. I however truly hope this isn't the direction the industry moves toward. I thought we learned our lesson from MongoDB. I still believe data is best modeled in sets, not objects.The solution isn't for databases to become more like object stores but for general purpose programming languages to be more amenable to seamless access of set-oriented data.
More stuff like this:
This is one of the great HN mysteries to me, and if anyone can shed some light on it, it would be much appreciated.
Could be wrong though, just writing this from memory.
It's confusing, but greying out is used on HN for a single purpose: to discourage reading (and therefore writing).
Low-quality comments are greyed by downvotes from other users and moderators.
Text posts (including Ask and Show HN) are greyed automatically.
Ask/Show posters are encouraged to post a comment on their own story, and to let that comment rise or fall according to its up/down votes.
Meta-meta: Your comment might be downvoted for being meta to the post. It looks like it has already been "detached" from the comment tree so that it appears at the bottom instead of responding to up/downvotes. This is actually protective of your karma and this conversation. Your question is valid, but it's not germane to the post. If your comment was allowed to float to its normal location, it would be downvoted by others who considered it off-topic.
Is it a new tool with great new powers or is it just syntactic sugar?
Personally, I was very excited about using it to write some complex queries in my application that does some fancy backtesting with sliding windows etc, but I reverted back to SQL pretty quickly because I found myself first thinking in SQL and translating back to PRQL :/
New syntax is nice, but it means that analysts and engineers need to learn something new and are more likely to make mistakes that could bubble up to production. There's always an argument to be made why shiny new tool XYZ is better, but unless it's 100X better, organizations are reluctant to switch from something like vanilla PostgreSQL that they know works 100% of the time.
The intention is rather for it to be simpler, as it uses a linear direction of data handling. SQL jumps back and forth with its order of operations and can be confusing in this way.
PRQL also has a more modern syntax that reuses more universal concepts with fewer keywords to learn. In contrast to SQL which has a unique keyword, syntax, and behavior for everything.
> Windows is not supported. It could be, but will require a bit of work with cargo-pgrx and figuring out how to compile pgrx's "cshim" static library.
PRQL as a DuckDB Extension - https://news.ycombinator.com/item?id=39130736 - Jan 2024 (47 comments)
PRQL: Pipelined Relational Query Language - https://news.ycombinator.com/item?id=36866861 - July 2023 (209 comments)
Calculate the Digits of Pi with DuckDB and PRQL - https://news.ycombinator.com/item?id=35153824 - March 2023 (1 comment)
One Year of PRQL - a modern language for relational data - https://news.ycombinator.com/item?id=34690560 - Feb 2023 (1 comment)
PRQL: a simple, powerful, pipelined SQL replacement - https://news.ycombinator.com/item?id=34181319 - Dec 2022 (215 comments)
Show HN: PRQL 0.2 – a better SQL - https://news.ycombinator.com/item?id=31897430 - June 2022 (159 comments)
PRQL – A proposal for a better SQL - https://news.ycombinator.com/item?id=30060784 - Jan 2022 (292 comments)
PostgreSQL's query-optimiser does handle these cases quite well for me once I explain and add the appropriate indexes, yet complex source queries carry undiscountable costs (longer planning times, missed optimisations e.g. predicate pushdowns).
I find myself needing to mechanically transform and simplify SQL every now and then, and it hardly seems something out of reach of automation, yet somehow I've never been able to find software that simplifies and transforms SQL source-to-source. When I look, I only find optimisers for SQL execution plans. It's a bit hard to believe that such a thing doesn't exist, given how significant the SQL ecosystem is.
> PRQL allows for powerful autocomplete, type-checking, and helpful error messages (in progress)
Without some kind of autocomplete though I'm a lot less motivated to do so.