Hmmm. Not a big lover of SQL, but this is a bit imprecise. While the unit of composability is slightly smaller for from=>where=>select (an expression) vs select+from+where (a subquery), in practice they both encode the same fundamental compositional principles, based on relational algebra. Any from=>where=>select query can be translated into a select+from+where query almost 1:1, if nothing else via:
from t ::= select * from t
q => where p ::= select * from q where p
q => select xs ::= select xs from q
Sometimes the select+from+where version ends up more verbose, and sometimes there is more brain-twisting involved in grokking a given select+from+where query, but that is not a composition-limiting factor. Granted, the readability of SQL is sometimes lacking, but it is fully capable of composing recursive relational-algebra queries.

Quite a few query languages are equally capable (including, surprisingly, quite a few so-called graph languages, provided the dialect provides a transitive closure), but SQL as a language has some undesirable properties (most of which are trade-offs for the fact that basic SQL is very easy to parse).
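The translation rules above can be checked mechanically. A small sqlite3 sketch (hypothetical table t with columns x and y), nesting one subquery per pipeline step:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER, y TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])

# The piped form `from t => where x > 1 => select y`, translated 1:1
# using the three rules: each pipeline step becomes one nesting level.
nested = """
SELECT y FROM (
    SELECT * FROM (
        SELECT * FROM t
    ) WHERE x > 1
)
"""
flat = "SELECT y FROM t WHERE x > 1"

# Both spellings are the same relational-algebra query.
assert conn.execute(nested).fetchall() == conn.execute(flat).fetchall()
print(conn.execute(flat).fetchall())  # [('b',), ('c',)]
```

The nested form is more verbose, as noted, but nothing about the composition is lost in the translation.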
Was QUEL also more usable and natural to people? As well as being more amenable to composition?
retrieve (a=count(y.i by y.d where y.str = "ii*" or y.str = "foo"), b=max(count(y.i by y.d)))
Not a particularly clear, 'jumps out at you' semantics:

* Are a and b aggregation functions or window functions? If aggregations, how do they compose if the 'by' scopes are different?
* What does max(count(... by ...)) mean? What is the aggregation (window?) scope of max?
* How would an outer where clause compose? What is the evaluation order?
In which context, "compostable" is a fantastic Freudian typo.
QUEL:
range of E is EMPLOYEE
retrieve into W
(COMP = E.Salary / (E.Age - 18))
where E.Name = "Jones"
SQL: select (e.salary / (e.age - 18)) as comp
from employee as e
where e.name = "Jones"
I would prefer an operator syntax that directly mimics relational algebra. Something like: w = employee(name == "Jones")[comp = salary / (age - 18)]
So () is "where", [] is "project" (choose or create columns), and you can use * for join and + for union. The result is a table with a column named comp.

One of my major complaints about SQL is that the syntax is so finicky that it is really hard to replace it with a [something -> sql] layer, because the something layer can't generate all the silly syntactic forms that SQL uses.
Eg, personal favourite, it is easy to have a dsl that translates
select(y = fn(x)) -> select fn(x) as y
that then breaks down because it can't construct ??? -> select extract(month from x) as y
and that is the only syntax the SQL database decided to understand. There are too many cases like that needing special handling, especially once SQL dialect-specific stuff comes into play.

> comp = salary / (age - 18)

is (strictly speaking) not part of the relational algebra, because it does not use set operations (like join or union). It was added to the relational model because we can hardly process data without such expressions.

A better way to formally describe such calculated columns is to introduce functions and treat them as first-class elements of the data model. In particular, function operations are better than set operations in these cases [1]:
- Calculating data using calculated columns. We describe a calculated column as a function without generating new relations (as opposed to select-from)
- Aggregating data. We can add an aggregate column without generating new relations (as opposed to groupby)
- Linking data. We can add derived link columns without generating new relations (as opposed to join)
This function-based approach to data modeling and data processing was implemented in Prosto [2] which demonstrates how many typical relational tasks can be solved using functions.
[1] Why functions and column-orientation? https://prosto.readthedocs.io/en/latest/text/why.html
[2] Functions matter! No join-groupby, No map-reduce. https://github.com/prostodata/prosto
Computing is inherently tied to math. There’s no getting away from that.
(and just to be clear, I'm neither for or against)
QUEL:
range of e is employee
retrieve into w (comp = e.salary / (e.age - 18))
where e.name = "Jones"
SQL: select
(E.Salary / (E.Age - 18)) as COMP
from EMPLOYEE as E
where E.Name = "Jones"
Now QUEL looks like a modern language while SQL is a jarring mess.

> To store the results of the retrieve in a new table, specify `into tablename` [1].
matching SQL:
create table w as
select (e.salary / (e.age - 18)) as comp
from employee as e
where e.name = "jones"
[1] http://docs.huihoo.com/ingres/9.3/QUELRef.pdf

I had:
a = employee(name == "Jones")        -- a has name, salary and age
w = a[comp = salary / (age - 18)]    -- w has comp

You want:

b = employee[name, comp = salary / (age - 18)]   -- b has name and comp
w = b(name == "Jones" && comp > 500)             -- w has name and comp
In one line:

w = employee[name, comp = salary / (age - 18)](name == "Jones" && comp > 500)

w = employee(name == "Jones")[comp = salary / (age - 18)](comp > 2000)

Same problem with declaring imports in javascript. Python got it correct, where I write e.g. 'from somemodule import ...', so by the time I get to the import the parser has enough info to help with autocomplete. Javascript's 'import { Something } from "foo"' means the parser can't help me autocomplete Something.
But, thanks to https://github.com/haskell-beam/beam I no longer need to worry about the COBOL-ness of SQL without giving up the semantics. Finally!
Edit: In case you actually do want to play with it, Ingres is now Actian IngresX. Though I'd recommend thinking about what could have been instead of actually spending any time fighting with and configuring Ingres.
QUEL is still there, but its functionality has been frozen for years and, in terms of bells and whistles, is way behind modern SQL.
If you do want to try it, I had no trouble installing it on my home Ubuntu system a couple of years ago. Ingres is pretty easy to use in some ways. If you want to make a database, you just type "createdb mydatabase" at the command line.
My previous employer used Versant by Actian (now called Actian NoSQL) heavily.
It's much more of a NoSQL database: it's a real object oriented database. You don't store tuples, you store objects. You don't make queries with selection and projection, you make a cut of a graph of objects.
It's insanely fast, multithreaded, has very good tooling, scales vertically very well, can do online schema evolution (class definition evolution, really).
Sadly it's almost impossible to scale horizontally (I'd be glad to be proven wrong).
It's basically what the industry needs to avoid the object-relational mismatch: an object oriented database.
But everybody only learns SQL...
As far as I can tell it supports QUEL.
Once again, California's absence of noncompetes plays a critical role in the success of a business.
As I said the last time this came around: The end of this isn't quite right. The Postgres project started in 1986. I don't recall what language it used, QUEL perhaps, but it wasn't SQL. SQL support was added between 1994 and 1996, and that's when PostgreSQL was born.
But the ending as it is written seems correct to me.
Some might call the behavior principled, rather than arrogant.
"In this paper, we started with the observation that Datalog engines do not translate across domains. We experimentally evaluated the advantages and disadvantages of existing techniques, and compared them with our own baseline, a general-purpose, parallel, in-memory Datalog solver (RecStep) built upon a rdbms.
"We presented the necessary optimizations and guidelines to achieve efficiency, and demonstrated that RecStep is scalable, applicable to a range of application domains, and is competitive with highly op- timized and specialized Datalog solvers."
vldb.org/pvldb/vol12/p695-fan.pdf (2019)
https://engineering.linkedin.com/blog/2020/liquid-the-soul-o...
https://engineering.linkedin.com/blog/2020/liquid--the-soul-...
People do not know the difference between popularity and merit. They don't realize that the reason we do things is because that is how we do things. And they put a lot of effort into rationalizing the way we do things, without realizing the subconscious psychological (rather than rational) basis for that.
Most people are fundamentally unable to question assumptions about technology or how the world works.
This is one reason why, even though I am rooting for the human species, I am doubtful we will be able to stay relevant for long as autonomous general machine intelligence is built and deployed.
Even with extensive augmentation, it's obvious that there are just severe limitations to the human mind.
The nice thing is that we have the opportunity to design successors that will not be limited in so many ways.
https://github.com/Tablam/TablaM
It has ideas similar to QUEL...
https://github.com/prostodata/prosto - Functions matter! No join-groupby, No map-reduce
DBAs of the day would spend a long time optimising physical storage of tables and building aggregation tables to make up for this performance deficit.
MySQL sort of solves this problem with a <=> operator, which I wish was the default for ORMs to use.
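For illustration, SQLite spells the same null-safe comparison `IS` (the standard's `IS NOT DISTINCT FROM`), which makes it easy to sketch why plain `=` silently loses rows (made-up table):

```python
import sqlite3

# Plain `=` uses three-valued logic, so NULL = NULL evaluates to NULL and
# the row is filtered out. SQLite's `IS` behaves like MySQL's `<=>`.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a TEXT, b TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [("x", "x"), (None, None), ("x", None)])

eq = conn.execute("SELECT COUNT(*) FROM t WHERE a = b").fetchone()[0]
nullsafe = conn.execute("SELECT COUNT(*) FROM t WHERE a IS b").fetchone()[0]

print(eq, nullsafe)  # 1 2 -- `=` misses the (NULL, NULL) row, `IS` matches it
```

An ORM defaulting to the null-safe form would make equality filters behave the way most application code expects.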
There are a lot of other minor nitpicks, but a lot of criticisms come down to the actual RDBMS, not SQL itself.
SQL’s shortcomings can be grouped into these categories:
- lack of proper orthogonality — SQL is hard to compose;
- lack of compactness — SQL is a large language;
- lack of consistency — SQL is inconsistent in syntax and semantics;
- poor system cohesion — SQL does not integrate well enough with application languages and protocols.
[1] We Can Do Better Than SQL: https://www.edgedb.com/blog/we-can-do-better-than-sql/
SELECT AVG(col1) OVER (PARTITION BY col2) AS partcol1
FROM tbl
WHERE AVG(col1) OVER (PARTITION BY col2) > 10.0
I wish I could just do: SELECT AVG(col1) OVER (PARTITION BY col2) AS partcol1
FROM tbl
WHERE partcol1 > 10.0
But I can't, because the WHERE clause is processed before the SELECT clause.

So if I have a bunch of these and want a convenient way to work with the named columns, I have to wrap them into a common table expression:
WITH cte AS (
SELECT AVG(col1) OVER (PARTITION BY col2) AS partcol1
FROM tbl
)
SELECT partcol1
FROM cte
WHERE partcol1 > 10.0
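The CTE workaround above, made runnable with sqlite3 (assumes SQLite >= 3.25 for window function support; the sample data is made up):

```python
import sqlite3

# The alias partcol1 can't be referenced in the same query's WHERE, so it
# is named inside a CTE and filtered one level up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (col1 REAL, col2 TEXT)")
conn.executemany("INSERT INTO tbl VALUES (?, ?)",
                 [(5.0, "a"), (15.0, "a"), (30.0, "b"), (40.0, "b")])

rows = conn.execute("""
    WITH cte AS (
        SELECT AVG(col1) OVER (PARTITION BY col2) AS partcol1
        FROM tbl
    )
    SELECT partcol1 FROM cte WHERE partcol1 > 10.0
""").fetchall()
print(rows)  # [(35.0,), (35.0,)] -- partition "a" averages to 10.0 and is dropped
```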
So wordy.

For example, your example would have been more naturally expressed as
SELECT partcol1
FROM (tbl WITH partcol1 AS AVG(col1) OVER (PARTITION BY col2))
WHERE partcol1 > 10.0

> WHERE col1 > col2

is wrong, and it'll break in terrible ways
and in the face of negation + NULLs, everything falls apart [0], giving you both false positives and false negatives in your answer set, while everything will look perfectly fine.
Every column being made NOT NULL is the only sane solution.
[0] https://databasetheory.org/sites/default/files/2018-05/03_pr...
That implies that comparison and boolean operators should be ternary, which would be ugly and confusing: but an ugly and confusing that reflects reality.
This would give us three "greater than" operators:
WHERE col1 > col2
WHERE col1 ?> col2
WHERE col1 >? col2
The first is always false for NULL, the second always picks the NULL column, the third never picks the NULL column. It doesn't seem coherent to order two NULLs, so that would always be false. I'm not attached to the syntax, which is intended to be illustrative.

The most important of such operators would be
WHERE col1 ?= col2
which coerces two NULLs to be equivalent. It says "yes, NULLs aren't comparable, but for this query, I want to treat them as equal". Because that is only usually true, not invariably so.

FROMCLAUSE * SELECTCLAUSE * WHERECLAUSE = SQLEXPRESSION
SELECTCLAUSE * WHERECLAUSE * FROMCLAUSE = SQLEXPRESSION
...
The issue is that not only does SQL syntax force an artificial order on these clauses, but these clauses cannot be decomposed to be used elsewhere. I cannot reuse a WHERECLAUSE or a SELECTCLAUSE in another expression.

Example:
select * from customers c inner join orders o on o.order_id = c.customer_id
Legal syntax but clearly incorrect.
[1] https://books.google.co.uk/books?id=t9ZQAAAAMAAJ&source=gbs_...
I agree, and I do wish SQL had stronger typing so the parser could warn you before your query silently runs off the rails. For example, in Oracle, I believe the following is legal, but I wish it wasn’t:
select *
from my tab
where 1 = '1'

If we had a more composable query language being used instead of SQL, I wonder if that would have affected the course of ORMs, which arguably end up being as much about composable models of queries as they are about actual object mapping.
And thus PostgreSQL was born."
And 35 years later PostgreSQL is kicking Oracle's butt at every corner.
The best thing about English is the lack of accent marks. Makes each glyph unique (other than casing). Sorts are faster and not ambiguous.
It's good, for some purposes, that English can be written conventionally by ignoring the accent in words such as résumé. But there are plenty of contexts, and I would say most of them, where this is going to bite you.
Bad metaphor: There is no evidence for Dvorak's technical superiority to QWERTY, neither is there for Esperanto over other languages.
Esperanto isn't intended to be superior. Its value is in it being equally foreign yet approachable for all the salient parties, and therefore a conceivable, acceptable neutral turf for everyone to share.
Ironically, the case for Dvorak keyboards is kind of the opposite: QWERTY was intentionally designed to avoid jams, which if anything biased towards making it more difficult.
It's not so much technical superiority as having a design objective that is more appropriate for the problem space.
One can similarly argue about whether "QUEL" is really technically superior to "SQL", but the design objective is (at least as perceived by the author) better aligned with the solution space.
Minor quibble that this is not true (or at least strongly disputed).
https://www.smithsonianmag.com/arts-culture/fact-of-fiction-...
This was not even attempted; Esperanto is a Romance language. Unless you think the only salient parties are Spain, France, Portugal, Italy, and Latin America, this "value" does not exist and was not a goal.
Outside Western Europe, Esperanto makes rather little sense. It's highly regular, which is nice — but, say, Japanese is also highly regular.
Today we can be tempted to frame it as European chauvinism, but that's just because of our expanded horizons as a global society. People who believed in Esperanto in the early days would likely share the same feeling now.
QWERTY isn't as bad as people make it out to be, and Dvorak isn't as good as people make it out to be.
There are keyboard configurations that are better than both, but nobody uses them because you'll never be able to use anybody else's keyboard.
This is a huge issue with programming in general not exclusive to SQL. Everyone would like to build programs that are modular and reusable but programming paradigms have been traveling in directions that prevent this from happening. Many people turn to design patterns or microservices to try to deal with this organizational issue but they fail to see that the lower level programming paradigm itself is the precursor to the problem.
In SQL the problem occurs in the statement itself. The WHERE clause or the SELECT clause cannot be reused anywhere else. I can't modularize a where clause and put it in another SQL statement. I have to rewrite the entire clause to reuse it.
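This is why the composition usually ends up in the host language instead. A hypothetical sketch of keeping WHERE fragments as (sql, params) pairs — the names here are illustrative, not a real library:

```python
# Predicates live in the host language as (sql, params) fragments, since
# SQL itself gives us no way to name and reuse a WHERE clause.
def AND(*preds):
    sql = " AND ".join(f"({p})" for p, _ in preds)
    params = [v for _, ps in preds for v in ps]
    return sql, params

# Reusable fragments, each usable in any number of statements.
is_active = ("status = ?", ["active"])
in_region = ("region = ?", ["EU"])

where, params = AND(is_active, in_region)
query = f"SELECT * FROM customers WHERE {where}"
print(query, params)
# SELECT * FROM customers WHERE (status = ?) AND (region = ?) ['active', 'EU']
```

Query builders and ORMs are, in large part, industrial-strength versions of this workaround.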
In OOP the same issue occurs. In OOP your class tends to contain methods that are not combinators, or in other words methods that modify a free variable. Because of this, like the SQL expression, objects cannot be decomposed either. I cannot reuse a setter in another class or anywhere else outside of the context of the free variable it modifies.
In both cases there comes a time in the future of an application where programmers realize that similar logic could be reused but structural problems are preventing the reuse from happening so they have to implement a hack to get around it.
The issue is that everyone is unaware of this trend at a low level. They are unaware that SQL lacks composability just like how they are unaware that OOP lacks composability as well. But they are aware of this issue at a higher level and they tend to call it "technical debt" or some high level design problem.
Case in point: https://news.ycombinator.com/item?id=24732789
Most commenters above talk about minor syntactical issues and fail to address what is not only IMO the main issue, but the main issue that the article itself is addressing. Likely because they're all unaware of the true nature of the composability issue and just didn't completely understand what the article was saying.
Also note that when I talk about composition in OOP I am not talking about "object composition." These are completely different usages of the word.
I want modularity in programs because they are large, and without proper abstraction, impossible to manage.
SQL queries are generally really short. Rarely more than a few lines (might be different for people doing ad-hoc analysis instead of making a db backed application). I don't need modularity for a program that is only a few lines long.
The lack of really natural integration into modern day programming languages (and data model mismatches) is a much bigger issue imo
The SQL data types consist of a few primitives like ints, strings, and chars, placed in higher-order data types that are tables. These types are easily isomorphic to data structures and primitive types in traditional programming languages. There is zero mismatch here; in fact the data structures in application programming languages tend to be much richer than SQL's.
See ORMs for reference. ORMs have their own set of problems, but data mismatches are not among them. If the programming language enables the creation of sum types, like Rust or Haskell, then there can never be a mismatch, as these languages can produce virtually any type.
>SQL queries are generally really short. Rarely more than a few lines (might be different for people doing ad-hoc analysis instead of making a db backed application). I don't need modularity for a program that is only a few lines long.
For complex applications this is not true. In general the classic model for web development is to place as much logic as possible into the SQL query and as little logic as possible into your heavy Python web app. The web app is supposed to serve as something that routes IO; the bulk of your code/logic and heavy lifting should be shifted to the database. Simple apps can avoid this, but in general complex apps cannot.
Case in point: do you put your where clause in the application program and then download the entire table? Or do you throw as much logic as possible into the query so the database outputs a result as close as possible to what you need? The latter is the right answer.
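A minimal sqlite3 sketch of that trade-off (made-up table): both approaches return the same rows, but the first ships only matching rows out of the database (and can use an index), while the second downloads the whole table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 50.0), (2, 500.0), (3, 900.0)])

# Predicate pushed into the query: the database does the filtering.
in_db = conn.execute("SELECT id FROM orders WHERE total > 100").fetchall()

# Predicate in the application: the whole table crosses the wire first.
in_app = [(oid,) for oid, total in conn.execute("SELECT id, total FROM orders")
          if total > 100]

assert in_db == in_app
print(in_db)  # [(2,), (3,)]
```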
>I don't really feel like composability/modularity is all that important in SQL.
You're not completely wrong. The bigger issue is SQL optimization. Because databases form the backbone of computation for a standard web app, SQL provides a layer of indirection that makes optimization harder. For C++, optimization is built into the syntax itself; you make choices while coding to optimize things. For SQL, you open up the black box and look into the query planner to see what your high-level code is compiling to. SQL is a really bad interface for doing optimizations, but that's a topic for another day. The topic of this post is modularity, and he's not wrong... SQL is not a composable language, and there's no performance loss in making it more composable.
I have a lot of "where my_func(x, y, z) = 1" type where clauses, so it seems that would do what you say, no?
Not even sure if stored procedures are part of the SQL standard... these seem to me to be just specific syntax additions added on by specific databases.
I mean it works so why not. I could code all my SQL this way.
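In SQLite, for example, such a my_func can be registered from the host language via create_function (a sketch with a made-up predicate; note the planner can't see inside it, so it can't use an index on x, y, or z):

```python
import sqlite3

def my_func(x, y, z):
    # Made-up predicate: returns 1 (match) or 0, as in the where clause above.
    return 1 if x + y > z else 0

conn = sqlite3.connect(":memory:")
conn.create_function("my_func", 3, my_func)
conn.execute("CREATE TABLE t (x INTEGER, y INTEGER, z INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)",
                 [(1, 2, 10), (5, 6, 10), (7, 8, 10)])

rows = conn.execute("SELECT x FROM t WHERE my_func(x, y, z) = 1").fetchall()
print(rows)  # [(5,), (7,)]
```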
SQL.
There is no reason you couldn't have a composable implementation of relational logic.
> The essence of the relational model is that names are known up front
That every relation consist of tuples each member of which consists of a name, a type, and a value is part of the model, sure. That doesn't impose restrictions on composability.
> an attribute of one relation is not an attribute of another.
An attribute of one relation may well be an attribute of another.
There's no reason you couldn't have a relational language that let you store and reuse clauses.
So for example the union operator can take two different relations that have the same columns but different rows and create a new relation.
In this binary operation that takes two different relations, the attributes of one relation (the columns) are indeed the attributes of the other relation.
See here: https://www.geeksforgeeks.org/basic-operators-in-relational-...
Look for "union" in the link above.
>Union on two relations R1 and R2 can only be computed if R1 and R2 are union compatible (These two relation should have same number of attributes and corresponding attributes in two relations have same domain)
It's the same thing for unary operations like select. A select expression can operate on different relations providing that the relations have the relevant attributes.
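Union compatibility is easy to see in a quick sqlite3 sketch (made-up relations):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE r1 (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE r2 (id INTEGER, name TEXT)")
conn.execute("INSERT INTO r1 VALUES (1, 'a')")
conn.execute("INSERT INTO r2 VALUES (2, 'b')")
conn.execute("INSERT INTO r2 VALUES (1, 'a')")  # duplicate of an r1 row

# UNION is a set operation on union-compatible relations: the shared
# row appears only once in the result.
rows = conn.execute("SELECT * FROM r1 UNION SELECT * FROM r2").fetchall()
print(sorted(rows))  # [(1, 'a'), (2, 'b')]

# Mismatched arity is rejected at prepare time, not a runtime surprise.
try:
    conn.execute("SELECT id FROM r1 UNION SELECT * FROM r2")
except sqlite3.OperationalError as e:
    print("rejected:", e)
```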