Really, the big thing that I see here is the transaction retry logic, which (if the writes in your transaction depend only on the reads and not side effects, which you'd need to declare) would be a nice feature for any DB engine, and (if the side effects are packaged together in a safe way with the relevant reads and writes, which can be more general) would also be a nice feature for a high-level DB client library, independent of backend. But that's a pretty thin reed on which to hang a dependency on a proprietary SaaS solution over, say, Postgres.
Fair enough. The goal of the post wasn't necessarily to comprehensively describe the product, so the specifics of what we provide are found elsewhere in our docs, etc.
> Really, the big thing that I see here is the transaction retry logic,
These transactions run as embedded functions inside Convex in a runtime where it is impossible to have side effects.
There's no way to express data-structure knowledge in SQL, only relations and keys. Discoverability can be quite lacking. And every time I want to join table A to B I have to re-define everything, because SQL doesn't store that.

ORMs help. They help because they encode relationships in meaningful ways. A has many Bs, so A.B works, and I don't need to repeat this join logic every damn time. But ORMs have downsides too. Sometimes they generate queries that are really sub-optimal.
I think my favorite ORM usage was with Hibernate. I KNOW!!! THAT DEVIL! But honestly we wrote SQL in hibernate, and then invoked it to populate our data. Yes it was a bit more work than RoR's "order.items.where("price > 40")" but when complex things happened, it was always easier.
I've always looked at nosql as "absolutely, sounds great, how do you represent relationships?"
Relations, keys, and (though you forgot them) constraints express…quite a lot of data structure knowledge.
> And every time I want to join table A to B I have to re-define everything, because SQL doesn’t store that.
SQL stores that if you tell it to; the usual way being CREATE VIEW.
For example, you have a many-to-one relationship between posts and users. Instead of this:

    select *
    from user as u
    join post as p
      on p.user_id = u.user_id;

you could do:

    alter table post
      add constraint fk_user_id foreign key (user_id)
      references user (user_id);

    select *
    from user as u
    left referent join post as p;
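The CREATE VIEW route mentioned above can be sketched with Python's built-in sqlite3 module; the table, column, and view names here are illustrative, not from the thread. The point is that the join condition is written once, in the schema, and every caller reuses it:

```python
import sqlite3

# In-memory database; schema names are made up for the demo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (post_id INTEGER PRIMARY KEY,
                       user_id INTEGER REFERENCES user(user_id),
                       title TEXT);
    -- The join is defined once and stored in the schema.
    CREATE VIEW user_posts AS
        SELECT u.name, p.title
        FROM user AS u
        JOIN post AS p ON p.user_id = u.user_id;
""")
conn.execute("INSERT INTO user VALUES (1, 'alice')")
conn.execute("INSERT INTO post VALUES (10, 1, 'hello')")

# Callers query the view; nobody re-types the join condition.
rows = conn.execute("SELECT * FROM user_posts").fetchall()
print(rows)  # [('alice', 'hello')]
```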
Any good SQL auto-completer will also look up the foreign key information and auto-generate the ON clause for you as well. Redgate SQL Prompt (MSSQL only) is one of the best tools out there for this reason.

https://www.postgresql.org/docs/current/queries-table-expres....
No, “using” doesn’t automatically join by foreign key, it joins by explicitly-provided column name which must be identical between the two tables.
> If you didn’t need to specify the column then adding a new foreign key between the tables would make existing queries ambiguous and break backwards compatibility.
Not if you specified with constraint name rather than the column name (which also works for multi-column foreign keys without having to reiterate all the columns, being much more DRY than current SQL USING.)
Better, with that approach you could also allow fully implicit joins, using the foreign key constraint name as if it were a row-valued field in the referencing table.
E.g.:
    CREATE TABLE employees (
      id BIGSERIAL PRIMARY KEY,
      name VARCHAR,
      manager_id BIGINT CONSTRAINT manager REFERENCES employees
    );

would let you do:

    SELECT name as employee_name, manager.name as manager_name
    FROM employees

as syntax sugar for:

    SELECT ee.name as employee_name, mgr.name as manager_name
    FROM employees ee
    LEFT JOIN employees mgr ON (
      ee.manager_id = mgr.id
    )

Meanwhile, the obvious natural way to make joins isn't available.
A richer type system would help there, but to my knowledge this isn't offered by any major relational DBMS.
Postgres allows you to define custom types at least; making some enums or composites might give you some measure of sanity?
The Ada programming language is known for encouraging use of specialised numerical types, where the range can be expressed and automatically checked at runtime, and the compiler can help protect against nonsensical comparisons or arithmetic. At a glance I think Postgres range types offer something similar. Presumably strong typing could also be achieved with Postgres composite types (of just one member).
It would be useful to have such checks so that a nonsensical comparison like WHERE person1.height = person2.net_worth would result in an error. Presumably it could also be helpful for autocompletion.
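A rough, database-free analogue of those Ada-style distinct numeric types can be sketched in Python; the `Height` and `NetWorth` names are hypothetical, mirroring the person1.height = person2.net_worth example. Each wrapper only compares against its own kind, so a nonsensical comparison raises instead of silently evaluating:

```python
# Ada-style distinct numeric types, sketched at runtime: comparing a
# Height to a NetWorth is an error, not a meaningless False.

class TypedValue:
    def __init__(self, value: float):
        self.value = float(value)

    def __eq__(self, other):
        if type(other) is not type(self):
            raise TypeError(
                f"cannot compare {type(self).__name__} "
                f"with {type(other).__name__}"
            )
        return self.value == other.value

class Height(TypedValue): pass
class NetWorth(TypedValue): pass

assert Height(180) == Height(180)   # same kind: fine
try:
    Height(180) == NetWorth(180)    # mixed kinds: rejected
except TypeError as e:
    print(e)  # cannot compare Height with NetWorth
```

In Postgres the closest built-in tools would be domains or one-member composite types, as the comment suggests.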
Wow.
And now both of us are back on non-SQL datastores: Convex for him, and Nomad's combination of Raft+MemDB for me. While Convex sounds influenced by transactional memory (define your critical section and let the platform keep trying to apply it until it succeeds), Nomad opted for Raft's serialization of mutations. On the read side we both opted for explicit index use, so it seems fair to assume both of us feel like SQL's decoupling of indexes from queries does more harm than good. Query planners are great until they're not.
I would love a SQL (well SELECT) interface to Nomad's datastore (MemDB). SELECT is like the C ABI for data instead of code: imperfect, but a lowest common denominator that if implemented unlocks a ton of workflows.
I wonder if jamwt feels the same and both projects will eventually grow a SQL adapter (or at least a SQL-like DSL... Nomad is close with bexpr filtering on a number of APIs, but JOINs would be really useful).
SQL is the C ABI of querying for sure. BI tools will never adapt to use Convex directly, and nor should they.
So... yes, Convex actually had a prototype SQL adapter for the read side of things back in the early few months when we were figuring things out. Convex's read semantics are very compatible with SQL.
We've kept this adapter on ice in part because of point #3 in the article -- we don't want to overpromise things which are a bad idea.
Meaning, if we exposed SQL on the thing as-is, this would presumably be for more analytical type queries involving patterns normal Convex queries can't express. Right now that would be a Bad Idea because your website would slow down just like every other database system allows you to.
So the current recommended practice is to use our Airbyte Egress connector (https://airbyte.com/connectors/convex), get yourself into an offline ClickHouse/MySQL/Snowflake/whatever, and jam SQL over there to your heart's content. That's basically what we do.
We may one day abstract this away by embedding some sort of mirrored column store sql thing (maybe DuckDB based? who knows) so you can do your analytical work without impact on your OLTP or integrating 3 more systems. But the team hasn't invested in that yet and probably won't for some time.
> Convex actually had a prototype SQL adapter
Ha, this sounds like the heaps of scripts we Nomad developers have lying around for converting ad hoc bits of Nomad state to sqlite for debugging particularly stateful problems.
> So the current recommended practice is use our Airbyte Egress connector
Ah this is where we would like to end up with Nomad's event stream API: https://developer.hashicorp.com/nomad/api-docs/events#event-...
Sadly we've been unable to identify that First Sink to implement and open source. It's a lot of work to build and maintain production quality connectors, so it would be nice to do just one like your Airbyte Egress. This is an area where Nomad's onprem nature makes it hard to pick one solution that will work for a wide number of users. Someday!
Haskell: IO -> STM -> (pure)
Convex: (The browser / Convex Actions / other effectful environments) -> Mutations -> queries
All the same benefits re: retry and memoization.
Steal, steal from Haskell; so many great ideas there.
It may actually be fine to just require everything to be typescript, but the idea that you'd require your application be written in the same language as your data store, and thus implement a different data store for each language you might want to write an app in (and not share between languages)... would formerly be thought of as pretty ridiculous?
But maybe it's not anymore?
But that seems to be the requirement for "the application and database types are automatically equivalent because the entire data pipeline from frontend to database uses the same exact types and definitions," yes?
In general, most of the purported benefits of convex seem to effectively require the db in the same language as the app, to make it all so seamless.
I guess we'll see if Convex can get popular enough to spawn lots of "what I wish I knew about convex a year ago" and "Goodbye Convex, hello Postgresql" posts...
> It may actually be fine to just require everything to be typescript, but the idea that you'd require your application be written in the same language as your data store, and thus implement a different data store for each language you might want to write an app in (and not share between languages)... would formerly be thought of as pretty ridiculous?
The vision is definitely aspirational, and reflects the fact that JS/TS is becoming overwhelmingly the most popular language for projects. With things like React Native, for many teams even mobile apps are taken care of in one language.
There will obviously always be apps written in other languages, and in those cases, a bit more type mapping will be necessary, because (for now) Convex functions are always side-effect free, automatically cached JS/TS running in a very particular runtime. But we'll work on making that translation as smooth as possible in our client libraries.
We have a Python client library out now ( https://pypi.org/project/convex/ ) and a Rust one coming soon, just to keep us honest in this respect.
> I guess we'll see if Convex can get popular enough to spawn lots of "what I wish I knew about convex a year ago" and "Goodbye Convex, hello Postgresql" posts...
Well, that's the dream, isn't it?
What about "all write operations on the database are done in embedded functions"? For the python and rust client libraries, do you write these embedded functions in python or rust, or typescript/JS? Do you effectively have to write a specialized update procedure in typescript for every update you want to do? This starts seeming harder to spin as an advantage if you aren't typescript in the first place. ("A custom stored procedure required for every update" is actually not something anyone's been looking for!)
The OP arguments seem to me to be written assuming you are writing typescript in your app, it's unclear to me how the advantages you claim translate if you are not -- such that I figured the product was actually only for typescript apps! It would be interesting to see a rewrite of it that explicitly targets those writing in, say, python or rust.
But no, not necessarily exclusively TypeScript. For example, the existence of the Python client library is due to developer demand. Some users had ML jobs that are triggered by Convex applications or reported outcomes into Convex.
The embedded functions are always TypeScript. The experience is pretty smooth in other languages as well, but the degree of this largely depends on how easily the type systems map from e.g. Python to TypeScript, and on how well our client library can infer things in a way that feels native or requires little of your involvement.
Underlying vision or assumption that we’ll all be using JavaScript/TypeScript in any case: Freaking really??
My limited experience with LINQ to SQL was very positive in these regards - but one is not writing a type safe SQL at all, instead a language that is type safe and is ultimately interpreted as SQL.
Same experience with some libraries in Scala - where it's possible to get compile-time type safety for SQL - but required onerous setup or synchronization/code generation tools to achieve. (Sort of comes with the territory.)
Have other implementations done anything for making queries easier (that don't involve requiring an IDE or anything, so just query language innovations)?
Edit: Not query language innovations but ClickHouse and Snowflake's "standard library" of builtin functions just keep getting better and better. Once I saw `parseDateTimeBestEffort` in ClickHouse I wanted that everywhere...
Another interesting bit of work in this space is LINQ on the Microsoft side of things. And even list comprehensions end up having a scanning/summing-language kind of feel to them that would be interesting to see translated into a new database query language.
This is a false statement. Both INSERT and UPDATE support JOINs and subqueries / CTEs. At least according to the standard - not every engine implementing them is another story.
Which don't? I'd have assumed anything inside of `SELECT`'s `FROM` would be allowed inside of `INSERT` and `UPDATE`.
Or maybe you're not saying you know there are implementations that have these restrictions just that any random implementation might not be there (yet).
Redshift (a PostgreSQL derivative, more or less) can run an INSERT with a join, but cannot do an UPDATE with a join.
Also, regarding the transaction functions, are those run locally, or are they serialized and run remotely on the database server? Both have their caveats.
Number 1 (and that approach to retries) already works just fine with Postgres, you just need a proper library (which wraps your function and retries on transaction isolation errors). But you also have to keep in mind that if you interface with 3rd party services during your transactions, those calls will need to be idempotent as well. This is actually the proper way to do transactions in application code with a SQL database, esp. if you're running with serializable transaction isolation.
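A minimal sketch of that wrap-and-retry pattern, independent of any particular driver: `SerializationError` here stands in for the driver's real serialization-failure exception (Postgres reports these as SQLSTATE 40001), and all names are hypothetical.

```python
import random
import time

class SerializationError(Exception):
    """Stand-in for a driver's serialization-failure error (SQLSTATE 40001)."""

def run_in_transaction(txn_body, attempts=5, base_delay=0.01):
    """Re-run txn_body until it commits or attempts are exhausted.

    txn_body must be free of external side effects (or make them
    idempotent), since it may execute more than once.
    """
    for attempt in range(attempts):
        try:
            return txn_body()
        except SerializationError:
            if attempt == attempts - 1:
                raise
            # Exponential backoff with jitter before retrying.
            time.sleep(base_delay * (2 ** attempt) * random.random())

# Demo: a body that conflicts twice before succeeding.
failures = iter([True, True, False])
def body():
    if next(failures):
        raise SerializationError()
    return "committed"

print(run_in_transaction(body))  # committed
```

A real library would open and roll back the database transaction around each attempt; the retry loop is the part that carries over unchanged.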
Number 2 is very limiting, as you can't have all those third party services and libraries used between different operations in a transaction, which is often (I'd even argue - usually) very useful.
Since you're citing stored procedures as a viable alternative to convex's take, while not listing the above number 1, it sounds like you're doing 2.
Anyhow, good luck!
SELECT preceding FROM is such a thorn in the side. :(
    SELECT FROM sources ... WHERE / ORDER / ETC ... DATA fields ...
What if DATA was a reserved keyword for column identifiers and other data fields that would normally be immediately after SELECT so they could appear anywhere in the query syntax?
Can you imagine how much time and effort that would save people?
Edit: You do need commas in ORDER BYs, that slipped my mind when typing out this pet peeve of mine.
    select
      1
      ,field1
      ,field2
      ,field3
    from ...

You've hit peak SQL-engineer comma-frustration workaround.

> Can you imagine how much time and effort that would save people?
I can imagine how much extra time I'd spend typing "AS" in queries, which would be necessary for column aliases to be distinct from new select items. And how much more time I'd spend reading unreadable queries in contexts like logs where they aren't pretty-printed. Net, it seems to be a big loss.
> We don’t need commas in between joins.
That’s because each join is introduced by a string of one or more reserved words (including JOIN).
If each select field was prefaced with FIELD you wouldn't need commas, but...
Also, I seem to recall that very old versions of Oracle did require commas between joins. (The join conditions had to be stated in the WHERE clause back then, instead of supporting the ON clause, iirc).
There are indeed times when a simple key-value store is appropriate, but the encoding of hierarchy and the relational calculus are useful for so many more use cases in business applications. SQL as a language is certainly flawed - it would be nice to be able to algebraically/programmatically join select clauses and conditions, for instance, or to perform multi-inserts - but the semantics of the storage and engine are based on some pretty mathematically sound foundations.
You may find the seminal paper by Codd as illuminating as I did if you read it.
I think any SaaS APIs that reach any degree of widespread adoption eventually get OSS implementations (eg S3's API is widely supported by FOSS and proprietary implementations). This evolutionary path is awfully similar to SQL's for better and/or worse.
Agree this is a necessary part of the change. Convex is working on our OSS strategy this year. Thanks for the feedback!
It's not new "in the world", but often new to our users, who understand it less intuitively than pessimistic locking.
Unless they're using Haskell or probably other very exotic environments with strict understanding of mutability and so on, they know Mutexes but not STM or other OCC patterns. So we lean heavily on explaining it. Not because it is original, but because it is unfamiliar.
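For readers in the mutex camp, the optimistic pattern can be sketched without any database at all (every name below is hypothetical): instead of locking, read a version, do the work, and commit only if nobody else advanced the version in the meantime; otherwise re-run from the top.

```python
class VersionedCell:
    """A single value guarded by a version counter instead of a lock."""
    def __init__(self, value):
        self.value = value
        self.version = 0

    def read(self):
        return self.value, self.version

    def try_commit(self, new_value, expected_version):
        # In a real store this compare-and-set would be atomic.
        if self.version != expected_version:
            return False  # conflict: someone else committed first
        self.value = new_value
        self.version += 1
        return True

def occ_update(cell, fn):
    # Optimistic loop: compute off a snapshot, retry on conflict.
    while True:
        value, version = cell.read()
        if cell.try_commit(fn(value), version):
            return cell.value

cell = VersionedCell(10)
print(occ_update(cell, lambda v: v + 1))  # 11
```

No waiting on a lock in the common case; contention shows up as retries instead of blocking.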
The elephant in the room is, I suppose, that the modern internet literally would not be possible without NoSQL. It may be possible without SQL; that seems likely to me. Part of that is because NoSQL is a big umbrella, and covers extremely critical databases like Redis or even databases like Cloudflare's proprietary edge cache. But, even document stores are extremely critical to enterprise scaling; during Prime Day 2022, DynamoDB peaked at 150M rps. There's no SQL setup on the planet that could handle volume like that while still maintaining all the things that Make It SQL; you could start throwing read replicas and removing joins and cross-table lookups and sharding data and ope, you just re-invented a document store.
Here are a couple of conclusions I've started operating by:

1. Document stores are, today, a strong choice at both the low and high ends of the systems-scaling spectrum. They're great at low scales because you can avoid thinking about them too much. They're great at high scales because once you have the space to think about it you can attain substantially higher efficiency (performance+cost).
2. Making a document store operate more like SQL, for the things SQL is good at (joins, validation, etc) is a lot easier than making a SQL database operate like a document store for the things document stores are good at (Planetscale? there's a couple players in this game).
3. SQL-the-language sucks. There I said it; I'll die on that hill. The language was invented at a time +/- 2 years of Scheme, ML, Prolog, and Smalltalk. Our industry has rejected all of those. We haven't rejected SQL (yet). SQL is demonstrably, as a syntax, just as bad as those languages, evidenced by all the ORMs, injection attacks, etc. Databases tend to have a longer shelf life than programming languages, but SQL-the-language will die.
4. But, duh, SQL is fine. Go ahead and use it. In some situations it makes sense. In others it doesn't. It's a tool; one that has had 60 years to form around solving as many problems as possible.
[1] https://aws.amazon.com/blogs/aws/amazon-prime-day-2022-aws-f...
For a lot of NoSQL document stores, the simple act of having multiple indexes was a bridge too far. Couchbase comes to mind here.
I can't think of many use cases for a document store over PostgreSQL. Maybe as a cache layer... but I'd just use a materialized view. Where some of the data structure was going to be unknown or user defined, such as a system monitoring tool? Server Density was one of the first big MongoDB adopters for this reason. Now we can store that in a JSONB column, though.
> 3. SQL-the-language sucks.
You're entitled to your opinion. I've been doing this 20 years and basic SQL knowledge has been by far the most valuable and portable skill of my career. I'm by no means an expert, but ask developers who have been in the field for 5+ years what a HAVING clause does and they have no idea. There's a huge gap in basic SQL knowledge, stuff that can be learned in < 1 week. I taught somebody who came out of a code boot camp basic SQL for a couple of weeks during an internship. Her first job made her acting DBA because she was the only person who knew SQL.
The hill I'll die on is that ORMs and frameworks are creating an astounding lack of basic SQL knowledge in today's developers, which leads to a lot of completely unwarranted griping about SQL. (not directed at you, just in general on this topic)
It's the same argument, just at a grander scale, as the one the generation before us made about the COBOL mainframe programmers of old; database technologies just tend to move slower because data is very sticky.
You might not see the use-cases; but Apple does. Their usage of Cassandra sustains in the millions of QPS across over 300,000 instances. That's a scale SQL can't even begin to dream of. Discord does; they moved to Scylla [2]. Amazon does [3]. Blackrock Aladdin depends on Cassandra, and manages over $21T in assets [4].
Which kind of circles around to the fact that, while you say "a lack of SQL knowledge is what causes unwarranted griping about SQL", sure, I can accept that, but I'd conversely argue that a lack of Solving Real Problems is what causes unwarranted griping about NoSQL. Schematization isn't a real problem; we have a dozen ways of solving that at different layers. Imperfect ACID compliance isn't a real problem; it's a pseudo-academic expression of the guarantees that SQL offers, which then gets unfairly transplanted onto NoSQL databases as if those issues can't or wouldn't be addressed elsewhere. The same goes for original-article statements like data being inherently relational; data isn't anything, it's just data. It only appears relational because that's how we've been trained to think about data after 60 years of SQL monoculture. If you start thinking about data end-to-end, how it's displayed to users, on pages, components, screens, feeds, it really doesn't look relational at all. Life isn't a graph; the graph is an expression of human pattern-matching on things that could reasonably be related but in reality aren't. Life is a series of snapshots: this moment, this page, this feed, this component, in this moment, for this request, for this session. That is the end-to-end view of data; documents, not a web.
But SQL is fine at that! In fact, it's the best option at medium scales, the middle of the bell curve where 80% of people live. It may not always be, but it's there today. It's that bottom 10% and top 10% where the solutions that work for most don't work as well; and I think NoSQL gets a lot of Majority Hate despite being an extremely strong solution at those scales. Know your tools, and be willing to replace them when the circumstances change.
[1] https://twitter.com/erickramirezau/status/157806381149547724...
[2] https://www.scylladb.com/press-release/discord-chooses-scyll...
[3] https://aws.amazon.com/blogs/aws/amazon-prime-day-2022-aws-f...
Let's talk real use cases for a second.
All of the performance benefits of NoSQL come from key-value retrieval and the ability to shard data due to the built-in lack of joins.

There's nothing stopping anyone from doing that in a relational database. It's a common pattern. CitusDB even made a PostgreSQL extension that makes this model even easier to use, with joins and all the other relational goodies. If you want to set up PostgreSQL with an id field and a JSONB field, you've got the entire NoSQL scaling use case covered without all of the limitations of using a NoSQL-only database.
The schema at different layers thing doesn't _really_ work either. If you put the schema in your application code, then you mandate an API layer in addition to your database layer. With schema enforced in the database, multiple code bases can connect to the same database. With the schema in the application layer, now everything has to connect to that application. So not only are you working with the NoSQL database AND defining the schema in the application but now you also have to define an API if you have a reason to split out some portion of application logic to a more specialized languages.
The layers of problems introduced because NoSQL is used on a project are excessive. There's no question that storing document data makes a lot of sense in some circumstances, but you're always better off just storing the portion you need in an appropriate column type (JSONB) rather than forcing your entire system into the issues that come with NoSQL only.
Given the JSON capabilities of modern SQL databases there's almost no reason to start any project with a NoSQL only option.
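A minimal sketch of that id-plus-JSON pattern, using sqlite3's JSON functions as a stand-in for Postgres JSONB (this assumes the bundled SQLite ships with the JSON1 functions, as modern builds do; the table and fields are made up for the demo). You get key-value-style storage, but the document fields stay queryable with plain SQL:

```python
import sqlite3

# One id column, one JSON document column: the "NoSQL" shape,
# inside a relational database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute(
    "INSERT INTO docs VALUES (1, ?)", ('{"kind": "order", "total": 99.5}',)
)
conn.execute(
    "INSERT INTO docs VALUES (2, ?)", ('{"kind": "refund", "total": 10.0}',)
)

# Filter and project on fields inside the document.
rows = conn.execute(
    "SELECT id, json_extract(body, '$.total') FROM docs "
    "WHERE json_extract(body, '$.kind') = 'order'"
).fetchall()
print(rows)  # [(1, 99.5)]
```

In Postgres the equivalent would be a JSONB column with `->>` operators and, unlike this sketch, an expression index on the extracted field.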
In the prime day article for example, they're singing the praises of their own SQL offerings too.
> Amazon Aurora – On Prime Day, 5,326 database instances running the PostgreSQL-compatible and MySQL-compatible editions of Amazon Aurora processed 288 billion transactions, stored 1,849 terabytes of data, and transferred 749 terabytes of data.
> Amazon DynamoDB – DynamoDB powers multiple high-traffic Amazon properties and systems including Alexa, the Amazon.com sites, and all Amazon fulfillment centers. Over the course of Prime Day, these sources made trillions of calls to the DynamoDB API. DynamoDB maintained high availability while delivering single-digit millisecond responses and peaking at 105.2 million requests per second.
They don't give an apples-to-apples numbers comparison here. We don't have a cost comparison of infrastructure for DynamoDB vs Aurora. We don't get to see whether the "288 billion transactions" are database writes, while for DynamoDB we only see requests. We don't get to see whether there have to be so many more requests to Dynamo because you can't get the data as easily in a single query. We don't get to see if it's being used as a cache layer or for session management, which are both great use cases for it.
I mean, Instagram scaled just fine with only PostgreSQL. SQL scales just fine. Some of the bevy of options and tools that you get to work with the data don't scale as well as others, but that's not a case for getting rid of those tools...it's just a case for caching when needed.
It's amazing how it all works. But only when it works. I've found it so flakey I don't use it for production but if the bugs were ironed out it would be amazing.
> We paused just long enough to take a sip of our Spicy Maya Mocha from Coupa Cafe.
To be honest, I think almost all drinks at Coupa Cafe are pretty bad.

If we are begging DB engineers for things, can we get a graph layer for accessing SQL tables please?
    BEGIN;
    SELECT post_count, ... FROM users WHERE ... FOR UPDATE;
    INSERT INTO posts VALUES(...);
    UPDATE users SET post_count = new_post_count WHERE ...;
    COMMIT;
For one, it's unlikely the number of posts per user is so important and so often requested that it needs to be cached in the database itself.

Secondly, why would that value need to be stored with each new insert? Simply insert posts as they come, and calculate metadata about posts at a later time as a batch process, or when some part of the application actually requests them.
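The derive-it-on-demand alternative is a single GROUP BY, sketched here with sqlite3 (table and column names are illustrative); no FOR UPDATE, no denormalized counter to keep in sync:

```python
import sqlite3

# Don't maintain post_count on every insert; derive it on demand
# (or in a batch job) with an aggregate query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER)")
conn.executemany(
    "INSERT INTO posts (user_id) VALUES (?)", [(1,), (1,), (2,)]
)

counts = conn.execute(
    "SELECT user_id, COUNT(*) FROM posts GROUP BY user_id ORDER BY user_id"
).fetchall()
print(counts)  # [(1, 2), (2, 1)]
```

If the count really is hot, a periodically refreshed materialized view (in engines that have one) gives the cached value without touching the insert path.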