Databases are the endgame for data-oriented design

(spacetimedb.com)

238 points · cloutiertyler · 2y ago · 152 comments

89 comments · 18 top-level

fifilura2y ago· 21 in thread

My colleagues hate me, but I also found that SQL is The way to write business logic.

Lots of caveats about difficulty to test and weird syntax.

But it is just that SQL is the most terse and standard way to express logic.

And that in itself is the most important factor to avoid bugs. Not what testing strategy you choose.

crazygringo2y ago

> But it is just that SQL is the most terse and standard way to express logic.

As someone who has written a ton of complex SQL... I couldn't disagree more.

Trying to shoehorn things that can trivially and intuitively be expressed in a couple of for-loops with a couple of variables, into SQL expressions making use of joins and aggregate functions and GROUP BY's and (god forbid) correlated subqueries... having to replicate subqueries because their results in one part of the query can't be re-used in another part... teaching people arcane terminology distinctions like between WHERE and HAVING... not having basic functions for calculating percentiles or even a basic median... certain basic kinds of logic operations that simply can't be done... flipping a coin on whether the query engine will use the index and execute in 10 ms, or decide to follow a different execution plan and take 5 minutes to run...

I have never encountered more bugs in business logic than in dealing with SQL. It obfuscates what should be clear. SQL isn't a solution for avoiding bugs, it's what causes so many of them.

chrsig2y ago

The types of issues you're enumerating are really part of a learning curve that every language and environment will have.

> not having basic functions for calculating percentiles or even a basic median

I don't think this is really true anymore; window functions are pretty widespread, and I think every major database will have some percentile functions.
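
For illustration, a median can be written with window functions alone. This is a minimal sketch using Python's built-in `sqlite3` module, assuming the bundled SQLite is 3.25 or newer (the release that added window functions). The `orders` table and the `median_amount` helper are made up for the example. On PostgreSQL one would more likely use `percentile_cont(0.5) WITHIN GROUP (ORDER BY amount)`.

```python
import sqlite3


def median_amount(conn: sqlite3.Connection) -> float:
    """Median of orders.amount using only window functions."""
    row = conn.execute(
        """
        SELECT AVG(amount)
        FROM (
            SELECT amount,
                   ROW_NUMBER() OVER (ORDER BY amount) AS rn,
                   COUNT(*)     OVER ()                AS n
            FROM orders
        )
        -- integer division selects the middle row (odd n) or the two middle rows (even n)
        WHERE rn IN ((n + 1) / 2, (n + 2) / 2)
        """
    ).fetchone()
    return row[0]


# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?)", [(10,), (20,), (30,), (40,)])
print(median_amount(conn))  # 25.0
```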

fifilura2y ago

I am happy for all the disagreeing posts and that HN is a place for it.

It also helps me understand my colleagues.

It may be how our brains are wired. For me whenever I see a for loop I see bugs.

It is also that you get the concurrency from map/reduce-type jobs for free without having to think about it.

But yeah there are a couple of places you need to be careful. Avoiding duplicates and handling of NULL.

deadbabe2y ago

Are you using Postgres? It seems like you haven’t discovered CTEs or lateral joins.
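
As a rough illustration of the CTE point: the complaint upthread about having to repeat a subquery is what `WITH` addresses, because the subquery is written once and referenced by name. The sketch below uses Python's `sqlite3` module (SQLite supports CTEs but has no `LATERAL`, so only the CTE half is shown). The `orders` table and the helper name are assumptions for the example.

```python
import sqlite3

ABOVE_AVERAGE_CUSTOMERS = """
WITH totals AS (                 -- written once, referenced twice below
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
)
SELECT customer, total
FROM totals
WHERE total > (SELECT AVG(total) FROM totals)
ORDER BY customer
"""


def above_average_customers(conn):
    """Customers whose total spend exceeds the average customer total."""
    return conn.execute(ABOVE_AVERAGE_CUSTOMERS).fetchall()


# Hypothetical sample data: totals are ann=120, bob=10, cy=60.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", 50), ("ann", 70), ("bob", 10), ("cy", 60)],
)
print(above_average_customers(conn))  # [('ann', 120)]
```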

buryat2y ago

That's how you end up with multi-thousand-LOC SQL files that contain all the historical business logic and edge cases of a company. They just keep getting piled on, and their complexity keeps growing because not everything is expressible in SQL. So you end up with a big monster that someone needs to support, and one that isn't easily testable.

ndriscoll2y ago

Testing SQL is easy. E.g. the JVM has H2, which works for simple stuff, or you can use testcontainers, or just spin up a container and run your tests against that. You just run your migrations, insert mock data, and run your test.

In fact testability is one of the best parts. You can safely test read-only queries against a prod secondary database to see that it gives reasonable results on real data, and use the repl to explore parts of your query to really get a sense that things are working.
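
The "run your migrations, insert mock data, and run your test" loop might look roughly like this. It is a sketch that uses an in-memory SQLite database as a stand-in for H2 or a test container; the schema, the query, and all names are invented for the example.

```python
import sqlite3
import unittest

# Stand-in for real migration files.
MIGRATIONS = [
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)",
    "CREATE TABLE logins (user_id INTEGER REFERENCES users(id), day TEXT)",
]

# The query under test: users with at least two logins.
ACTIVE_USERS = """
SELECT u.name, COUNT(*) AS logins
FROM users u
JOIN logins l ON l.user_id = u.id
GROUP BY u.id, u.name
HAVING COUNT(*) >= 2
ORDER BY u.name
"""


def fresh_db():
    """Create an empty in-memory database and apply every migration."""
    conn = sqlite3.connect(":memory:")
    for statement in MIGRATIONS:
        conn.execute(statement)
    return conn


class ActiveUsersTest(unittest.TestCase):
    def test_only_repeat_visitors_are_active(self):
        conn = fresh_db()
        conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ann"), (2, "bob")])
        conn.executemany(
            "INSERT INTO logins VALUES (?, ?)",
            [(1, "2023-01-01"), (1, "2023-01-02"), (2, "2023-01-01")],
        )
        self.assertEqual(conn.execute(ACTIVE_USERS).fetchall(), [("ann", 2)])
```

Each test gets its own database from `fresh_db()`, so tests cannot leak state into one another.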

ttfkam2y ago

You can write code in any language without defining functions or modularizing in any way. You can write bad code in any language.

For some reason, folks assume SQL must be written badly since they have only written it badly or seen it written so.

It is absolutely possible and preferable to write maintainable SQL logic into user defined functions, stored procedures, views, materialized views, CTEs, temporary tables, etc. If you're looking at one huge pile of monolithic, untestable SQL, the problem isn't the SQL.

One doesn't write O(n^3) algorithms in C++ and then blame C++ for being slow. For some reason, though, folks seem pleased with themselves for doing exactly that with SQL, every day and twice on Sunday.

Got subselects in each of the fifteen outer joins with NULLs all over your schema, and now you're upset performance is horrible and inconsistent? PEBKAC.

ako2y ago

And in any other language this wouldn’t result in multi-thousand lines of code with historical business logic and edge cases?

cma2562y ago

Glad that doesn't happen anywhere else!

nextworddev2y ago

What you need is a semantic layer on top of SQL

Etheryte2y ago

I feel like you've come to the right conclusion, but partly for the wrong reasons. Terseness in and of itself is not useful. Code golfing languages are the tersest there are, but we don't write code that way.

ndriscoll2y ago

Or they just used the wrong word. I'd agree with them, but clarify that SQL is not just terse, but concise.

A join or a group by is going to be much clearer than writing the code the query plan is going to generate, creating temporary hash maps, doing nested loops, etc.
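
To make the comparison concrete, here is a small sketch that puts the two versions next to each other: a hand-written hash join plus aggregation, and the equivalent SQL run through Python's `sqlite3` module. The `users`/`orders` data and both function names are assumptions for the example, and the hand-written version only approximates what a real query planner might choose.

```python
import sqlite3
from collections import defaultdict

# Hypothetical data: (id, name) and (user_id, amount).
users = [(1, "ann"), (2, "bob")]
orders = [(1, 50), (1, 70), (2, 10)]


def totals_by_hand(users, orders):
    """Roughly what a hash join followed by an aggregate does, spelled out."""
    name_by_id = {uid: name for uid, name in users}   # build side of the hash join
    totals = defaultdict(int)
    for uid, amount in orders:                        # probe side
        if uid in name_by_id:                         # inner join drops unmatched rows
            totals[name_by_id[uid]] += amount         # GROUP BY name, SUM(amount)
    return sorted(totals.items())


def totals_in_sql(users, orders):
    """The same result, stated declaratively."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE orders (user_id INTEGER, amount INTEGER)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", users)
    conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
    return conn.execute(
        "SELECT u.name, SUM(o.amount) FROM users u "
        "JOIN orders o ON o.user_id = u.id "
        "GROUP BY u.name ORDER BY u.name"
    ).fetchall()
```

Both functions return the same rows for this data; the SQL version leaves the choice of join strategy to the engine.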

bob10292y ago

We've been doing this for a while and it has some pretty severe limitations at scale. The biggest issue is that humans are human and you cannot make everyone else on the team use your elegant SQL DSL business engine in exactly the same way you had envisioned. Eventually, it will grow into a bit of a monster.

We are headed back in the other direction, but with some hedge. Code with a very simple & transparent CI/CD experience seems equivalent to some SQL configurable thing in our minds now, but with way more potential upside.

In our latest code-based abstractions, nothing is stopping you from breaking out a SQL connection and running any arbitrary query. In fact, all of the data is still in SQL and it's still authoritative, you just now have access to way more powerful tooling to work with that data. Code+SQL together is the answer.

For non-expert team members, modern C# is turning out to be way more intuitive than SQL the moment you encounter a 3+ table join scenario.

devjab2y ago

I think you make some decent points, but at the same time I really don’t understand this part:

> For non-expert team members, modern C# is turning out to be way more intuitive than SQL the moment you encounter a 3+ table join scenario.

Partly because this is so inefficient that it’s never going to work at scale, but also because of how you’re moving the complexity into the realm of “magic”.

I know it’s very easy and intuitive to use one of C#’s ORMs, but with that comes a reliance on things like LINQ and a model builder, both of which may not work the way you think they do. If your developers think about it at all, and having seen so many C# developers use IEnumerable where they really should have been using IQueryable… well if your developers can’t do relatively basic SQL then I’m not sure I’d trust them with the abstraction either.

I don’t think you’re really wrong, either, but I think it’s much more a question of building developer tools to help your developers handle good data access than it is about picking a particular tech on the consuming side.

As far as data storage goes, however, I think we’re just beginning to see the move away from the classics because the classics just weren’t built to support how we use data in 2023.

I can’t begin to tell you the nightmare it is to update old school SQL “data wells” to be capable of temporality, good BI access as well as being compliant with the various EU legislative rules, and almost none of that can be done above the database. Well I guess you could, but you’d probably go insane. And that’s just part of it, the other part is just how much data we transact now. It used to be that banks were basically the only organisations to move massive amounts of tiny bits of data, and now we all do it. Like, a single solar plant moves a gazillion points of data into your various systems a minute, where 30 years ago the entire data for that place might’ve been a couple of KB a year, it’s now MBs a day.

hot_gril2y ago

Same, every backend service I write now has the majority of the business logic in SQL and a little pre/post-processing in regular code. A well-designed schema will mean that your queries don't get messy. If some more complex thing starts feeling forced, I add a bit more non-SQL code to make it reasonable.

When teammates look at my code, and the logic is all right there instead of scattered around, and there's a schema file backing it all that makes it clear what all the relations are, they have an easy time making tweaks or adding on. Yes, it's very testable too.

This also kinda depends on having a multi-service architecture if your system is large. Separate database for each. That's a good thing anyway.

whstl2y ago

The fear of committing SQL to the codebase is one of the most baffling things in modern backends.

To me the most infuriating thing is the "SQL query scattered around multiple files" pattern, where a backend engineer will decompose a perfectly fine SQL query into 3 or 4 files, with multiple functions, often with very artificial separations (for example: a function just for the "select ..." part, another for the joins).

All that in the name of having small files, small functions, small lines. You take complexity away from the "micro" parts and embed it into the invisible parts of your program.

randomdata2y ago

> And that in itself is the most important factor to avoid bugs.

I'm not sure about that. SQL was the first language I learned and the language that has always been there throughout the decades, and is also where I make the most mistakes.

hot_gril2y ago

If you were to write the equivalent of a SQL query in some other language, you'd probably make a lot more mistakes. Especially if you're trying to achieve the same performance. And I mean a read-only query, not even something with multi-writes and locking.

vlovich1232y ago

I’ve heard that datalog (and similar languages in that vein) is far easier and terser to write. What do you think about that?

surprisetalk2y ago

I completely agree!

Transaction-safety is also sublime for business logic.
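
A small sketch of what that buys you, using Python's `sqlite3` module: two writes that either both land or both disappear. The `accounts` table, the `CHECK` rule, and the `transfer` helper are assumptions made up for the example.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates commit or neither does."""
    with conn:  # commits on success, rolls back if the block raises
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
        # If this debit violates the CHECK constraint, the credit above is undone too.
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
        )


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()
```

Calling `transfer(conn, "a", "b", 500)` raises `sqlite3.IntegrityError` and leaves both balances exactly as they were, with no hand-written cleanup code.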

devoutsalsa2y ago

Have you written an ORM in SQL to help make querying easier?

munificent2y ago· 17 in thread

As an ex-game developer and software architecture nerd, I'm very excited about data-oriented design and ECS. It really is a cool pattern, and it's a very common one in shipping games today. It's not just architecture astronaut stuff.

At the same time, the level of hype about ECS today reminds me an awful lot of the amount of hype surrounding OOP in the 90s. Can ECS be a better way to structure your game entities and make your game loop faster? Yes. Will it make your teeth whiter and your partner love you more? No.

(There are ECS frameworks in JavaScript, which gives you absolutely no control over memory layout and thus completely defeats one of the primary purposes of the pattern.)

Like any pattern, it exists to solve concrete problems. It shouldn't be the One True Way To Think About All Programming Henceforth and Forever.

When the author says things like:

> For example, how would you model chat messages in your game? I suppose you’d have to represent that as an entity in your game. How would you represent a constraint that would prevent a health component from being added to your chat message erroneously? In ECS it’s straightforward to create a system which operates on chat messages individually, but how would you query your chat messages so that you can display them in order?

To me, that just means "Don't use ECS for those." I have a really nice coffee mug that is just perfect for holding coffee. It does its job very well. That does not mean I feel any need to use that coffee mug for digging holes in my garden.

Databases and the relational model are great. ECS is great. Object-oriented programming is great. Functional programming is great. But treat them all as tools that should be used for the right job.

jerf2y ago

"At the same time, the level of hype about ECS today reminds me an awful lot of the amount of hype surrounding OOP in the 90s."

I appreciate that most of the ECS hype has been around specific use cases, though.

OOP was claimed as not a specific useful tool, but the answer to all programming, a billing it has not lived up to. It achieved "useful tool", no question, especially as some of the very rough bits were sanded off the original proposals (particularly the idea that objects should only and exactly match real world objects, an idea which I believe in hindsight was accidentally carried over from a simulation worldview in Simula where maybe it worked into the general programming world where it didn't), but it certainly has failed to be the one true programming methodology.

I haven't seen anyone claiming everything should be rewritten in ECS.

hyperpape2y ago

> I appreciate that most of the ECS hype has been around specific use cases, though.

I recently watched the first 30 minutes of Mike Acton's 2014 talk, and while that portion of the talk wasn't about ECS specifically, it very much presented an absolutist perspective.

munificent2y ago

> I appreciate that most of the ECS hype has been around specific use cases, though.

Depends a lot on where you hang out. On amateur gamedev fora, I have seen many many many posts from beginners where they are struggling to cram ECS into their game and feel they need to because it's simply "the way" that one architects a game. Even if their game is written in a language that offers no performance benefits and their simulation benefits nothing from it, they just think they have to.

It's heartbreaking watching someone go, "I know I could just store this piece of data right here in my entity class, but I'm not supposed to because of DoD, so how should I do this?" And then they get back confident answers that involve pages of code and unnecessary systems.

It's exactly like the OOP fad of the 90s, just in the opposite direction. Yes, it turned out you don't need to encapsulate all data in classes. But, also, it is OK to just store data in stuff. You don't have to make every letter of your pop-up dialog a separate component.

hot_gril2y ago

OOP even got mixed with databases to form nasty hype around ORMs and NoSQL.

masfoobar2y ago

"At the same time, the level of hype about ECS today reminds me an awful lot of the amount of hype surrounding OOP in the 90s."

I try not to focus too much on the ECS side. It is all about understanding the problem you need to solve. If you are making a game, and you know exactly what the game needs to do from a programming perspective... write it.

Yes, I would keep the "data oriented" viewpoint... but do you really need to spend time on some ECS layer? If you know exactly what each character in the game does, write it and solve it. Are you making a game... or trying to create the next Unreal Engine?

jalk2y ago

When you have a hammer, everything looks like thumbs

morganherlocker2y ago

> (There are ECS frameworks in JavaScript, which gives you absolutely no control over memory layout and thus completely defeats one of the primary purposes of the pattern.)

While JS does not provide great support for bit packing complex structs, typed arrays give you quite a bit of control over memory layout for simple numeric types, which is what you usually want for optimal data-oriented code anyway. This is a common technique used in fast JS libs for data visualization, e.g.:

https://github.com/mourner/flatbush

There are also basic operators required for bitarrays, which are useful for ECS and memory-efficient code generally.
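
The layout idea can be sketched outside JS as well. The example below is a Python analog, not JS code: `array.array("d")` plays the role a `Float64Array` would play, giving one packed numeric buffer per field (struct of arrays) instead of a list of per-entity objects. The `Movement` class and its fields are invented for the example, and no speed claim is intended, since a pure-Python loop still boxes each element on access.

```python
from array import array


class Movement:
    """Struct-of-arrays storage: one packed buffer of doubles per field."""

    def __init__(self):
        self.x = array("d")
        self.y = array("d")
        self.vx = array("d")
        self.vy = array("d")

    def spawn(self, x, y, vx, vy):
        """Append one entity's fields and return its index."""
        self.x.append(x)
        self.y.append(y)
        self.vx.append(vx)
        self.vy.append(vy)
        return len(self.x) - 1

    def step(self, dt):
        """The 'system': walks each field buffer in index order."""
        for i in range(len(self.x)):
            self.x[i] += self.vx[i] * dt
            self.y[i] += self.vy[i] * dt
```

An entity here is just an index into the four buffers; removing entities and handling sparse components are left out of the sketch.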

meheleventyone2y ago

This. To me a well engineered game from a data perspective is a set of separate datastores optimized for the job they are doing referencing one another through handles. I can see the drive to represent this in a general purpose way but you nearly always lose performance and/or flexibility. Ironically the “where do I put chat messages in my ECS” example illustrates that nicely.
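
One common way to implement "referencing one another through handles" is a slot store with generation counters, sketched below. This is an assumption about what such a store could look like, not a description of any particular engine; `Handle` and `Store` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Handle:
    index: int        # which slot
    generation: int   # which occupant of that slot


class Store:
    """A standalone datastore whose items are addressed by handles."""

    def __init__(self):
        self._items = []
        self._generations = []
        self._free = []

    def insert(self, item):
        if self._free:
            i = self._free.pop()          # reuse a freed slot
            self._items[i] = item
        else:
            i = len(self._items)
            self._items.append(item)
            self._generations.append(0)
        return Handle(i, self._generations[i])

    def get(self, handle):
        if self._generations[handle.index] != handle.generation:
            return None                   # stale handle: the slot was freed or reused
        return self._items[handle.index]

    def remove(self, handle):
        if self._generations[handle.index] != handle.generation:
            return False                  # already removed
        self._items[handle.index] = None
        self._generations[handle.index] += 1   # invalidates outstanding handles
        self._free.append(handle.index)
        return True
```

A chat log could then be its own `Store` (or a plain ordered list), with game entities holding a `Handle` to a message rather than the message being an entity itself.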

zengid2y ago

Your chapter on ECS in _Game_Programming_Patterns_ fueled my own obsession with the power of such a design, so you could be partly responsible ;).

I’ve spent a lot of time thinking about how to incorporate Data Oriented Design, SoA and ECS into the normal boring business logic at work, and I think it’s interesting to think about keeping data in the same form as it is in the relational database and skip the impedance mismatch in Object-Relational Mapping.

ECS only makes sense if you have real structs like in .NET or C/C++/Rust/Swift/etc, and tight latency requirements, but I think when someone learns a powerful concept like ECS, they want to invent a reason to use it so they can actually use it in action. I know that is what I’ve experienced at least.

p1necone2y ago

> There are ECS frameworks in JavaScript, which gives you absolutely no control over memory layout and thus completely defeats one of the primary purposes of the pattern.

It doesn't give you direct control over the memory layout, but it's still fairly safe to assume that arrays are going to end up in relatively contiguous memory, which is the relevant part for the performance difference between structs of arrays and arrays of structs.

I don't recall explicitly telling many compiled languages to stick all the items in an array together in memory either - it just happens by default, same as in JS.

Rusky2y ago

JS arrays don't store their elements directly, they store references to their elements. (Unless the elements are primitives and the engine is using NaN-boxing or pointer tagging to store those inline, or something like that, which doesn't apply to ECS components generally.)

Depending on how the GC works, the elements themselves might still wind up next to each other in memory some of the time. But that is definitely not a pattern that you would expect to hold in general- components will be added and removed over the course of the game, with lots of other stuff happening in between.

In C++, the layout of the array elements is actually part of the language semantics. In JS, the language semantics don't even have a way to talk about that layout, and engines in practice don't use the layout you want.

munificent2y ago

> it's still fairly safe to assume that arrays are going to end up in relatively contiguous memory

A contiguous array of pointers to the actual objects doesn't buy you much. You're still doing an indirection and risking blowing your cache each time you do something with each element. It may be the case that the objects the array elements point to are contiguous in memory, but that's entirely a roll of the dice, and those dice get re-rolled on every garbage collection.

meheleventyone2y ago

To get this in JS you should do it explicitly with TypedArrays; otherwise you’re at the mercy of what the underlying VM does, and current modern implementations most assuredly don’t make sure most kinds of arrays end up relatively contiguous. Notably for this conversation, arrays of objects are unlikely to.

matesz2y ago

I dedicated years to full-time work on unifying the architecture for general software development.

ECS, OOP, functional programming, and others serve as methods to organize software — the developer interface. However that role should belong to IDEs — an actual user interface for developers.

Everybody has a different hierarchy for their software, essentially developing their own software engine, which sucks because doing it right is extremely hard and time consuming.

What Steve Jobs said a long time ago is still true today: "Paradoxically, we need more sophisticated software to make it more easy for the user" (paraphrasing). This is what we need: a truly general software engine with an IDE made right, which will unify the organization of all software and beyond.

For some deeper insights, in the context of the Web, I suggest reading Graydon Hoare's post [1].

[1] https://types.pl/@graydon/110561036237098295

DrDroop2y ago

What about TypedArrays, though? They could be useful for implementing an ECS in JS. I do agree with your point, but ECS could be a useful pattern in JS not just for graphics or games but also for OLAP workloads.

tlarkworthy2y ago

I don't think memory layout is the sole draw, it's also about incrementally building up programs live.

munificent2y ago

You don't need ECS for that and it doesn't necessarily buy you much. Composition over inheritance can often accomplish the same goal.

hitpointdrew2y ago· 15 in thread

Nope, nope and nope. Went to the github page for spacetimedb, it does everything that is terrible.

>Instead of deploying a web or game server that sits in between your clients and your database, your clients connect directly to the database and execute your application logic inside the database itself. You can write all of your permission and authorization logic right inside your module just as you would in a normal server.

Why? Databases should never, ever, ever, be used to perform logic, they are datastores, that is it. Your logic goes elsewhere. Stored procedures are the worst "feature" of any database, you are just asking for a hard time debugging, troubleshooting, and increasing the chance that you will fuck up the most valuable part of your system, your data.

> This means that you can write your entire application in a single language, Rust, and deploy it as a single binary.

It also means you have a single point of failure, no read-replicas or redundancy. Hate everything about this.

deadbabe2y ago

Can you keep an open mind? We’ve used stored procedures for years. It has worked wonderfully for creating a single source and producer of truth for business data. Instead of potentially having business logic across multiple repos and deployments, everything exists in one place, with absolute unquestionable authority.

It’s not difficult to debug at all, you might just be unskilled.

acdha2y ago

> It’s not difficult to debug at all, you might just be unskilled.

I agree with your larger point but this seems too harsh: it’s definitely harder to debug simply because, as with microservices, understanding how the app is functioning now requires you to understand different code in multiple languages and locations, you’re highly likely to hit non-portable behavior across databases for authoring and debugging, and you’re never going to get a debugger with the whole flow in context.

That doesn’t mean there aren’t benefits as well and it could be especially useful as a way to force distinctions about contracts for common operations, but I wouldn’t say it’s right for all or even most projects. The sweet spot is going to vary widely.

gocartStatue2y ago

The tradeoff seems to be „ability to deploy working software without reliance on single central authority”. You may get rid of several smaller bottlenecks this way, introducing an enormous, all-encompassing one. Or am I wrong?

rgbrgb2y ago

> Databases should never, ever, ever, be used to perform logic

You're talking about a best practice like it's a fundamental law. It's not, it's just how we've mostly been doing things. A lot of interesting innovations in distributed systems / architecture (serverless, graphql, thin clients, thick clients, ORMs, RSC, WSGI, nodejs) have been made because the designers tried relaxing a constraint or taking a counterintuitive idea to a maximalist place.

In fact, if you look harder, there are a fair number of existence proofs of successful systems built on stored procedures. There's even a "best practice" phrase recommending doing compute as close as possible to the data.

coldtea2y ago

Stored procedures ensure all your clients get the same logic. They're only "the worst feature of any database" if the language you're writing them in is not suitable (which is the case with PL/SQL and co, which were tacked on to RDBs and reek of bad 80s syntax and facilities).

If the language is nice and has well designed access to db facilities like records and such, it can be better than writing your code outside the database, especially coupled with a data-oriented design model/ECS (which can be extremely debuggable and offer great visibility).

>and increasing the chance that you will fuck up the most valuable part of your system, your data.

Since anything you can write inside you can also write outside, no. I can send a "delete from/drop <table>" from any client at any moment, or make any mistake in updating. That's what backups are for, and databases make them even easier (not to mention transactions being a very good and neglected part of business logic).

wharvle2y ago

> Why? Databases should never, ever, ever, be used to perform logic, they are datastores, that is it. Your logic goes elsewhere. Stored procedures are the worst "feature" of any database, you are just asking for a hard time debugging, troubleshooting, and increasing the chance that you will fuck up the most valuable part of your system, your data.

You cannot sanely use a database with multiple heterogeneous clients without putting at least some logic in it about what counts as valid data. This is gonna include some “business logic” in practically any real-world system.

Otherwise you have to elevate the same functionality to some gatekeeper-daemon that’ll almost certainly perform far worse, lack features, and be an eternal source of dumb bugs, including, I can just about guarantee, data corruption bugs.

cloutiertylerOP2y ago

This is of course what SpacetimeDB does. The stored procedures fail any transactions that are not authorized to be carried out. e.g. if a player is not high enough level or something.
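
The general pattern (a procedure that checks a rule and aborts the transaction when it fails) can be sketched as follows. To be clear, this is not SpacetimeDB's API: it is a plain Python/`sqlite3` illustration, and the tables, the `equip_item` procedure, and the `NotAuthorized` exception are all hypothetical.

```python
import sqlite3


class NotAuthorized(Exception):
    """Raised to abort the transaction when a rule is violated."""


def setup():
    """Hypothetical schema: player 1 is level 3, player 2 is level 40, item 7 needs level 10."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE players (id INTEGER PRIMARY KEY, level INTEGER NOT NULL)")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, min_level INTEGER NOT NULL)")
    conn.execute("CREATE TABLE equipped (player_id INTEGER, item_id INTEGER)")
    conn.executemany("INSERT INTO players VALUES (?, ?)", [(1, 3), (2, 40)])
    conn.execute("INSERT INTO items VALUES (7, 10)")
    conn.commit()
    return conn


def equip_item(conn, player_id, item_id):
    """Check the rule and perform the write inside one transaction."""
    with conn:  # commit on normal return, roll back if an exception escapes
        row = conn.execute(
            "SELECT p.level, i.min_level FROM players p, items i "
            "WHERE p.id = ? AND i.id = ?",
            (player_id, item_id),
        ).fetchone()
        if row is None or row[0] < row[1]:
            raise NotAuthorized(f"player {player_id} may not equip item {item_id}")
        conn.execute("INSERT INTO equipped VALUES (?, ?)", (player_id, item_id))
```

Because every client goes through `equip_item`, the level rule lives in one place rather than being re-implemented per client.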

da_chicken2y ago

> Databases should never, ever, ever, be used to perform logic, they are datastores, that is it.

I wouldn't go that far. Relational algebra is performing logic. Constraints and foreign keys are logic, as well.

I'm not going to argue that you should go back in time 15-20 years and start shredding XML strings in stored procedures again. But thinking of the database as one step above a flat file is similarly backward thinking.

More than that, the concept of putting application logic adjacent to the data store is sound. That's exactly what a web API or a microservice is doing. They allow a uniform mechanism of requests and responses to a data store. At a concept level, that's exactly the same thing. The concept of keeping logic at or immediately adjacent to the data layer so that a whole range of disparate applications can request and maintain data from the source is what the design goal of "database side logic" is.
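
The remark that constraints and foreign keys are logic can be shown in a few lines. The sketch below uses Python's `sqlite3` module with a made-up schema that borrows the article's chat-message example: the database itself refuses out-of-range health values and messages from players that do not exist, whichever client sends them.

```python
import sqlite3


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
    conn.execute(
        "CREATE TABLE players ("
        " id INTEGER PRIMARY KEY,"
        " health INTEGER NOT NULL CHECK (health BETWEEN 0 AND 100))"
    )
    conn.execute(
        "CREATE TABLE messages ("
        " id INTEGER PRIMARY KEY,"
        " sender INTEGER NOT NULL REFERENCES players(id),"
        " body TEXT NOT NULL)"
    )
    return conn


def try_insert(conn, sql, params):
    """Return True if the database accepted the row, False if a constraint rejected it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False
```

For example, `try_insert(conn, "INSERT INTO players VALUES (?, ?)", (2, 150))` returns `False`, with no validation code in the application.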

wvenable2y ago

> At a concept level, that's exactly the same thing.

But at a practical level they are very different. If you have a middleware layer as you are describing, it's written in a real programming language with all the adjacent tools (source control, debugging, etc).

I'm not hard-core against stored procedures used lightly but they have a lot of downsides and they simply aren't needed. There's no performance advantage. There are complexity advantages and disadvantages that might be a wash.

Xeamek2y ago

Nothing you mentioned is an inherent problem with databases. In fact, databases arguably have more potential to solve these problems than any currently used system.

Like debugging - if every memory access is a database access, then you have built-in logging for all your memory accesses that is both more performant and optimal than any normal debugger. You can take snapshots of your 'memory' pretty much at any time, the data layout is clearly defined. You can manually edit your memory any time you want, and serialization is already solved for you.
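
The snapshot idea is easy to try with an embedded database. A minimal sketch, assuming program state lives in a SQLite database and using `Connection.backup` from Python's `sqlite3` module (available since Python 3.7); the `state` table and helper names are invented for the example.

```python
import sqlite3


def snapshot(conn):
    """Copy the current committed state into a separate in-memory database."""
    conn.commit()                        # make sure pending writes are included
    snap = sqlite3.connect(":memory:")
    conn.backup(snap)                    # page-level copy of the whole database
    return snap


def hp(conn):
    return conn.execute("SELECT value FROM state WHERE key = 'hp'").fetchone()[0]


# Hypothetical game state kept in a table instead of in program variables.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE state (key TEXT PRIMARY KEY, value INTEGER)")
live.execute("INSERT INTO state VALUES ('hp', 100)")

before_fight = snapshot(live)

live.execute("UPDATE state SET value = 5 WHERE key = 'hp'")
live.commit()
```

After the update, `before_fight` still holds the earlier state and can be queried with ordinary SQL while the live database moves on.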

The more I think about it, the more possibilities I come up with. Damn, this is actually genius!

whizzter2y ago

I'm writing a system quite similar to this, so I can explain the differences and why this is useful (esp. for this use case).

1: Separating stores was crucial during the 90s and earlier when people were still writing in memory-unsafe languages (C/C++ cough) since it could cause wild corruptions with simple stray pointers. Process-separation was just a sanity thing. As you'll notice, there are other languages in play here, so random corruptions shouldn't happen (memory exhaustion can still be a thing though with their model).

2: Yes, debugging stored procedures/triggers on SQL Server etc. is a PITA because they're database-first objects; however, the idea here is to make the database fill the job of app-servers of the 90s/00s with "regular" debugging workflows for developers. (Don't confuse implementations with the concept.)

3: And MOST importantly, this is a game-focused thing, gamedevs will often end up replicating most cache/database functionality (badly) to squeeze things into main memory with the goal to achieve realtime performance targets anyhow, why not forgo that duplicated crazy work with a solid framework?

4: As a corollary to the above, the benefits of separating storage from applications (to run multiple applications against the dataset as it often happens in enterprise scenarios) isn't really the focus, application to database mappings are intended to be more or less 1:1

paulddraper2y ago

> It also means you have a single point of failure, no read-replicas or redundancy. Hate everything about this.

How does writing an application as a single language/binary prevent read-replicas or redundancy?

The application could do that.

Or you could do it at disk level (RAID).

hitpointdrew2y ago

>The application could do that.

How do you manage multiple instances of the application and not introduce split brain?

> Or you could do it at disk level (RAID).

That is not comparable. RAID's redundancy is not the same as the redundancy in a multi node database cluster. You have one service, not multiple services, network card gets fried, your database goes down, you can't promote a standby to master and be on your way. Also RAID is a single disk as far as the OS is concerned, so you could hit I/O limits (especially if you have a single binary) that cause your app to chug, you cannot split your writes and reads across different physical or virtual machines that have different disks.

shepherdjerred2y ago

> you are just asking for a hard time debugging, troubleshooting, and increasing the chance that you will fuck up the most valuable part of your system, your data.

This sounds like a tooling problem. One could imagine a database that doesn't have these issues.

> It also means you have a single point of failure, no read-replicas or redundancy.

What? Why would any of this be the case?

riku_iki2y ago

> Why?

Fewer moving parts, and less diversity and complexity in your stack (different languages, servers, build/deployment systems, etc).

> you are just asking for a hard time debugging, troubleshooting

What specifically is hard about this, in your opinion?

gorgoiler2y ago· 4 in thread

Has anyone here seen a database where your users are users in the db itself? Not just a user(id, name, email, password) table but actual db users with GRANTed permissions and ACLs etc set appropriately, and open access to the DB for these users.

It seems like it would solve a lot of problems by eliminating the need for data broker apps / endpoints that simply put POST or GET parameters into different SQL queries based on which endpoint you hit. I have no idea if it would scale but I don’t see why it couldn’t.

This sort of idea came up earlier this month in another post as well:

https://news.ycombinator.com/item?id=38489307

lgas2y ago

> Has anyone here seen a database where your users are users in the db itself?

PostgREST does this and it was a pretty nice experience when I used it.

https://postgrest.org/en/stable/references/auth.html

hahn-kev2y ago

Yes I have, it was a nightmare. The biggest scaling issue was that on a server you must open a new db connection for each request.

There's also little benefit because you still need to track application user data, so you still need a user table.

sarreph2y ago

Yes, Firebase / Firestore has this capability. You grant access to collections (and even fields) using rulesets[0].

Firestore is a NoSQL db, so it's quite different paradigmatically, though.

[0] - https://firebase.google.com/docs/firestore/security/get-star...

cloutiertylerOP2y ago

SpacetimeDB is one such database. It has the concept of an `Identity` which is just a 256 bit unique identifier for a connected user.

paulddraper2y ago· 3 in thread

The relational model is one of those eternal diamonds that will be just as relevant in 2090 as in 1990.

slooonz2y ago

The biggest problem with the relational model is that it’s buried under the dirt that is SQL.

riku_iki2y ago

I kinda think adding some logical/semantic level on top of it (like datalog) will make it even more robust.

vacuity2y ago

Datalog (intentionally) isn't Turing-complete, though. That might hurt applicability for a generalized ECS model.
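To make the trade-off concrete, here is a minimal sketch of why a pure Datalog program always terminates. It evaluates the classic transitive-closure rules bottom-up in Python; the function name and the sample edges are made up for illustration and are not from any Datalog engine. Because no rule can invent new constants, the set of derivable facts is finite and the loop must reach a fixpoint.

```python
def transitive_closure(edges):
    """Naive bottom-up evaluation of the two Datalog rules:
        reach(A, B) :- edge(A, B).
        reach(A, D) :- reach(A, B), reach(B, D).
    """
    reachable = set(edges)
    while True:
        # Apply the recursive rule once to everything known so far.
        derived = {(a, d) for (a, b) in reachable for (c, d) in reachable if b == c}
        new_facts = derived - reachable
        if not new_facts:
            # Fixpoint: nothing new can be derived, so evaluation stops.
            return reachable
        reachable |= new_facts


# A cyclic graph still terminates: at most 3 * 3 = 9 facts can ever exist.
cycle = {("a", "b"), ("b", "c"), ("c", "a")}
closure = transitive_closure(cycle)
```

The same guarantee is what rules out unbounded computation, which is the limitation being pointed at for general game logic.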

jitl2y ago· 3 in thread

If the site designers are on here by chance, take a look on iPhone. The blog post is quite squished there, pretty much unreadable.

EDIT: nevermind, fixed after a refresh ¯\_(ツ)_/¯ still maybe worth a look

cloutiertylerOP2y ago

I am on here. Will definitely take a look!

marshray2y ago

While you're here... where's the scrollbar?

I'm on desktop (Edge) and don't see a scrollbar.

Very distracting.


vagrantJin2y ago

Another thing: a Firefox issue.

Firefox on Android no longer supports mobile view?


rendall2y ago· 2 in thread

Hi spacetimedb.com. You apparently want people to read about your product, but have a non-GDPR-compliant popup that requires people to uncheck multiple "vendors" if your readers do not want to be tracked.

Edit: it actually does not allow any unchecking. Just "you agree to these marketing and tracking cookies by using this site" Nope.

Baffling. It does not lend confidence in your core product. If you're not respectful of your casual readers' data, how can we expect you to be careful with your users' data?

I recommend zero tracking, or if you must, have it be genuinely opt-in.

cloutiertylerOP2y ago

Will fix!

rendall2y ago

7 days later, it's still there, unchanged.

cgh2y ago· 1 in thread

Andrew Kelley of Zig fame did an excellent talk on this very subject: https://vimeo.com/649009599

Strongly recommended viewing if you are interested in understanding why DoD is a big deal for performant applications (hint: it's about the cache).

cloutiertylerOP2y ago

This is an excellent talk for sure.

lijok2y ago· 1 in thread

SpacetimeDB looks great for game dev. I've been hearing chitter chatter in the FaaS world about connecting clients directly to the DB as a means of reducing the number of components and the complexity in simple-ish CRUD apps. Not sure how well that design holds up there, time will tell.

For those seeing this (SpacetimeDB) and immediately conjuring images of nightmares to be, consider the following: if you had an extremely latency-sensitive use case and had the opportunity to host your database and business logic on the same machine, why wouldn't you?

Nezteb2y ago

I agree!

I first learned about SpacetimeDB from a HN thread posted three months ago: https://news.ycombinator.com/item?id=37146952

stuhood2y ago· 1 in thread

The fundamental difference between an ECS and a struct/object layout is that an ECS is column-oriented (aka columnar), while a struct/object layout is row-oriented.

Everything else about how you might query these layouts is more superficial... you can provide the same API with either layout, the same way you can in relational database systems (both layouts can be queried with SQL, but with different performance characteristics.)
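As a rough illustration of the two layouts (a sketch only: `Particle` and the helper functions are hypothetical names, and Python's standard `array` module stands in for the packed storage a real ECS would use), the same query can be answered from either representation, but the columnar one reads just the field it needs:

```python
from array import array
from dataclasses import dataclass


@dataclass
class Particle:  # hypothetical record type, for illustration only
    x: float
    y: float
    mass: float


# Row-oriented: each entity is one object holding all of its fields.
rows = [Particle(0.0, 1.0, 2.0), Particle(3.0, 4.0, 5.0), Particle(6.0, 7.0, 8.0)]

# Column-oriented: one packed array of doubles per field; index i is entity i.
columns = {
    "x": array("d", (p.x for p in rows)),
    "y": array("d", (p.y for p in rows)),
    "mass": array("d", (p.mass for p in rows)),
}


def total_mass_rows(particles):
    # Visits every whole record even though only one field is used.
    return sum(p.mass for p in particles)


def total_mass_columns(cols):
    # Scans a single contiguous column and never touches x or y.
    return sum(cols["mass"])
```

Both functions expose the same "API" and return the same answer; only the memory access pattern differs, which is the performance point being made.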

gefjon2y ago

Are you familiar with any existing ECS implementations which expose a SQL interface?

masfoobar2y ago· 1 in thread

I started learning programming in the mid-to-late 90's.. a teenager learning with Turbo C, Turbo Pascal, and VB6... eventually to Visual C++... to then attempting to jump on the OOP bandwagon with Java, I began to dislike coding.

I was questioning whether this was the career but, after a few years, decided to give it a go.

Job interviews, particularly then, were about "OOP this" and "OOP that" and I would purposely be agreeable but was unhappy with the code I was writing. Eventually I would come out of my shell with views to other devs.

Roll on to 2014 watching Mike Acton's "Data Oriented Design" - and I immediately felt at home, not because I consider myself an 'expert' or in the same league as seasoned game programmers but I felt I had the right mindset all along!

In my 20 years I have primarily been a C# developer to pay my bills, but at home my personal projects are written in C, and I tried D.. and now Odin. I just prefer the control they give.

(Scheme as well, I do like Scheme)

Whenever someone asks me (a developer) what Data Oriented Design is, I try to explain to them it is like building a decent database. You are not thinking about the code (so much) but the representation of data. With databases, you create tables that have relations with other tables, with keys and indexes. You are setting up "lookups" etc.

Once they grasp the idea of building a database (which most developers can easily understand) - it is a case of transferring that energy not to tables in a database, but to data structures in your programming language.

Of course, this idea is easier to understand if you have experience in languages like C. For those coming from Python or Java or C#, etc, it can be a little trickier, but only if they think purely in the OOP mindset.

However, if they struggle to grasp this, then ECS is a great way for understanding in that OOP-ish way. The penny drops when I discuss Entities and Components. You are not building classes and inheritance, you are building entities and components. It is more flexible creating an Entity and "attaching" a Component for it to do something. Then you don't have some kind of Update method in a class, you just have one function which is passed everything of its type.
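A minimal sketch of that idea, assuming nothing about any particular ECS library (the `World`, `Position`, `Velocity` and `movement_system` names are invented for this example): components live in per-type tables keyed by entity id, and a single function updates every entity that has the components it needs.

```python
from dataclasses import dataclass


@dataclass
class Position:
    x: float
    y: float


@dataclass
class Velocity:
    dx: float
    dy: float


class World:
    """Hypothetical container: one table per component type, keyed by entity id."""

    def __init__(self):
        self._next_id = 0
        self.positions = {}
        self.velocities = {}

    def create_entity(self):
        entity = self._next_id
        self._next_id += 1
        return entity


def movement_system(world, dt):
    # One function handles every entity that has BOTH components,
    # much like an inner join of the two tables on entity id.
    for entity in world.positions.keys() & world.velocities.keys():
        pos = world.positions[entity]
        vel = world.velocities[entity]
        pos.x += vel.dx * dt
        pos.y += vel.dy * dt


world = World()
player = world.create_entity()
world.positions[player] = Position(0.0, 0.0)
world.velocities[player] = Velocity(1.0, 2.0)  # "attach" a component

rock = world.create_entity()
world.positions[rock] = Position(5.0, 5.0)     # no Velocity, so it never moves

movement_system(world, dt=0.5)
```

There is no `update` method on the player or the rock; behaviour comes entirely from which components are attached.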

Anyway - it all comes back full circle now reading this article. Great reading! Now I can just refer people to this link instead.

cloutiertylerOP2y ago

Wow, this really means a lot to me. I'm so glad you enjoyed the article!

samsquire2y ago· 1 in thread

Thank you, I enjoyed this post.

I am thinking that iteration is just traversal and traversal is just execution.

Take iterators in C++ or the standard library's "algorithms" library in C++. Or iterators in Rust or in Java.

You just want to traverse and collect values (and calls on functions in some interleaving), which is like joining tables as this article says.

I'm thinking that the definition of the traversal (such as Kafka pipeline, Clojure threading syntax, Clojure data driven design, JavaScript lodash pipeline) can be mapped to tuples, and then the computer can optimise arbitrary chains of traversals based on how many potential tuples are available and which traversals are equivalent paths or routes to the same traversal.
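One way to picture "equivalent routes to the same traversal" is the choice a query planner makes between join strategies. The sketch below uses made-up `players` and `scores` tuples; both functions produce the same joined tuples, so an optimiser would be free to pick whichever is cheaper for the data at hand.

```python
players = [(1, "ann"), (2, "bob"), (3, "cy")]   # (player_id, name)
scores = [(1, 10), (3, 7), (1, 4)]              # (player_id, points)


def nested_loop_join(left, right):
    # The "couple of for-loops" traversal: compare every pair of rows.
    return [(name, pts) for pid, name in left for spid, pts in right if pid == spid]


def hash_join(left, right):
    # Same result via a different route: index the right side once,
    # then look up each left row instead of rescanning.
    by_id = {}
    for spid, pts in right:
        by_id.setdefault(spid, []).append(pts)
    return [(name, pts) for pid, name in left for pts in by_id.get(pid, [])]


joined = nested_loop_join(players, scores)
```

The nested loop does work proportional to the product of the two sizes, while the hash version is roughly proportional to their sum, yet the caller cannot tell them apart from the output.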

In other words, every program is a compiler pipeline or database query pipeline.

Maybe Prolog and Datalog can help here. Optimisation of arbitrary traversals and determination of identical traversals.

Or monads are just ordered traversals and OCaml compilers are just traversals of execution (function application) and relationship following, which are joins.

The relational model is everywhere.

BoiledCabbage2y ago

> In other words, every program is a compiler pipeline or database query pipeline.

> Or monads are just ordered traversals and OCaml compilers are just traversals of execution (function application) and relationship following, which are joins.

> The relational model is everywhere.

This is interesting - could you elaborate a bit more?

inopinatus2y ago· 1 in thread

Author feedback: the large print, and the static elements (screen-thieving banner + “AI assistant” button), made attempting to read this blog as presented a vile experience on my phone. The AI button can’t be dismissed, if you ask it how to dismiss itself it doesn't know, and it won’t even accept negative feedback due to obscured submit buttons.

Fortunately there’s Reader mode.

cloutiertylerOP2y ago

Thanks for the feedback, we've definitely got to fix the mobile scaling stuff.

LarsDu882y ago

I think I'm still a bit on the fence after reading this post on whether building out relational queries on top of ECS will ultimately nuke the cache locality and multithreading performance benefits of ECS in the long run.

The great promise of ECS is that once we all have 128-512 core CPUs, we will actually make great use of them in the videogames of the future, rather than throwing away the vast majority of the performance, which is the current state of the vast majority of Unity and Unreal games.

Once you hit the real world and start making queries of everything, everywhere, all at once, maybe it'll be lock, lock, lock on your "database queries"

And also, it seems like this DB really demands that a game engine like Bevy actually get finished!

zubairq2y ago

I really resonated with this article and ended up at the same conclusion myself, even going so far as to replace a homegrown ECS system with a relational database in the browser

workfromspace2y ago

https://archive.is/8uFyy (original website requires JS)

slifin2y ago

There's a good interview by Juxt with Nathan Marz about his Rama project.

He makes the point that databases today are prescriptive about the structure of the data you provide to them.

I think databases as we know them today are not yet "end game"

slooonz2y ago

> I think that people perceive databases to be slow, not because they are slow, but because they often interact with them in the context of persisting data to disk, from across a network, by passing strings back and forth which must be parsed, compiled and executed. If you’ve ever optimized a program, you know that I just listed pretty much the slowest operations you can do on a computer.

I’m sorry, but the difference between "databases are slow" and "the only way I can interact with databases is slow" is only pedantry. I understand that, as a database developer working hard to squeeze out every bit of performance possible, this is frustrating, but as a developer I couldn’t care less about the difference.


