> We are absolutely talking about billions of records.
> But when you say "return excess data to the application" I'm not sure what you mean. Not doing lots of complex JOINs doesn't mean not filtering the queries at the DB at all. Nobody is pulling back a billion records at a time.
The question then becomes "Are you actually gaining anything by not using foreign keys in the database?" What are the JOINs really costing you in practice? And how denormalized is your database already, if JOINs are that costly?
If you want raw speed at billions-of-records scale, you want as flat a schema as you can get, with good indexes that actually fit in RAM.
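As a toy illustration of that trade-off (hypothetical schema, SQLite for portability; at real scale you would measure on your own engine and data), the same question can be answered through a foreign-key join or from a flat table where the joined column has been copied onto each row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.executescript("""
    -- Normalized: two tables related through a foreign key.
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         customer_id INTEGER REFERENCES customers(id),
                         total REAL);
    CREATE INDEX idx_orders_customer ON orders(customer_id);

    -- Denormalized: the region is copied onto each order row.
    CREATE TABLE orders_flat (id INTEGER PRIMARY KEY, region TEXT, total REAL);
    CREATE INDEX idx_flat_region ON orders_flat(region);
""")

cur.executemany("INSERT INTO customers VALUES (?, ?)",
                [(i, "east" if i % 2 else "west") for i in range(100)])
cur.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(i, i % 100, float(i)) for i in range(1000)])
cur.executemany("INSERT INTO orders_flat VALUES (?, ?, ?)",
                [(i, "east" if (i % 100) % 2 else "west", float(i))
                 for i in range(1000)])

# Same question, two shapes: total sales for one region.
joined = cur.execute("""
    SELECT SUM(o.total) FROM orders o
    JOIN customers c ON c.id = o.customer_id
    WHERE c.region = 'east'
""").fetchone()[0]

flat = cur.execute(
    "SELECT SUM(total) FROM orders_flat WHERE region = 'east'"
).fetchone()[0]

print(joined == flat)  # same answer; the flat query needs no join at all
```

The flat query filters a single indexed column, so the working set is just that one index, which is what makes keeping it in RAM plausible at large row counts.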
By that point, though, you should be able to recognize that your use case is not the 90% (or even 95%) case, and that your specific requirements justify doing something different. That's very different from the vague "high-volume" statement you made at first.
My experience comes from helping clients migrate to data warehouses, specifically during the heyday of Hive/Hadoop/Spark years ago, and watching clients mistakenly believe they were "big data" and go down the rabbit hole of scaling out before it was actually necessary. The problem I have with the vague notion of "high-volume" is that I saw clients with 500m records believe they fit the bill, as well as clients with as few as 20m records who thought the same. In reality, neither of them did, and they wound up wasting a lot of money in pursuit of slower systems.
> Here's an example of what I'm talking about - read the first comment on [...], another venue for this same debate.
That's not an example, though; it's just some vague statements about needing to profile your query and evaluate the costs for yourself, which should be rather obvious.
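For what it's worth, "profile your query" doesn't have to stay vague. Most engines will tell you their plan directly; here's a minimal sketch using SQLite's `EXPLAIN QUERY PLAN` (table and index names are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, "
             "customer_id INTEGER, total REAL)")

def plan(sql):
    # EXPLAIN QUERY PLAN rows are (id, parent, notused, detail);
    # the human-readable part is the detail column.
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

q = "SELECT total FROM orders WHERE customer_id = 42"

before = plan(q)  # no index yet: a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders(customer_id)")
after = plan(q)   # now an index search on customer_id

print(before)  # e.g. ['SCAN orders']
print(after)   # e.g. ['SEARCH orders USING INDEX idx_orders_customer (customer_id=?)']
```

Postgres (`EXPLAIN ANALYZE`) and MySQL (`EXPLAIN`) give the same kind of answer with real cost and timing numbers, which is what you'd actually want before deciding JOINs are too expensive.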