If you, for instance, save all the user data such as preferences under a random userId, and then delete the personal data (email address, name, etc.) associated with that userId, I would expect this to be GDPR compliant without having to do a cascading delete.
It's a law, not a technical constraint. No one gives a fuck about some foreign key relations, they care that personal data cannot be accessed, or somehow reconstructed.
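A minimal sketch of the split described above, assuming SQLite and hypothetical table names (`personal_data`, `preferences`) that are not from any real system. Preferences hang off a random userId, and erasure removes only the row that ties that id to a person. Whether what remains is truly anonymous depends on the data itself, so treat this as an illustration rather than a compliance recipe.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: identifying fields live in one table only.
    CREATE TABLE personal_data (user_id TEXT PRIMARY KEY, name TEXT, email TEXT);
    CREATE TABLE preferences (user_id TEXT, key TEXT, value TEXT);
""")


def create_user(name, email):
    """Store the personal data under a random, meaningless userId."""
    user_id = uuid.uuid4().hex
    conn.execute(
        "INSERT INTO personal_data VALUES (?, ?, ?)", (user_id, name, email)
    )
    return user_id


def erase_personal_data(user_id):
    """Remove the identifying row; rows keyed by the random id are left alone."""
    conn.execute("DELETE FROM personal_data WHERE user_id = ?", (user_id,))


uid = create_user("Alice Example", "alice@example.com")
conn.execute("INSERT INTO preferences VALUES (?, ?, ?)", (uid, "theme", "dark"))
erase_personal_data(uid)

personal_rows = conn.execute("SELECT COUNT(*) FROM personal_data").fetchone()[0]
preference_rows = conn.execute("SELECT COUNT(*) FROM preferences").fetchone()[0]
```

After the erase, the preference row still exists but no longer points at a name or email address.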
"Anyone can design a lock that they themselves can't pick"
If you think you have anonymised data sufficiently, you may well not have done enough to prevent others from reconstructing it.
But on the topic in general, could someone explain to me what the real world consequences are likely to be for a small business not based in the EU, of not complying? If I've never cared where my users were as long as their payments cleared (oh, is that where they get you? the payment processor?), and I'm selling handcrafted bobbins online in Canada without letting people delete their email address, what is likely to happen if someone complains to EU authorities?
Databases such as Cassandra are designed so that an update doesn't actually remove the old data until some time later, so frequent updates degrade performance and waste storage. Other databases that allow data to be overwritten immediately suffer fragmentation, and thus declining performance and wasted storage, until you compact (basically recreating the entire database), which is not something you want to do all the time, especially on SSDs.
1. GDPR gives you one month to respond (extendable by two further months for complex or numerous requests). You don’t have to run VACUUM every day.
2. The entire point of my post was acknowledging that there are costs to being GDPR compliant, and why it’s responsible to have that cost.
If it takes a week to garbage collect that's fine, it just can't stick around forever.
> "you could easily not switch to a CASCADE, but instead set delete=1 and mark every sensitive field with a special value"
Emphasis on the part after “and”.
I can see two reasons why this would be a problem:
You have a really shitty un-normalized database design. Granted, you may have to denormalize specific columns for performance reasons. But why that would be the case with, for example, names, phone numbers or sexual preferences totally escapes me.
Or, you're referring to actual cascading deletes, meaning that you need to get rid of child relations, based on deletion of the parent relation. If this poses a problem then I'd argue that you're guilty of a shitty database implementation, arguably with criminally bad definition of your primary / foreign key pairs.
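For the second case, here is a rough sketch of what a declared cascading delete looks like, assuming SQLite and a made-up `users` / `addresses` schema. With the foreign key defined properly, deleting the parent row takes the child rows with it. Note that SQLite only enforces foreign keys when the pragma is switched on for the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Hypothetical schema for illustration only.
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE addresses (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
        street TEXT
    );
""")

conn.execute("INSERT INTO users (id, name) VALUES (1, 'Alice Example')")
conn.execute(
    "INSERT INTO addresses (user_id, street) VALUES (1, '1 Example Street')"
)

# Deleting the parent row removes the dependent address rows as well.
conn.execute("DELETE FROM users WHERE id = 1")

remaining_users = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
remaining_addresses = conn.execute("SELECT COUNT(*) FROM addresses").fetchone()[0]
```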
I really don't see a problem here, unless the database schema is implemented in a totally incompetent manner.
Edit: Clarity
But often all you need to do is overwrite the name, address, or similar bits of information, and you can then leave the rest of the data intact and set your delete flag.
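As a sketch of that approach, assuming SQLite, a hypothetical `users` table, and an arbitrary `[deleted]` sentinel value: the personal columns are overwritten, the delete flag is set, and everything else on the row stays put.

```python
import sqlite3

REDACTED = "[deleted]"  # arbitrary sentinel chosen for this example

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        name TEXT,
        address TEXT,
        plan TEXT,                          -- non-personal data we want to keep
        deleted INTEGER NOT NULL DEFAULT 0  -- the delete flag
    )
""")
conn.execute(
    "INSERT INTO users (id, name, address, plan) "
    "VALUES (1, 'Alice Example', '1 Example Street', 'premium')"
)


def soft_delete(user_id):
    """Overwrite the personal fields and set the delete flag in one statement."""
    conn.execute(
        "UPDATE users SET name = ?, address = ?, deleted = 1 WHERE id = ?",
        (REDACTED, REDACTED, user_id),
    )


soft_delete(1)
row = conn.execute(
    "SELECT name, address, plan, deleted FROM users WHERE id = 1"
).fetchone()
```

The row keeps its primary key and `plan`, so foreign keys elsewhere keep working. The overwritten values may still linger in backups or uncompacted storage for a while, which is the garbage collection point made earlier in the thread.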