At the time I also thought it was obvious it was in preparation for e2ee (despite loud people on HN who disagreed).
I do wonder if they had intended to have it be default on though, maybe not since probably better for most users to have a recovery option.
By definition, encryption (with unique user keys) means you can't infer nor check what the content of the message is. Not without client cooperation, which is what this feature would have been.
> “Convergent encryption solves this problem in a very clever way:
“The way to make sure that every unique user with the same file ends up with an encrypted version of that file that is also identical is to ensure they use the same key. However, you can’t share keys between users, because that defeats the entire point; you need a common reference point between users that is unknown to anyone but those users.
“The answer is to use the file itself: the system creates a hash of the file’s content, and that hash (a long string of characters derived from a known algorithm) is the key that is used to encrypt said file.
“If every iCloud user uses this technique — and given that Apple implements the system, they do — then every iCloud user with the same file will produce the same encrypted file, given that they are using the same key (which is derived from the file itself); that means that Apple only needs to store one version of that file even as it makes said file available to everyone who “uploaded” it (in truth, because iCloud integration goes down to the device, the file is probably never actually uploaded at all — Apple just includes a reference to the file that already exists on its servers, thus saving a huge amount of money on both storage costs and bandwidth).
“There is one huge flaw in convergent encryption, however, called “confirmation of file”: if you know the original file you by definition can identify the encrypted version of that file (because the key is derived from the file itself). When it comes to CSAM, though, this flaw is a feature: because Apple uses convergent encryption for its end-to-end encryption it can by definition do server-side scanning of files and exploit the “confirmation of file” flaw to confirm if CSAM exists, and, by extension, who “uploaded” it. Apple’s extremely low rates of CSAM reporting suggest that the company is not currently pursuing this approach, but it is the most obvious way to scan for CSAM given it has abandoned its on-device plan.”
https://stratechery.com/2022/apple-icloud-encryption-csam-sc...
The file encryption part was based on using a hash of the file as the key.
It's always nice to later find out that one's quick amateur idea turns out to be an independent rediscovery of something legit. Now that I've learned it is called "convergent encryption" Googling tells me it it goes back to 1995 and a Stac patent.
So every file has a unique key? So thousands or 10,000s of keys would need need to be in the keychain, mapped to the file name.
And if one person's keys are leaked, they can be used prove that other people had the same file
No, this doesn't sound well thought out
I thought the same.
> despite loud people on HN who disagreed
Yeah, loud people be like that, but this is really Apple’s communication fault. They could have started with that “hey we want to provide e2e encrypted storage, the price of it will be that we need to scan what you upload for csam”.