If such a database exists, could you imagine the public outrage that would result if someone hacked/leaked it? I bet there are a bunch of images that only a handful of people ever had access to documented by our government...
[1]: The hash is not necessarily something like MD5; it might be a more sophisticated photo fingerprint.
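To illustrate the difference between a cryptographic hash and a photo fingerprint, here is a minimal sketch. It uses a toy "average hash" (one bit per pixel, set when the pixel is brighter than the image mean). This is an assumption chosen for illustration only: it is not PhotoDNA, and the 4x4 image and the function names are hypothetical.

```python
import hashlib

def average_hash(pixels):
    """Toy fingerprint: one bit per pixel, '1' if brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if p > mean else "0" for p in flat)

def hamming(a, b):
    """Number of positions where two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

def md5_of_pixels(pixels):
    """MD5 of the raw pixel bytes (values must be in 0..255)."""
    return hashlib.md5(bytes(p for row in pixels for p in row)).hexdigest()

# Hypothetical 4x4 grayscale image: dark left half, bright right half.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [18, 28, 208, 218],
]
# The same picture, slightly brightened (every pixel +3).
tweaked = [[p + 3 for p in row] for row in original]

# The cryptographic hash changes completely on any byte change...
md5_differs = md5_of_pixels(original) != md5_of_pixels(tweaked)
# ...while the toy fingerprint is unchanged by uniform brightening.
fingerprint_distance = hamming(average_hash(original), average_hash(tweaked))
```

An exact hash only catches byte-identical copies, while a fingerprint of this kind is meant to survive small edits, at the cost of possible false matches.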
Edit: Anyway, I have no idea why this is on HN again, for the nth time. It's old news, and automated photo fingerprinting is not exactly a huge issue for cloud storage where you store unencrypted data in the first place. It's just another machine reading the bytes of your data, and those bytes have been read by many other machines as well. They don't even use this for advertising or anything else that could make you the product. They don't even check for pirated software or anything like that, so this is not exactly groundbreaking news.
When local police get a warrant to do a forensic search of computers, the judge often requires that they search hard drives using known image/file hashes, as opposed to looking through each file by hand, in order to protect that person's constitutional right to privacy.
This right primarily applies to personal computers and is not as clearly defined by law when the files are stored on a remote cloud service. Whether cloud services carry a similarly high expectation of privacy is a recurring debate in criminal law, since storing your whole life in cloud services is a relatively new phenomenon.
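A hash-only search of that kind can be sketched roughly as follows. This is an assumption about the general technique, not a description of any real forensic tool: the choice of SHA-256, the function names, and the demo files are all hypothetical.

```python
import hashlib
import os
import tempfile

def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large files are never loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def find_known_files(root, known_hashes):
    """Return paths under root whose SHA-256 is in known_hashes.

    File contents are only ever hashed, never displayed to the examiner.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if sha256_of_file(path) in known_hashes:
                    matches.append(path)
            except OSError:
                continue  # unreadable file: skip it
    return sorted(matches)

# Demo on a throwaway directory with two hypothetical files.
with tempfile.TemporaryDirectory() as tmpdir:
    target = os.path.join(tmpdir, "a.bin")
    other = os.path.join(tmpdir, "b.bin")
    with open(target, "wb") as f:
        f.write(b"known contents")
    with open(other, "wb") as f:
        f.write(b"unrelated contents")
    known = {hashlib.sha256(b"known contents").hexdigest()}
    matches = find_known_files(tmpdir, known)
```

Only files whose hashes are on the list get flagged; everything else stays unread by a human, which is the privacy argument for this kind of search.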
I also doubt that if they get a warrant, they'll only scan the drives for the hashes. That seems highly unlikely unless the FBI/prosecutors themselves specify in their request to the judge that they only want to do the hash scanning - but I don't see why they would want to limit themselves to that, when they know the judge is likely to give them full access to it, if they already have some evidence of "criminal activity", without which I assume they wouldn't even bother getting a warrant.
It depends on the case, I guess. I heard this from a lawyer I know who has worked on CP cases. Judges always try to limit the privacy exposure in every warrant. This also makes it easier for the police to obtain warrants, since a narrow search requires a lower threshold of probable cause than, say, searching an entire house.
Especially if the police found the person because they used a P2P site to download a particular file or had a hash in their email account. A minimal warrant to search for that hash is much easier to defend in court than a full data search when the defense argues that constitutional privacy rights were violated.
The police have to go through this minimization process in phone wiretaps as well. They aren't supposed to listen to or store the phone call unless the information is relevant to the case.
These limitations are clearly defined in warrants. It is up to the police and forensic experts to follow them, of course. But I'd imagine that in high-profile cases the judge/FBI wouldn't think twice about finding it appropriate to look at every file.
I remember (but can't find for the life of me) an article somewhere that stated you don't want to put something like a TrueCrypt volume on a service that does versioning, since the changes in the encrypted data each time you change something can be used to leak information.
If it does not change often, then it does not leak much information, naturally.
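A rough sketch of what an observer with two stored versions could learn: without decrypting anything, they can see which blocks of the container changed between snapshots. The 512-byte block size, the random bytes standing in for ciphertext, and the function name are assumptions made for illustration.

```python
import os

def changed_blocks(old, new, block_size=512):
    """Return the indices of blocks that differ between two snapshots."""
    length = max(len(old), len(new))
    changed = []
    for offset in range(0, length, block_size):
        if old[offset:offset + block_size] != new[offset:offset + block_size]:
            changed.append(offset // block_size)
    return changed

# Random bytes stand in for an encrypted volume of 8 blocks.
snapshot_1 = os.urandom(8 * 512)

# Simulate an edit that rewrote blocks 2 and 5 of the container.
edited = bytearray(snapshot_1)
edited[2 * 512 + 10] ^= 0xFF  # XOR with 0xFF always changes the byte
edited[5 * 512] ^= 0xFF
snapshot_2 = bytes(edited)

# The observer learns where the writes landed, though not their content.
leaked = changed_blocks(snapshot_1, snapshot_2)
```

With many versions, the pattern of which blocks change and how often is the kind of information the versioning history gives away.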
The TrueCrypt docs warn about using a dynamic volume [1]. I'm not concerned about leaking info about which volume sectors are unused.
[1] http://andryou.com/truecrypt/docs/creating-new-volume.php
2. Rename the file .DS_Store and copy to victim's cloud folder
3. They go to jail and get killed by another inmate a year into their sentence. Justice prevails!
You can learn more about the tech here: http://en.wikipedia.org/wiki/PhotoDNA