That said, this article is incorrect on at least one point: de-duplication does not require Dropbox be able to decrypt your files. tzs came up with this clever scheme in a previous comment: http://news.ycombinator.com/item?id=2461713
Of course even if Dropbox didn't have the keys to decrypt your files you're still trusting them (or SpiderOak or Wuala or most of Dropbox's competitors) by running their proprietary software. But I suppose people are more concerned about subpoenas and compromised servers than malicious actions by Dropbox themselves.
You don't have to trust tarsnap. Block encryption happens client side using open source software.
For instance, say you are a whistleblower and you have a stash of documents nobody should know you have. Your opponent, having a copy of those documents, can produce an identical encrypted file. What's more, Dropbox obviously already has a mechanism to look up digests, so checking whether the document is stored with Dropbox or not is probably a matter of milliseconds.
Also, as someone pointed out, deriving the key from the cleartext is probably a very Bad Idea.
The only workable approach I can think of is to encrypt and decrypt data on the client. Any scenario where encryption takes place on the server is suspect.
Well, fails in that the current operation of Dropbox was a clue that they were not employing anything like tzs's system.
Web-based upload would have required three time/CPU intensive steps: Hashing the contents of the file to create the related encryption key, encrypting the file with said key, then re-hashing the file to get the new "fingerprint" of the encrypted version.
The reason this would all be required is that Dropbox is not only worried about deduplication for storage - they're worried about it for bandwidth saving. When you upload a file to Dropbox that they've "seen", they give you instant upload credit for it and skip the entire process (which users enjoy for the speed and they enjoy for the pocketbook in bandwidth savings).
With an appropriate hash and a block cipher choice, yes, they could do this without creating a client-side duplicate/encrypted file. They could do it as they go (hash the entire file, start encrypting it into a short buffer, then start hashing the buffer) - but if they're not storing a duplicate/encrypted copy locally, they're going to have to re-encrypt it as they go (again) during the upload phase.
So that's: One hash for the key, one encrypt + rehash for the fingerprint, then one more re-encrypt for the upload.
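To make the three steps concrete, here is a minimal Python sketch of the hash-derived-key idea as it is described above. It is only an illustration under assumptions: the XOR keystream built from SHAKE-256 is a toy stand-in for a real block cipher, the function names are made up for this example, and nothing here reflects what Dropbox or tzs actually specified.

```python
import hashlib

def keystream_xor(key: bytes, data: bytes) -> bytes:
    # Toy stand-in for a real cipher (assumption): XOR the data with a
    # SHAKE-256 keystream derived from the key. Applying it twice with the
    # same key restores the input. Not suitable for real use.
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))

def convergent_encrypt(plaintext: bytes):
    # Step 1: hash the file contents to derive the encryption key.
    key = hashlib.sha256(plaintext).digest()
    # Step 2: encrypt the file with that key.
    ciphertext = keystream_xor(key, plaintext)
    # Step 3: hash the ciphertext to get the fingerprint used for dedup.
    fingerprint = hashlib.sha256(ciphertext).hexdigest()
    return key, ciphertext, fingerprint

# Two users with the same file end up with the same ciphertext and fingerprint,
# so the server can deduplicate without ever seeing the key.
alice = convergent_encrypt(b"same file contents")
bob = convergent_encrypt(b"same file contents")
other = convergent_encrypt(b"different file contents")
print(alice[2] == bob[2], alice[2] == other[2])
```

The point of the sketch is only the cost: every upload needs a full pass for the key, a full pass to encrypt, and a full pass over the ciphertext for the fingerprint before the client can even ask whether the server has seen the file.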
... and then you would run into the other problem I mentioned - where you also have to download the remote database, decrypt it locally, add your new entry to it, then re-upload the encrypted version. All from the web client.
Considering how quickly the upload starts - it's pretty obvious they're doing nothing even remotely like this.
(I understand that we know for a fact Dropbox has access to the files on their servers - I just wanted to expand on the proof that, based on how they were operating, that they couldn't possibly not have access. But then, I've been saying this all along.)
That's not semantically secure. Anyone can distinguish whether a particular plaintext produced a given ciphertext.
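A small Python sketch of what that distinguishing test looks like when the key is derived from the plaintext. This is an assumption-laden illustration: the SHAKE-256 XOR keystream is a toy stand-in for a real cipher, and `stored_fingerprints` and `server_has` are hypothetical names for whatever digest lookup the storage side keeps.

```python
import hashlib

def fingerprint_for(plaintext: bytes) -> str:
    # Key derived from the plaintext itself, so the whole pipeline is
    # deterministic: same plaintext in, same fingerprint out.
    key = hashlib.sha256(plaintext).digest()
    stream = hashlib.shake_256(key).digest(len(plaintext))  # toy cipher (assumption)
    ciphertext = bytes(a ^ b for a, b in zip(plaintext, stream))
    return hashlib.sha256(ciphertext).hexdigest()

# Hypothetical server-side index of fingerprints for stored files.
stored_fingerprints = {fingerprint_for(b"leaked memo, version 3")}

def server_has(candidate: bytes) -> bool:
    # Anyone who already holds a candidate plaintext can recompute the
    # fingerprint and test for membership, with no key required.
    return fingerprint_for(candidate) in stored_fingerprints

print(server_has(b"leaked memo, version 3"))
print(server_has(b"leaked memo, version 4"))
```

A cipher with a random key or nonce would give a different ciphertext on every encryption, which is exactly the property that deduplication on ciphertext gives up.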
What are the practical implications? The only one I can think of is if someone were trying to prove in court that you had a particular file.
A related question I've been wondering about is whether a hash or key+ciphertext are considered proof of possession of the original file. In theory an infinite number of files have the same hash, and the ciphertext could have been produced using a different key (or it could just be random data). In practice it's extremely unlikely (for sufficiently secure hash and encryption functions). What are the laws'/courts' views on cryptography?
I use dropbox and I appreciate it, but I find communications from you and Arash to be strangely borderline ... off.
A discussion from three weeks ago is not "an old issue." That these issues made it to the FTC in three weeks is probably a remarkable benchmark.
As Arash did several weeks ago, with respect to the password and key issue, you seem to miss the point and almost intentionally dismiss the issue by detailing it as "old issues."
These issues are anything but old, even in internet time.
I would value Dropbox much more than I do if I found that you and Arash could speak with the integrity I find from so many other entrepreneurs.
1. Can Dropbox access user files? (already answered, yes)
2. Did you ever claim otherwise?
3. If so, why? And how will you appease any users that were misled? If not, where is the flaw in this complaint and various others?
The kind of vague PR-speak in that blog post will only irritate this crowd and drag the whole thing out. IMHO, the fastest way to make this go away is to take a firm position and state it clearly. And if you don't want to do that then it's probably best to say nothing at all.
Accessing files from the website and file previews can use a key that I give you when I login. The only one where you need to have a key is if I share the file with someone. Can't you throw up a prompt when I elect to share a file that says, "This file will now be accessible to person XYZ. In order to do this DropBox will re-encrypt with our own private key"?
I suspect a lot of people don't share most/all of their files with anyone. It would be nice to have privacy by default and then opt-out when they decide to share it.
Dropbox advocates TrueCrypt in one breath, but refuses to integrate client-only encryption keys in the client with the next breath. Obviously they know what we all know: TrueCrypt presents a poor UX for non-technical users, and so most people won't use it even if it's recommended. Then DropBox gets to be the hero for advocating TrueCrypt while they get de-dup efficiency because they know few people actually bother using TrueCrypt.
Any transfer of keys to dropbox, even temporarily, means your data on dropbox should be considered insecure. You have no way to know what's going on on the dropbox side. The same issue arises for CAs that let customers generate SSL keys within a web interface, to avoid the nuisance of having to generate a key/cert/csr themselves and uploading that. It doesn't matter if the CA promises never to store the private key. It's insecure and it's bad practice.
If they can't decrypt it, they can't deduplicate it.
If they couldn't deduplicate it, the costs would be higher.
Higher costs => higher prices
It still needs a lot of improvement, but it's good. It's built on a different philosophy, so the workflow is slightly different; however, in some areas (what to back up, what to sync, and how) it has more flexibility.
One missing feature is syncing directories with others. Probably the architecture that gives the security makes this a difficult task.
To compensate for that, there's a pretty good feature of web-share. Can share anything you backed up, with a separate password that you can revoke any time.
So in the end: I keep Dropbox but removed most of my files from there, started with SpiderOak, and playing with AeroFS (similar functionality on peer-to-peer architecture).
...charges that Dropbox “has and continues to make deceptive statements to consumers regarding the extent to which it protects and encrypts their data,” which amounts to a deceptive trade practice that can be investigated by the FTC.
I like Ryan Singel. He was the first reporter to write about YC. So I can't really begrudge him the pageviews he knew he'd get from HN over this. But 'taint really news.
But I'll disagree about the news value. The complaint's allegations about what Dropbox promised, versus how the architecture actually works, are pretty strong.
Soghoian knows his tech and he knows the FTC (he used to work for them). I like Dropbox. I use Dropbox. But the blog post Dropbox keeps pointing to doesn't explain the discrepancy between what users were told about security/privacy and how the service works in practice (centralized encryption keys).
It is referenced in both as a complaint and is news for wired readers, not necessarily here (though the FTC complaint being filed here is new).
The FTC regulatory model depends on market participants to tip them off when companies are breaking consumer law, so this could be the first step in an eventual action against DropBox.
Dropbox seems to have decent transport security, which is more than 99% of apps. Keeping files in dropbox might also keep unencrypted and out of date copies from persisting on local drives, random USB thumb drives, borrowed computers, computers at the print shop, etc.
Yes, someone who compromises dropbox, or a rogue high-level dropbox employee, or a law enforcement officer with a warrant, could get access to your files on disk at dropbox. Dropbox is probably not the weak link, though.
For a normal user (individual or corporate), trusting dropbox is not any worse than trusting gmail or anyone else who has your data and has market, legal, and other reasons to keep it safe for you. I've met with a bunch of top-tier attorneys in the past couple weeks, and none of them want to mess with PGP; they trust that if gmail or an outsourced exchange provider were snooping on their messages, there would be legal recourse; sure, it's an issue if there's no way to prove it, but generally they are pretty trusting of major service providers.
I personally don't use dropbox for anything except "public" files, because I try to constrain long-term storage of my data to my own infrastructure, or something encrypted end to end and fully under my control. However, dropbox is probably a cut above the effective level of security most organizations or individuals have in practice.
I'd sure prefer if dropbox did client-side encryption and never had access to the keys, but then you'd also need to trust that the dropbox binary doesn't secretly send your password to Russia, and that no future version of the dropbox binary that you use has the send-to-Russia feature added. And, you'd need to trust that none of the devices from which you access dropbox has been keyloggered, trojaned, etc.
Of course, dropbox seems pretty robust in terms of availability; I just lost an SSD which didn't have timely backups of certain files, something which is going to ruin my weekend and which would have been avoided had I been less paranoid and used dropbox more.
(and, I'm working on solving the issues with trusting remote services, actually...)
Unless you're using the mobile app.
| For a normal user (individual or corporate), trusting dropbox is not any worse than trusting gmail or anyone else who has your data and has market, legal, and other reasons to keep it safe for you.
Except Google doesn't (afaik) lie about their ability to en/decrypt or view your personal data.
Not so fast. How big is the binary diff when you change a file within a Truecrypt volume? Ie, how much Dropbox bandwidth will you be using, even with a small change?
I performed the following experiment. Start with a 250M Truecrypt volume. Mount it. Create a 1M file from /dev/random. Unmount the volume.
Now, look at what blocks in the Truecrypt volume file have changed. Dropbox uses a 4M blocksize [1].
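# Hash each 4M block of the volume file before the change.
# The first dd only seeks to block $i (count=0 copies nothing); the second reads one block.
seq 0 63 | while read i; do
  { dd bs=4M skip=$i count=0; dd bs=4M count=1; } < truecrypt.bin.before | md5sum
done 2>/dev/null | sort > md5s.before
# Same for the volume file after the change.
seq 0 63 | while read i; do
  { dd bs=4M skip=$i count=0; dd bs=4M count=1; } < truecrypt.bin.after | md5sum
done 2>/dev/null | sort > md5s.after
# Count block hashes that appear only in the "after" list.
comm -13 md5s.before md5s.after | wc -l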
8
Conclusion: in this case, Dropbox will transfer 32M (8x the normal 4M) because I added a 1M file to my Truecrypt volume. Note: I haven't tried adding bigger files, but suspect the number of blocks changed will go up linearly but steeply with the size of the added file. It's not actually that surprising that TrueCrypt mixes file changes throughout the volume file.
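For anyone who wants to repeat the experiment without the dd pipeline, here is a rough Python equivalent. It is a sketch, not the method used above: it compares blocks position by position rather than diffing sorted hash lists, it uses SHA-256 instead of md5sum, and the function names and the 4M default are assumptions for illustration.

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # 4M, the block size quoted above

def block_digests(path, block_size=BLOCK_SIZE):
    # Return one SHA-256 hex digest per fixed-size block of the file.
    digests = []
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            digests.append(hashlib.sha256(block).hexdigest())
    return digests

def changed_blocks(before_path, after_path, block_size=BLOCK_SIZE):
    # Count blocks whose digest differs at the same position, plus any
    # blocks that exist in only one of the two files.
    before = block_digests(before_path, block_size)
    after = block_digests(after_path, block_size)
    differing = sum(1 for b, a in zip(before, after) if b != a)
    return differing + abs(len(before) - len(after))
```

Calling `changed_blocks("truecrypt.bin.before", "truecrypt.bin.after")` and multiplying by 4M gives an estimate of what a block-level sync would have to upload, assuming it works on fixed 4M blocks.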
Why bring this all up? Because something that does client side block encryption (tarsnap is an example) would only transfer the affected block. 4M, if that's the block size.
And you don't have to trust the cloud storage provider at all.
EDIT: My pipelines were wrong on the first go, suggesting a much larger number of differing blocks. Sorry about that.
I know you will say that the hashes are long enough so this should not happen until dropbox has trillions of files, etc. But those calculations are all based on assumption of random data in the files. We all know that various computer files may have structured and patterned data. It is possible for the data in certain types of files to be structured in such a way as to produce a much narrower range of possible hashes than generally assumed.
And with 25 million users and hundreds of millions of files, God knows what may happen.
Hashing is not the same as compression - we should all know that by now. Pigeonhole principle and all that.
In a 256 bit hash, there are 2^256 possible hash values. There are far more than 2^256 possible values that can be hashed. Therefore, hash collisions are inevitable.
There is no way to take a hash and expand it back to a unique original value, or it would be a compression algorithm, not a hash.
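Collisions being inevitable in principle is a separate question from how likely one is at a given scale. A back-of-the-envelope Python sketch, assuming the hash output behaves like uniformly random bits (the design goal of a cryptographic hash, and precisely the assumption questioned above) and taking one billion files as a made-up round number:

```python
import math

def log2_collision_bound(num_files: int, hash_bits: int) -> float:
    # Union (birthday) bound: P(any collision) <= pairs / 2**hash_bits,
    # where pairs is the number of distinct file pairs.
    # Returned as a base-2 logarithm because the value is far too small
    # to be useful as a float.
    pairs = num_files * (num_files - 1) // 2
    return math.log2(pairs) - hash_bits

# Hypothetical scale: 10**9 files, 256-bit hash.
print(log2_collision_bound(10**9, 256))
```

Under those assumptions the bound comes out to roughly 2^-197. The number says nothing about a hash whose output is biased for structured inputs; it only shows how much slack there is if the uniformity assumption holds.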
http://www.forbes.com/forbes/2010/1206/technology-chris-sogh...
http://www.thedailybeast.com/blogs-and-stories/2011-05-13/fa...
He slags off PR guys, but his goal is PR for himself.
You're definitely correct about him being a PR seeker. I'm not too familiar with this fellow, so if he is slagging off PR people it would certainly be hypocritical, but so what? I think this guy is providing a valuable service by exposing misconduct by tech companies.
That Dockstar currently turns on and blinks amber. Seagate tells me it is dead.
That and the puny bandwidth of my cable connection and the costs of backing up my diskdrives, leads to a desire to outsource that pain.
So whether it is Dropbox, Google, Wuala, or many other solutions, I would like to find reasonably secure cloud storage, and I would be willing to pay for that (and I do.)
Cloud storage isn't necessarily bad, but it's expensive considering the cost of storage these days and it might be a good idea to encrypt anything that you don't want to become public information before uploading it onto a cloud server.
Mind you, expecting storage that is secure against disclosure to law enforcement or government agencies is _also_ missing the point of Dropbox.
Now, I expect that for all intents and purposes the encryption/security employed by Dropbox is 'good-enough' that I don't have to worry about random-internet-user gaining access to my docs, yet I have absolutely NO illusions that ANY company will refuse to hand over my data to the feds should the feds be seeking it.
Further, I would suggest that anyone with anything they don't want the feds to know about/get their mitts on not be stupid enough to store said sensitive secrets IN THE FUCKING CLOUD.
Additionally, I can understand that Drew may not be the most savvy in navigating such issues given him being a young CEO and all - and I can understand that he would want all the DropBoxians to feel comfortable with the safety and security of their data in his hands - but I would like to see a frank, real-world answer to any security claims which delineate in no-uncertain-terms exactly what level of data safety, security and encryption one may expect.
Drew may even do well as to explicitly say "We shall not refuse to hand over any of your data (and its revision history) to the Feds should they come seeking it with legal merit."
If, after such a statement people are concerned about their data going anywhere -- they should get off dropbox / implement truecrypt as stated.
Finally, a question for Drew: given this craptastic event; would Drop Box be open to much more robust file encryption tools being developed as an addon to DropBox; e.g. a third party wrapper application that allows end-to-end encryption while still allowing the web UI etc to work?
(If I misread the circumstances of the whole issue - forgive my little rant)
No, the concern is that Dropbox led people to believe that by use of encryption, Dropbox was preventing user files from being accessible to anyone except that user, which isn't actually true, and that Dropbox gained unearned competitive advantage because of that untruth.
Technically-savvy users who know (or more to the point, care) how Dropbox works behind the scenes may be able to figure out that user files had to be accessible to Dropbox (the "but how could they de-dupe files?" argument). Bully for them, but the fact that some people understand why an advertising claim is misleading doesn't make it okay for that claim to be misleading in the first place.
"New TOS:
Compliance with Laws and Law Enforcement Requests; Protection of Dropbox’s Rights. We may disclose to parties outside Dropbox files stored in your Dropbox and information about you that we collect when we have a good faith belief that disclosure is reasonably necessary to (a) comply with a law, regulation or compulsory legal request; (b) protect the safety of any person from death or serious bodily injury; (c) prevent fraud or abuse of Dropbox or its users; or (d) to protect Dropbox’s property rights. If we provide your Dropbox files to a law enforcement agency as set forth above, we will remove Dropbox’s encryption from the files before providing them to law enforcement. However, Dropbox will not be able to decrypt any files that you encrypted prior to storing them on Dropbox."
That's actually a lie too. Dropbox does encrypt your files, it's just that, naturally, they hold the key. If I ask Dropbox for another user's files, guess what? They don't hand them over.
If your info is really that sensitive then for heaven's sake don't outsource encryption and key management to a third party you have no supervision over. Encrypt your super sensitive files with Truecrypt and then share/sync them with Dropbox.
Even if dropbox's claim was technically correct, it was absolutely misleading. When you say "this data is encrypted" people assume that you mean "... in a way which adds security"; if the same people who have access to the encrypted data also have access to the decryption keys, you might as well be using ROT-13.
If I ask Dropbox for another user's files, guess what? They don't hand them over.
Modulo the recently-fixed vulnerability which allowed you to download data if you knew some hashes, that is.
Seriously, who cares where the "complaint" was sent? Either it's a valid argument or it's not. Where it was sent should have no bearing.
The argument that Dropbox did this to save money is transparently bogus too. That's in there to make it seem like the FTC has grounds for getting involved. Dropbox clearly chose to store keys themselves so they could offer core features like web/public sharing.
Grandma: My computer crashed, I got a new one, but I've forgotten my dropbox password. Dropbox: Okay, can you go get the printout when you registered saying "print this out and never lose it"? Grandma: I didn't print that out / I lost it / the house burned down. Dropbox: Sorry, then you're screwed because we designed our service to be usable only by those who have a deep understanding of computer security.
But even then, if Dropbox never stored the decryption keys on their servers anywhere, and the decryption key was stored only on a client PC, and I lost my computer, I would not be able to access the backed-up data from Dropbox on a new computer. That would kind of defeat the purpose of Dropbox for me. As many others have pointed out (including Lifehacker) you can always use Truecrypt to put some stuff in your Dropbox that no one but you can decrypt.
As far as the "feds" getting my data, if they are after me, they can get a search warrant from a judge and come into my house and confiscate all of my computers, which would allow them to access any data on my harddrives not encrypted with Truecrypt...
Personally, every file in my Dropbox is unique... I wonder how many people use it for storing deduplicatable content like mp3s and videos etc.