But scrypt can be over 100,000,000 times stronger than MD5 -- so if you're using scrypt you can afford to use a password which is 100,000,000 times weaker. "jdtwbv" hashed using scrypt is stronger than "H.*W8Jz&r3" hashed using MD5.
Is it? I'm not sure.
For the first one you're using lowercase letters (and digits; I'm giving you those for free).
For the first one we have 36^6. For the second one (all printables), 100^9.
Ratio between them: ~459,393,658. If you're saying scrypt is 100M times better, then in this case the second one is still safer.
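For anyone who wants to check that arithmetic, here is a quick Python sketch. It only uses the figures assumed in the comment above (36 symbols over 6 positions, 100 symbols over 9 positions, and a claimed 100M cost advantage for scrypt); the variable names are made up for illustration.

```python
# Keyspace sizes as assumed in the comment above.
weak_keyspace = 36 ** 6      # lowercase letters + digits, 6 positions
strong_keyspace = 100 ** 9   # roughly 100 printable characters, 9 positions

ratio = strong_keyspace / weak_keyspace
print(f"weak:   {weak_keyspace:,}")
print(f"strong: {strong_keyspace:,}")
print(f"ratio:  {ratio:,.0f}")

# Assumed per-guess cost advantage of scrypt over MD5 (the 100M claim).
scrypt_advantage = 100_000_000
print("scrypt advantage covers the gap:", scrypt_advantage >= ratio)
```

With these assumptions the ratio comes out larger than 100M, which is the point being made.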
And the ratio matters, but less so as computers get faster. Option B may take about a million times as long as Option A, but if Option A takes 1 microsecond, there goes your Option B as well.
Oops, you're right, I got the math wrong when I looked at the table in my paper. I should have said that scrypt can be over 100,000,000,000 times stronger. ;-)
I don't really keep up with that game (like WoW, it seems like a fun game, but only if you are willing to put in a lot of your time), but I think the current limit is somewhere around 8 or 9 characters if you are pulling from all printables, meaning that "H.*W8Jz&r3" with MD5 is probably not breakable right now.
Take off two characters, or wait 3 years, and it probably will be.
But "password" is still grossly insecure in either case; it'll still be the first thing that someone performing a dictionary attack will try. Never tell people how good your key derivation function is, lest they misunderstand and think it means they don't have to choose a non-obvious password/passphrase.
Steube misunderstands the xkcd comic [1]. There's a really good comment which explains it: "It could be argued that Randall's example of 4 words is too short -- and indeed, for some applications, it is. However for a typical dictionary size, and genuinely random selection, it is massively stronger than "typical" passwords and in fact easily adequate to defeat the above-mentioned attacks." [2]
Emphasis on "genuinely random selection."
[2] http://www.schneier.com/blog/archives/2013/06/a_really_good_...
I think Schneier's suggestion of reducing it to the first letter of each word is vastly preferable because it packs the majority of entropy from random word selection into the least amount of typing.
salmonellaeater is right, Steube misunderstands the comic. The idea of the comic is to pick a small random selection of the 250,000 distinct words in an Oxford dictionary, rather than 8 of the 95 ASCII printable characters. A selection of 3 words then has higher entropy than 8 random characters, because 250,000^3 is a bigger number than 95^8. The question then is, will 3 random words really be easier to remember than 8 ASCII printable characters?
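A small Python sketch of that comparison, assuming (as the comment does) a 250,000-word dictionary, 95 printable characters, and uniformly random selection in both cases:

```python
import math

# Both figures come from the comment above; selection is assumed uniform.
word_passphrase = 250_000 ** 3   # 3 words from a 250,000-word dictionary
char_password = 95 ** 8          # 8 characters from 95 printable ASCII

print(f"3 random words: {math.log2(word_passphrase):.1f} bits")
print(f"8 random chars: {math.log2(char_password):.1f} bits")
print("words win:", word_passphrase > char_password)
```

The gap is only about a bit, so the memorability question is really the one that decides it.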
The downside to the Schneier scheme is that each password starts from a common sentence (low entropy), with a chosen transformation algorithm added. Thus the quality of the password will depend on the number of transformation algorithms, and the quality of each one. If we are to use the one first described to create "tlpWENT2m", we get a password strength like:
Taking strictly the first letter of each word gives only a 2x linear increase in the search space over just searching for common sentences. Replacing any occurrence of a word with a common number substitute adds a 0-2x increase. Writing one of the words in all caps means a 6x increase. Combined, "tlpWENT2m" is slightly less secure than "This little piggy went to market" plus two random digits, or a single letter, at the end.
There's a list of 7776 words; everyone knows what words are on the list. I suspect that sometimes people re-roll because they don't like a word or don't think they'll remember it. But I don't think that makes much difference.
For example: "and in the swept plains of winter's vale, our hero did beseech the emperor to send for his forces" -- what would be the difficulty in cracking that, given that this isn't a quote from a book or anything, but just a sentence that popped into my mind and seems easy enough to remember?
Take a list of 6^5 words. Roll 5 dice. Take that word from the list. Do this 4 more times. You now have a five-word passphrase like "moire fraud 80 row bernet".
Even if someone knew the exact method and list you did to get that passphrase, there are 28430288029929701376 combinations, giving you over 64 bits of entropy.
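Here is a rough Python sketch of that dice procedure, with the standard `secrets` module standing in for physical dice. The word list below is a made-up placeholder (`word0000` ... `word7775`); a real run would load an actual 7776-word list, and the function names are only illustrative.

```python
import math
import secrets

DICE_PER_WORD = 5   # five dice pick one word
WORD_COUNT = 5      # five words per passphrase

def roll_index(dice=DICE_PER_WORD):
    """Simulate `dice` six-sided rolls and read them as one base-6 number."""
    index = 0
    for _ in range(dice):
        index = index * 6 + secrets.randbelow(6)
    return index

def make_passphrase(wordlist, word_count=WORD_COUNT):
    """Pick `word_count` words, each chosen by a fresh set of dice rolls."""
    if len(wordlist) != 6 ** DICE_PER_WORD:
        raise ValueError("word list must have exactly 6^5 = 7776 entries")
    return " ".join(wordlist[roll_index()] for _ in range(word_count))

# Placeholder list for illustration only, not a real word list.
placeholder_list = [f"word{i:04d}" for i in range(6 ** DICE_PER_WORD)]

combinations = len(placeholder_list) ** WORD_COUNT
print(make_passphrase(placeholder_list))
print(f"{combinations} combinations, {math.log2(combinations):.2f} bits")
```

The entropy figure depends only on the list size and the number of words, which is why it holds even when the attacker knows the list and the method.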
Someone has probably tried to rainbow table all those results for MD5. If a core can do 1 billion hashes per second, it would take 900 core-years to build a complete list of all those combinations, which is probably feasible for a small group to put together, but messing with the list just a little bit or adding a 6th word would likely put you past that even with crappy MD5 hashing.
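The core-years estimate can be reproduced with a few lines of Python. The 1 billion hashes per second per core is the assumption from the comment above, and a 365.25-day year is my own choice for the conversion.

```python
# Figures assumed in the comment above.
combinations = 7776 ** 5                 # five words from a 7776-word list
hashes_per_second = 1_000_000_000        # assumed speed of one core
seconds_per_year = 365.25 * 24 * 3600    # assumed year length

core_years = combinations / hashes_per_second / seconds_per_year
print(f"five words: about {core_years:,.0f} core-years")

# Each extra word multiplies the work by the size of the list.
six_word_core_years = core_years * 7776
print(f"six words:  about {six_word_core_years:,.0f} core-years")
```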
If you won't ever find "gonefishing1125" using brute force, how on earth did they find "qeadzcwrstxv1331"?
I imagine there are a whole bunch of these geometric patterns, and different combos of them are tried.
What irks me is that every OS in use today has support for strong cryptography and browser vendors could easily integrate that. We would no longer register for a website, we would simply upload our "Online Identity" or whatever we called it. This of course is just an id_rsa.pub with maybe name and email in the comment. The remote site stores the public key and the browser authenticates using the private key, stored securely in the keychain.
This has the potential to be invisible to users, and thus used by default, and highly secure since the local keychain can generate incredibly strong keys, all behind the scenes.
Like SSL client certificates?
I'm 100% with you, it would be a major step forward - but it's too inflexible for Joe & Jane.
A good algorithm would take n bits and map them uniquely to a set of strings that are easy to remember for a human. The apg utility does something like that.
I'm creating an online system that will store users' sensitive financial data. When setting up an account, the user will have to choose a password as normal, but will also be given a passphrase of the form "correct horse battery staple" that they must write down. To log in, the user will need to enter (a) username; (b) password; and (c) passphrase.
It is effectively a poor man's two-factor authentication - the second factor being the piece of paper containing the passphrase. I think it strikes a good balance between security, convenience and cost.
What do others think of this approach?
Authentication devices for TFA are designed so that you really have to have the device with you when you log in.
Due to lock-in effects, people have to deal with all manner of usability hell from their bank, but the same logic doesn't apply to startups. Not that your idea is usability hell, but you probably don't want to make it any harder than it needs to be.
I think adding a few characters to the minimum password would be equally secure and more consistent with tooling, as well as a more familiar model for users.
Also, 2FA might be easier than you think using a service like Twilio. Or another way to do it would be to let the user connect via a service that does support 2FA (e.g. Google or Twitter, maybe adding your own password on top if you want to harden that).
Do you mean saying to the user, "your password must be at least 12 characters long"? That would just result in the user adding "12345" to the end of their standard password. Still seems much easier to crack than 4 random words.
> Also, 2FA might be easier than you think using a service like Twilio.
It might be easy for me to set up, but for my users (who are mostly non-technical) it is still relatively painful to install and set up a two-factor authentication app. I think most of my users would prefer the write-down-four-words option, even if it is a little less secure.
> you probably don't want to make it any harder than it needs to be
OK, so the question is, "does it need to be harder than the standard login form of username and password field?". Since my system deals with sensitive financial data, and given the problems with allowing users to pick their own password, I would say the answer is "Yes"
You should rate-limit login attempts on the live site. Even allowing only one login attempt per second kills any brute-forcing attack if your passwords have even mediocre complexity.
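A minimal in-memory sketch of that kind of rate limit in Python, assuming one allowed attempt per second per account. The class and parameter names are hypothetical, and a real site would keep this state somewhere shared and persistent and would probably also limit per IP address.

```python
import time

class LoginRateLimiter:
    """Allow at most one login attempt per `min_interval` seconds per account."""

    def __init__(self, min_interval=1.0, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock            # injectable so the limiter can be tested
        self._last_attempt = {}       # username -> time of last allowed attempt

    def allow(self, username):
        """Return True if this attempt may proceed, False if it is too soon."""
        now = self.clock()
        last = self._last_attempt.get(username)
        if last is not None and now - last < self.min_interval:
            return False
        self._last_attempt[username] = now
        return True

# At one guess per second, even a small keyspace takes a long time online.
years_to_exhaust = 36 ** 6 / (365.25 * 24 * 3600)
print(f"36^6 candidates at 1 guess/second: about {years_to_exhaust:.0f} years")
```

In this sketch a rejected attempt does not reset the timer; whether it should is a design choice.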
Password cracking is only really a threat if the bad guys get your database. And if they do, it's not much more difficult to crack two passwords than one.
The point of true two-factor is that the second "password" which comes from the device is never stored in your DB, so it cannot be cracked. That is not true of your approach.
You say, "it's not much more difficult to crack two passwords than one" but I don't see how that is the case if the second password is four words chosen at random from a dictionary of say 5000 words. Such a password is far more difficult to crack than the average password chosen by the average user. Having a second passphrase generated by a computer also eliminates the problem of users re-using the same password between sites, or choosing "letmein" or "password1" as a password.
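To put a number on it, here is a short Python sketch using the figures from the comment above (four words, a 5000-word dictionary, uniformly random selection):

```python
import math

wordlist_size = 5000   # dictionary size assumed in the comment above
words = 4              # words per generated passphrase

passphrase_keyspace = wordlist_size ** words
bits = math.log2(passphrase_keyspace)
print(f"{passphrase_keyspace:,} combinations, about {bits:.1f} bits")
```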
However, I'd be careful about thinking of it as any sort of 2-factor authentication and wouldn't bestow any of the advantages of 2-factor auth on your scheme.
A static secret, no matter how complex, doesn't really prove ownership because multiple people can trivially have a copy of the secret at the same time. So you don't have a knowledge and a physical factor, just a convoluted knowledge factor.
Better than just a password, but don't let it give you a false sense of security.
The first has a key-space of 36^12 (36 possible characters in each of 12 positions), or about 4.7e18. The second has a key-space of 62^9 (62 upper/lower case letters and digits in each of 9 possible locations), or about 1.4e16.
If, in addition to adding the uppercase letters, you added the possibility of needing to test symbols, such as ~`.,/:;!@#$%^&*-=_+ (another 19 symbols), and changed the latter password to "tlpW#NT2m", then the searchable key-space for all 9-character passwords becomes 81^9, or about 1.5e17.
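The three key-space figures can be checked with a short Python sketch. The alphabet sizes and lengths are the ones given in the comments above; the labels are only for readability.

```python
import math

# Alphabet size ** length for each case described above.
keyspaces = {
    "36 symbols, 12 positions": 36 ** 12,
    "62 symbols, 9 positions": 62 ** 9,
    "81 symbols, 9 positions": 81 ** 9,
}

for label, size in keyspaces.items():
    print(f"{label}: {size:.1e} candidates, {math.log2(size):.1f} bits")
```

Note that this only measures exhaustive search over the whole alphabet; it says nothing about how a dictionary or combinator attack would fare.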
RE-EDIT: Sorry. I should have read the article first. I'm not sure why the latter would be more secure. Obviously "WENT" would be in a dictionary, so I'd think that "tlpWENT2m" would fall to a combinator attack very quickly, too.