The whole point of salting is to reduce an attacker to checking one possible user/password pair at a time. You don't get this benefit if you use the method described in this article.
The only attacker I can see this defending against is an attacker who has a precomputed table of password MD5s and doesn't have (hasn't bothered to create) a precomputed table of password-repeated-twice MD5s. So technically the method described is slightly more secure than simply storing (user, MD5(password)) pairs, since it avoids the "google to reverse MD5 hashes" pseudo-attack. But if you're going that low, you're pretty much lost already.
Don't do this.
Assuming a scenario of using a single MD5 hash action on a user password, and a user with a password of "password1", no salt means that the simplest dictionary + number rainbow attack will find your password early on.
Add in a randomly generated multi-character salt, and now you'd need a 100% coverage rainbow file for 12 or 16+ character/byte inputs, which even with a single MD5 hash is, I believe, still well beyond the abilities of a modern compute farm.
So while I don't recommend using MD5 anymore for password hashing, a randomized multi-character salt still makes the difference between easily hackable, and virtually impossible to extract the password.
Or am I missing something?
You're absolutely right that a slow password hash should be used -- I left that part out because I wanted to make the non-saltiness of the "salt" clear without confusing people by talking about all the other things that they could do wrong.
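For anyone wondering what a "slow password hash" might look like in practice, here is a minimal sketch using PBKDF2 from Python's standard library. PBKDF2 is my substitution for illustration, not something the article or this thread prescribes; the iteration count and the function names `hash_password` and `verify_password` are assumptions.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Assumed work factor; tune it so one hash takes a noticeable
# fraction of a second on your own hardware.
ITERATIONS = 200_000


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for storage alongside the username."""
    salt = os.urandom(16)  # fresh random salt for every stored password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, expected)
```

The salt makes every guess specific to one user, and the iteration count makes every guess expensive. The two properties are independent, which is why leaving out either one hurts.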
If an article like this makes even a couple of neurons in your head light up, you have better things to do than reinvent the password storage wheel. Use somebody else's good system. This is almost security 101.
I agree, in the sense that people don't spend a lot of time thinking about breathing. We just do it. Grab 256 bits from /dev/random (or from your internal entropy pool, if you have one) and stick a salt or nonce onto every password you hash and every protocol packet you send.
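As a rough sketch of the "grab 256 bits and stick a salt on it" habit, assuming Python and its `secrets` module as the entropy source: the single SHA-256 call below is only there to show the effect of the salt, and `salted_digest` is a hypothetical helper name. It is not a suggestion to use one fast hash round for real storage.

```python
import hashlib
import secrets


def salted_digest(password: str, salt: bytes) -> str:
    # One fast hash round, purely to show what the salt changes.
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


# 32 bytes = 256 bits from the operating system's CSPRNG.
salt_for_alice = secrets.token_bytes(32)
salt_for_bob = secrets.token_bytes(32)

# Same password, different salts: the stored values no longer match,
# so one precomputed table cannot cover both users.
alice_hash = salted_digest("password1", salt_for_alice)
bob_hash = salted_digest("password1", salt_for_bob)
```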
Of course, most of them wouldn't waste their time redesigning a password storage system; they'd use Mazieres/Provos "bcrypt" and be done with it.
For a straightforward "users logging into a website" scenario, absolutely. But most people who design authentication systems have more stringent requirements, like "allow users to authenticate themselves in such a way that a bogus server can't steal their credentials", at which point things get a bit more interesting. :-)
At worst, you're going to take an attack that was available only to a person with a botted cluster of machines and make it accessible to a single guy with a laptop. Meh.
http://bcrypt-ruby.rubyforge.org/
It even handles the salting for you, so you don't have to think about it.
This is combined with another salt I keep in my codebase.
If the data in the db gets out, they still do not have the salt stored in my codebase (unless the entire server has been compromised).
Right or wrong? Any thoughts?
a) user enters username/password
b) server looks up in a file the unique string and appends it to the password
c) server finds username's salted-password hash
d) server extracts salt from stored salted-password hash (possibly the first 4 characters or whatever)
e) server concatenates salt, plaintext password and unique string and hashes via sha-N/md5
f) server compares to stored salted-password and if same performs login. otherwise sends generic "invalid username/password combination" message.
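Steps a) through f) can be sketched roughly as follows. This assumes the stored record is the salt followed by the hash, that the salt is the first 8 hex characters, and that the "unique string" is a site-wide secret loaded from a file; `PEPPER`, `SALT_HEX_LEN`, `make_record` and `check_login` are all hypothetical names. A single SHA-256 round is used only because the steps say "sha-N/md5".

```python
import hashlib
import hmac
import os

PEPPER = "unique-string-from-a-file"  # hypothetical site-wide secret (step b)
SALT_HEX_LEN = 8  # assumed: the first 8 hex characters of a record are the salt


def _digest(salt: str, password: str) -> str:
    # Step e: concatenate salt, plaintext password and unique string, then hash.
    return hashlib.sha256((salt + password + PEPPER).encode("utf-8")).hexdigest()


def make_record(password: str) -> str:
    """Build the value stored at registration: salt followed by hash."""
    salt = os.urandom(SALT_HEX_LEN // 2).hex()
    return salt + _digest(salt, password)


def check_login(users: dict, username: str, password: str) -> bool:
    record = users.get(username)  # step c: find the stored record
    if record is None:
        return False  # caller shows the same generic message either way
    salt = record[:SALT_HEX_LEN]  # step d: pull the salt off the front
    candidate = salt + _digest(salt, password)
    return hmac.compare_digest(candidate, record)  # step f: constant-time compare
```

Usage would look like `users = {"jem": make_record("password1")}` followed by `check_login(users, "jem", "password1")`. The secret string only helps if the database leaks without the file that holds it.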
Then why not salt your hashes using some kind of lookup table on the password? With a bigger lookup table, you have a better chance of having a unique salt for most of your users. In fact, the bigger the table the better. But that takes up space, so why not use a function? But what kind of function? It should be cryptographically secure. Wait, I know, a hash function would be perfect!
Basically, this is just adding another gimpy home-grown round to your hash function. It will make the attacker's job slightly harder, but as others have pointed out, you can still match any password from the file.
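To make the objection concrete, here is a small sketch of what goes wrong when the "salt" is computed from the password itself. The exact scheme in the article may differ; `derived_salt_hash` is a hypothetical stand-in that takes MD5 of the password as the salt.

```python
import hashlib


def derived_salt_hash(password: str) -> str:
    pw = password.encode("utf-8")
    # Assumed scheme: the "salt" is itself a function of the password.
    salt = hashlib.md5(pw).digest()
    return hashlib.md5(salt + pw).hexdigest()


# Two different users who happen to pick the same password.
alice = derived_salt_hash("password1")
bob = derived_salt_hash("password1")

# The stored values are identical, so one guess still tests every user at once.
same_for_everyone = alice == bob
```

Because nothing in the computation is specific to the user, an attacker can hash each candidate password once and compare it against the whole file.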
I conclude that the best thing to do is to use a hash of the username as the password's salt.
Given that usernames are public - you know, for instance, that mine is jem - what level of security would that provide?
Hashed passwords are typically attacked using a rainbow file. Basically you hash each word in a dictionary file and store the plaintext word and the resulting hash together in a file/db/whatever. After you've hashed every word in the dictionary, you start doing combinations: wordword, wordnumber, wordnumberword, etc... Of course each time you add another element, you're exponentially increasing the time it takes to generate your rainbow file.
Using a multi-character salt is basically adding another element to the password on behalf of the end user. Basically if we say it takes 100 hours (totally made up number, I suspect it's much higher, depending on password rules) to generate a comprehensive enough rainbow file to find an average password, if you add a 1 character alphanumeric salt (36 possible values), it now takes 3,600 hours to find that password. If you add 2 characters, it now takes 129,600 hours. And so on.
Using a public username (and having your attackers know that's what you're using) means that you have to generate a full rainbow file for EACH user. So while it's not as good as a decent sized random salt, you can no longer take 100 hours of work and hope to extract 8,000 passwords from a site's password db/file. You have to do 100 hours of work PER user, and still have a 30-50% success rate. So to get the same 8,000 passwords you'd need to do 1,600,000 hours of work (assuming a 50% hit rate).
So assuming you personally are not a specific target, the site itself is numberOfUsers times more time consuming to extract X passwords from than a site that just uses MD5 with no salt.
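The arithmetic in the comment above can be checked with a few lines. The 100-hour baseline is the commenter's made-up figure, and the 36-symbol alphabet (a-z plus 0-9) is my assumption, chosen because it reproduces the 3,600 and 129,600 numbers; the function names are hypothetical.

```python
BASE_HOURS = 100  # made-up cost of one comprehensive table, from the comment
ALPHABET = 36     # assumed: a-z plus 0-9, case-insensitive


def hours_with_salt(salt_chars: int) -> int:
    """Cost of covering every possible salt of the given length."""
    return BASE_HOURS * ALPHABET ** salt_chars


def hours_with_per_user_salt(target_passwords: int, hit_rate: float) -> int:
    """Cost when each user needs their own table (e.g. username as salt)."""
    users_to_attack = round(target_passwords / hit_rate)
    return users_to_attack * BASE_HOURS


one_char = hours_with_salt(1)                       # 3,600 hours
two_chars = hours_with_salt(2)                      # 129,600 hours
per_user = hours_with_per_user_salt(8_000, 0.5)     # 1,600,000 hours
```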
The problem with this is you have created a pattern which could (and probably does) introduce a cryptographic weakness. If someone knew or could guess you were using this tactic then they might be able to exploit it.
While I don’t have an example of a specific weakness in MD5 to hand, one of the basic rules of exploiting algorithms is knowing some of the source material or knowing about patterns within it.
The point of using a salt is to use a piece of unknown material which is unique in each string. It’s common to generate random junk to use rather than anything meaningful.
Personally I always take "new insight" into security tactics with a pinch of salt until they have been hammered on by a few people.
find rainbow table with a char space of
abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789!@#$%^&*()_+=-';:"`
you now have a working hash map of 27-80 GB of data.
upon looking through the user/pass table you see that the user/pass table has a salt key for each record. at this point you know that some portion of the passwords will be reversible with the salt being a confirmation of finding the correct password.
i know some of the rainbow tables out there are really fast at looking up a reverse md5 or sha1.
when you look up a pass in the rainbow table you will at least find a collision, and if you are lucky, you will find the pass concatenated to the salt entry in the table.
so if a hacker has your user/pass table, yikes.... ;-/ you are better off restricting read/write privileges to separate services. i would hate for a hacker to connect to your DB and to write their own salt and pass.
we were working on this for one of the defcon games last year.
NEVER SUBSTITUTE OBFUSCATION FOR REAL SECURITY.
fighting rainbow tables requires larger hash functions that generate larger hashspaces that are less likely to fit on a single machine. i still wonder how long it will take for a proper cluster of GPUs to virtualize the hash space of a rainbow table into computational power.
$password = hash_hmac('sha256', $_POST['password'], "pound_on_your_keyboard_here");
256 bits of hashed goodness. Good luck breaking that with a rainbow table.
I've always assumed anything is safe since it's getting hashed anyway before hitting the db, but this got me wondering.