Proper password and session management has been readily accessible for many years if you use any "modern" backend.
Something I once saw in production:
An otherwise normal usage of bcrypt that "prehashed" the password with sha512 before passing the resulting hash to bcrypt. This is conceptually reasonable (although not very useful) for reasons I don't think are worth going into, but it had two critical problems.
1. It technically broke bcrypt's side-channel protection against timing attacks (bcrypt runs in constant time, while sha512 does not).
2. The much worse issue was that the binary hash was being passed to bcrypt. If bcrypt encounters a null byte in the input, it considers that to be the end of the input and ignores everything after, C-string style. Every byte of the binary hash has a 1/256 chance of being a null byte, which means 1 in 256 passwords results in a hash whose first byte is null, and bcrypt will accept any of those passwords as equivalent to the others (each is effectively the empty string). You could log in to 0.4% of accounts by guessing ~256 completely random passwords (and a larger number of passwords were similarly but less extremely weakened).
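To make the second problem concrete, here is a minimal Python sketch. It does not call a real bcrypt library: `c_string_view` is a hypothetical helper that only mimics an implementation that stops reading at the first NUL byte, and the `guess-N` candidate passwords are made up for illustration.

```python
import hashlib
from itertools import count


def c_string_view(data: bytes) -> bytes:
    """Mimic a C-string API: keep only the bytes before the first NUL."""
    return data.split(b"\x00", 1)[0]


def effective_bcrypt_input(password: str) -> bytes:
    """What a NUL-terminating bcrypt would see after a raw SHA-512 prehash."""
    digest = hashlib.sha512(password.encode("utf-8")).digest()  # 64 raw bytes
    return c_string_view(digest)


def find_empty_input_passwords(how_many: int) -> list:
    """Search made-up candidates for ones whose digest starts with a NUL byte."""
    found = []
    for i in count():
        candidate = f"guess-{i}"
        if effective_bcrypt_input(candidate) == b"":
            found.append(candidate)
            if len(found) == how_many:
                return found


first, second = find_empty_input_passwords(2)
# Two unrelated passwords, yet both would reach bcrypt as the empty string.
print(first, second, effective_bcrypt_input(first) == effective_bcrypt_input(second))
```

Each candidate has roughly a 1/256 chance of landing in this bucket, so the search only needs a few hundred attempts on average.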
You can say "well that's their fault for mucking with the input to bcrypt" but I bet most people would believe that kind of operation was safe before it's explained to them why it's not. It's very easy to accidentally destroy the security of a crypto scheme without knowing it, all while thinking that what you're doing is perfectly run-of-the-mill and safe.
And then I'm reminded of the bug that was hidden in plain sight for 20 years in the binary search implementation in Programming Pearls[1]. The same algorithm was copied into Java and elsewhere, nearly verbatim. So, yeah, maybe don't roll your own searching either. The experts have enough difficulty.
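For anyone who hasn't seen it, the bug was integer overflow when computing the midpoint as (low + high) / 2. Python integers don't overflow, so this sketch uses a hypothetical `to_int32` helper to imitate 32-bit signed arithmetic; the index values are picked purely for illustration.

```python
def to_int32(value: int) -> int:
    """Wrap an integer the way 32-bit signed two's-complement arithmetic would."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value >= 0x8000_0000 else value


def midpoint_buggy(low: int, high: int) -> int:
    """The classic form: low + high can exceed 2**31 - 1 and wrap negative."""
    return to_int32(low + high) // 2


def midpoint_safe(low: int, high: int) -> int:
    """Overflow-free form: high - low always fits when 0 <= low <= high."""
    return to_int32(low + to_int32(high - low) // 2)


low, high = 1_500_000_000, 2_000_000_000  # both are valid int32 indices
print(midpoint_buggy(low, high))  # negative: the sum wrapped around
print(midpoint_safe(low, high))   # 1750000000
```

The buggy version only misbehaves once the array has more than about a billion elements, which is a big part of why it went unnoticed for so long.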
Frankly, I don't trust most programmers to get anything right. People toss around "battle-tested" libraries and we all just take it for granted that that's what they are. But I recently found out one incredibly popular package for one incredibly popular language is doing something fairly dumb. Not insecure (that I can tell). Just simply unnecessary. They are signing session cookies. For reasons I could not fathom. Signing doesn't prevent cookie theft. It's just another layer that someone thought would add security over just sending a random string (hopefully your system/language's random generator is actually random). And that's how a lot of these mistakes happen. Let's add sha512 on top of our crypto. Because more = better, right?
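As a rough illustration of the "just a random string" approach, Python's standard `secrets` module draws from the operating system's CSPRNG. The in-memory `SESSIONS` dict and the function names below are hypothetical stand-ins for a real session store, not anything from the package I'm talking about.

```python
import secrets
from typing import Dict, Optional

SESSIONS: Dict[str, str] = {}  # hypothetical store: session ID -> user ID


def create_session(user_id: str) -> str:
    """Issue an unguessable session ID (32 random bytes, URL-safe encoded)."""
    session_id = secrets.token_urlsafe(32)
    SESSIONS[session_id] = user_id
    return session_id


def lookup_session(session_id: str) -> Optional[str]:
    """Return the user for a session ID, or None if it is unknown."""
    return SESSIONS.get(session_id)
```

The server-side lookup is what makes the cookie valid here; a forged or guessed value simply isn't in the store.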
Speaking of random, there was another popular package out there for generating UUIDs that has a GitHub issue where someone discovered the package was generating collisions. Yikes.
[1] https://ai.googleblog.com/2006/06/extra-extra-read-all-about...
However, lest anyone get the wrong impression, password hashing is a very simple, straightforward, solved problem and has been for years.
One person doing something monumentally stupid doesn't make Auth some kind of cryptic minefield.
I would wager that most people reading this, if earnestly asked by a coworker "do you think it's fine for me to hash a password before passing it to bcrypt? I want to be able to support passwords over 72 characters and bcrypt truncates its input." would answer something along the lines of "I don't see how it could hurt" rather than "that's dangerous because a binary hash would result in a large portion of the passwords being hashed as an empty string".
The engineers that originally implemented and reviewed this were not idiots, they just weren't security experts.
Try asking the following question on Twitter:
"While passing binary data, which example is safer for storing passwords?
[ ] $password | sha256 | bcrypt
[ ] $password | bcrypt"
What do you think the average programmer would say? Without knowing this specific thing, most would probably say either that they are equal or that the first one is safer, because most people don't implement their own password hashing; they use library- or framework-provided methods that have already been established as best practice.
But you can't really blame them: the difference looks marginal and innocent on the surface, and you only see the holes once you understand the implementation.
The yearly report of leaks in Fortune 500 companies should be proof enough of this.
EDIT: To elaborate. The crypto scheme is only one tiny facet of a successful authentication solution. Where do you store the hash? What language and stack are you using? What is the maturity of libraries available to you? What protocols? And many more seemingly tiny decisions. All it takes is a lazy developer who imports an insecure transitive dependency or snoozes on a CVE.
What makes you think this foolish developer would not do the exact same thing before sending the data to an externally hosted auth service?
https://dropbox.tech/security/how-dropbox-securely-stores-yo...
> First, the plaintext password is transformed into a hash value using SHA512. This addresses two particular issues with bcrypt. Some implementations of bcrypt truncate the input to 72 bytes, which reduces the entropy of the passwords. Other implementations don’t truncate the input and are therefore vulnerable to DoS attacks because they allow the input of arbitrarily long passwords. By applying SHA, we can quickly convert really long passwords into a fixed length 512 bit value, solving both problems.
> Next, this SHA512 hash is hashed again using bcrypt with a cost of 10, and a unique, per-user salt. Unlike cryptographic hash functions like SHA, bcrypt is designed to be slow and hard to speed up via custom hardware and GPUs. A work factor of 10 translates into roughly 100ms for all these steps on our servers.
> Finally, the resulting bcrypt hash is encrypted with AES256 using a secret key (common to all hashes) that we refer to as a pepper. The pepper is a defense in depth measure. The pepper value is stored separately in a manner that makes it difficult to discover by an attacker (i.e. not in a database table). As a result, if only the password storage is compromised, the password hashes are encrypted and of no use to an attacker.
If you consider the string copy that happens when the unhashed password is copied into the block to be variable time, you probably shouldn't count that against SHA-512, because there will be variable-length string operations before bcrypt starts its hashing too. Not to mention parsing it out of HTTP parameters or reading it from a GUI control. (If you really care about timing observations, perhaps you'd pre-hash the password on the client so that what is transmitted is fixed-length.)
Point 2 is of course a real problem. Though to be honest I've always regarded the NUL check and (sometimes, implementation-dependent) maximum length as unnecessary, arbitrary flaws in bcrypt - so bcrypt gets the blame. On grounds of purity and doing the right thing, I refuse to use raw bcrypt, and in fact I use: SHA-512 -> Base64 -> bcrypt when password hashing.
Does that make me wrong? Only if Dropbox is wrong (I hope they just forgot to mention Base64 in their old security article.)
I'm aware of this, and it's why there's a "technically" in my original point. Of course in practice it makes no difference, and I consider the better use of bcrypt's input space when prehashing to outweigh leaking a pretty useless amount of side-channel information for extraordinarily long passwords.
Though as a follow-up anecdote, one of the engineers there had set a 300kB password to test the prehashing implementation, and just continued to use it (via password manager). He was potentially leaking a bit of timing information. I'm not even sure if a password that long made a timing difference that was detectable through network jitter, but if it did and some hypothetical attacker was able to learn that his password was hundreds of kilobytes long, well... good luck.
> in fact I use: SHA-512 -> Base64 -> bcrypt when password hashing. Does that make me wrong?
No, using a base64 representation of the sha hash makes your scheme perfectly reasonable. Also note that the base64 representation of sha512 is 88 characters, a bit longer than the typical 72-byte bcrypt input length, so an implementation that truncates only uses the first 432 bits of your sha hash and you're shrinking the collision space somewhat. Certainly not enough that it would matter at all, just a point I find interesting. Hypothetically, a binary representation with a pass that removed or replaced NUL bytes would preserve more entropy, but we're well past the point where it matters, and adding that complexity would be playing with fire.
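Here is a small Python sketch of the SHA-512 -> Base64 step, mainly to show the lengths involved. The bcrypt call itself is left out because it needs a third-party library; `prehash_for_bcrypt` and `BCRYPT_MAX_INPUT_BYTES` are names I made up, and the 72-byte limit is an assumption about the particular bcrypt implementation in use.

```python
import base64
import hashlib

BCRYPT_MAX_INPUT_BYTES = 72  # assumed limit; check your bcrypt implementation


def prehash_for_bcrypt(password: str) -> bytes:
    """SHA-512 the password, then Base64-encode so the result has no NUL bytes."""
    digest = hashlib.sha512(password.encode("utf-8")).digest()  # 64 raw bytes
    return base64.b64encode(digest)  # 88 ASCII bytes, including "==" padding


encoded = prehash_for_bcrypt("correct horse battery staple")
used = encoded[:BCRYPT_MAX_INPUT_BYTES]  # what a truncating bcrypt would consume

print(len(encoded))        # 88
print(b"\x00" in encoded)  # False
print(len(used) * 6)       # 432 bits of the 512-bit digest survive
```

Each Base64 character carries 6 bits, so 72 characters still cover 432 bits of the digest, far more than any realistic password contains.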
Good luck to you!