Obligatory statement on NEVER USING SHA-1 HASHES to make passwords "safe".
Any normal person can brute force millions of SHA-1 hashes (salted however much you want) per second on a GPU.
If the FBI so wanted (although I don't believe they do) I'm sure they could brute force almost every single password in that database. Granted, it's the government and they have better ways of obtaining such information, but if there is someone the FBI is watching on Instapaper's databases and they so wanted, storing the SHA-1 hash of the password all but handed them over to the FBI.
I am now glad my Instapaper password was generated randomly, 16 characters long, and I will now change it just to be safe.
For anyone running a database which stores usernames and passwords, take a look at bcrypt or scrypt. They're millions (no, I am not exaggerating) of times better than SHA-1.
(Edit: Grammar)
So in this case, where the FBI is involved, using a SHA-1 hash poses no extra security vulnerability.
I imagine that many companies are better prepared to deal with the FBI than this data center was. I have a hard time imagining the FBI going into a Google data center and easily walking out with a few racks. But even if that's too optimistic, I doubt the FBI could go about seizing servers for very long. If nothing else, this would eventually piss off big companies who will lobby Congress to curtail the FBI.
meeeh...
remember the fbi is not a person, it's an organization. the org can have bad actors in it who might be able to access the encrypted passwords but not be able to confiscate servers.
also, confiscating a server(s) is much more visible / detectable...
Modern consumer video cards can do billions per second now. You might as well just store them in plaintext instead of using SHA1/MD5 with or without salting. :/
72^8 >> 1e9
It would still take more than 8 days to brute force at 1 billion/sec. And using a longer password (16 chars?) would make this a very long time.
Or is there some other trick that makes this fast? Or is it simply that people don't choose random, long passwords?
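The arithmetic above can be checked with a short sketch. The 72-symbol alphabet, the 8- and 16-character lengths, and the 1 billion guesses per second all come from the comments above; the function name is made up for illustration, and it assumes a worst-case exhaustive search over uniformly random passwords.

```python
SECONDS_PER_DAY = 86_400

def brute_force_days(alphabet_size: int, length: int, guesses_per_second: float) -> float:
    """Worst-case days needed to try every password of the given length."""
    keyspace = alphabet_size ** length
    return keyspace / guesses_per_second / SECONDS_PER_DAY

# 8 random characters from a 72-symbol alphabet at 1e9 guesses/sec.
print(f"8 chars:  {brute_force_days(72, 8, 1e9):.2f} days")

# 16 random characters at the same guessing rate, expressed in years.
print(f"16 chars: {brute_force_days(72, 16, 1e9) / 365:.3g} years")
```

This only covers passwords that really are random; it says nothing about dictionary words or reused passwords.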
Just use the bcrypt defaults. You will be fine. You will in particular be so much better off than salted SHA-1 that this topic will be mooted. Later on, maybe in 5-10 years, you can re-engage with the debate about what a good cost factor for bcrypt will be in 2020.
Takeaway: Cost to crack one MD5 password: $1. Cost to crack one scrypt password: $50M to $200B.
You want your login to be slow compared to the rest of your application. It's okay to take half a second to verify a login.
You usually add a salt (an additional string which is stored in the clear, but which makes your local instance globally unique), so the attacker can't precompute value-to-hash mappings ("Rainbow Tables" [which are faster to make if you have alien technology, from what I've heard]) for all sites.
I'd still suggest using bcrypt or scrypt.
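As a rough illustration of a salted, deliberately slow hash, here is a minimal sketch using `hashlib.scrypt` from the Python standard library (it needs a Python build linked against OpenSSL 1.1 or later). The function names and the cost parameters `n=2**14, r=8, p=1` are assumptions chosen for the example, not a recommendation from anyone in this thread; tune them so a login takes as long as you can tolerate.

```python
import hashlib
import hmac
import os

def hash_password(password: str):
    """Return (salt, digest) using a fresh random salt per password."""
    salt = os.urandom(16)  # stored in the clear next to the digest
    digest = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1, dklen=32)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1, dklen=32)
    return hmac.compare_digest(candidate, digest)

salt_a, digest_a = hash_password("hunter2")
salt_b, digest_b = hash_password("hunter2")
print(digest_a != digest_b)                          # same password, different salts
print(verify_password("hunter2", salt_a, digest_a))  # correct password verifies
```

The salt is what stops precomputed tables from working across sites; the cost parameters are what make each individual guess expensive.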
1. The next time a user logs in to your system and you verify against the SHA-1 hash that they are who they say they are, recompute the correct hash for bcrypt. Then, delete the SHA-1 hash. It does you no good to have a bcrypt version if you keep the SHA-1 version around.
2. Generate the bcrypt hash from the SHA-1 hash. That is, pretend that the SHA-1 hash is the user's password. This isn't as clean (your password authentication software will then have to do SHA-1 followed by bcrypt) but it means you'll be able to migrate your entire database all at once if you so choose. This also causes a very (very, very) slightly higher chance of password collisions, although there's not much to worry about from that.
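Both options above can be sketched roughly as follows. This is an illustration only: bcrypt is not in the Python standard library, so `hashlib.scrypt` stands in as the slow hash, and the record layout (`sha1`, `sha1_salt`, `salt`, `slow`, `wrapped`) and all function names are hypothetical.

```python
import hashlib
import hmac
import os

def slow_hash(secret: bytes, salt: bytes) -> bytes:
    # Stand-in for bcrypt; the cost parameters are assumptions.
    return hashlib.scrypt(secret, salt=salt, n=2**14, r=8, p=1, dklen=32)

def legacy_sha1(password: str, salt: bytes) -> bytes:
    # Assumed legacy scheme: SHA-1 over salt followed by password.
    return hashlib.sha1(salt + password.encode()).digest()

def login_and_upgrade(record: dict, password: str) -> bool:
    """Option 1: verify against SHA-1 once, then replace it with the slow hash."""
    if "sha1" in record:
        if not hmac.compare_digest(legacy_sha1(password, record["sha1_salt"]), record["sha1"]):
            return False
        record["salt"] = os.urandom(16)
        record["slow"] = slow_hash(password.encode(), record["salt"])
        del record["sha1"], record["sha1_salt"]  # keeping the SHA-1 hash defeats the point
        return True
    return hmac.compare_digest(slow_hash(password.encode(), record["salt"]), record["slow"])

def wrap_legacy(record: dict) -> None:
    """Option 2: treat the stored SHA-1 hash as the password; no login required."""
    record["salt"] = os.urandom(16)
    record["wrapped"] = slow_hash(record.pop("sha1"), record["salt"])

def login_wrapped(record: dict, password: str) -> bool:
    """Option 2 verification: SHA-1 first, then the slow hash on top."""
    inner = legacy_sha1(password, record["sha1_salt"])
    return hmac.compare_digest(slow_hash(inner, record["salt"]), record["wrapped"])
```

Option 2 has to keep `sha1_salt` around forever, since the SHA-1 step stays part of every login.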
(I didn't downmod you).
(Be prepared for your comment score to visit the grey depths if you attempt to relitigate Coda's blog post here and don't know exactly what you're talking about.)
"Any normal person can brute force millions of SHA-1 hashes (salted however much you want) per second on a GPU."
This is not true of bcrypt.
> Just store in plaintext because I am already assuming you are.
No, actually, I don't think I will store plaintext passwords.
> All the this talk about sha-1 vs bcrpyt vs scrpyt is nice and all but I have little faith that most companies care about this as much as HN does.
So what? Just because other people don't do it doesn't mean you don't have to also. Fortunately for us, there are a lot of startup founders here who might read this and learn something.
> I believe that most people are using the default password storage mechanism for their framework which are already known to be easy to break if the database is compromised.
I disagree. I think most people use SHA-1 because they know better than to store plaintext passwords. What they don't know is that it's terribly broken.
> But all of that is mute anyways.
No, it's really not.
> Unless you have access to the site's source how would you know if they are hashing at all much less which one they are using?
There are two problems here. (1) If you have access to the site's password database, there's a really good chance you have access to the entire database, and can look up how they're doing it. (2) Even if you can't look up how they're doing it, you just try them all and find which one it is. I'd bet you money that if someone's hashing passwords, they're using one of {MD4, MD5, SHA0, SHA1, SHA2, DES}. If, god forbid, they're not using one of those and actually wrote their own hashing algorithm, you have even more to worry about.
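To make point (2) concrete, here is a minimal sketch of "just try them all". It assumes the attacker has one account whose password they know (their own, say) along with its stored salt and hash. The candidate list is a subset of the algorithms named above that `hashlib` reliably provides, and the three salt layouts are guesses at common conventions; the function name is hypothetical.

```python
import hashlib

CANDIDATES = ["md5", "sha1", "sha256", "sha512"]

def guess_scheme(known_password: str, salt: bytes, stored_hex: str):
    """Return (algorithm, layout) reproducing stored_hex, or None if nothing matches."""
    pw = known_password.encode()
    layouts = {
        "password": pw,
        "salt+password": salt + pw,
        "password+salt": pw + salt,
    }
    for algo in CANDIDATES:
        for layout, data in layouts.items():
            if hashlib.new(algo, data).hexdigest() == stored_hex.lower():
                return algo, layout
    return None

# Simulate a site that stores SHA-1 over salt followed by password.
stored = hashlib.sha1(b"NaCl" + b"test1234").hexdigest()
print(guess_scheme("test1234", b"NaCl", stored))
```

The whole search is a dozen hash computations, which is why hiding the choice of algorithm buys essentially nothing.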
> The best practice is to use a random password for each site you use.
For sure, no doubt about it. But what we're talking about here is the best practice for application developers, not the users. The users can't do anything about how their password is stored.
> I just don't see any point in having an rememberable password for websites and hashing just leaves a false sense of security as illustrated by md5.
Or, you know, you could use bcrypt and be secure about it.
Suppose you were using a shared storage space (shared servers, or server farm) with several other dudes. One of them is a drug dealer. One day the police/FBI decide to raid the storage space since the drug dealer has been using it to store illegal drugs.
Is it not reasonable to consider this collateral damage (which, granted, is totally unnecessary) during law enforcement operations?
I'm not saying this is OK in any case, but might this not be a reasonable move by the law enforcement agencies?
If his servers are included in the warrant because they were suspected of housing whatever it is the FBI was after, and the court granted the FBI the right to seize them, then yeah, it's reasonable.
If he was sharing a physical machine with the bad guys, then yeah, sorry, that's collateral damage. However, if he was on his own separate leased machine, there is absolutely no reason for the FBI to seize it. It'd be like them executing a seizure warrant on one of those self-storage spaces, and seizing the contents of all the adjoining compartments (which the person being investigated would have had no access to) just because.
If the police have a warrant for my apartment, and you happen to leave your backpack and server, your stuff will most likely be confiscated, along with mine, if it interests the police.
They probably didn't have any way to know which machine it was, just which rack it was. They also probably didn't have to tell the hosting company directly, just the facility that they were raiding.
The problem is that with blade servers like DigitalOne provided, both of these things can be true at the same time.
I don't like to be in the position of defending the FBI (my own personal and professional relationship with them is complicated), but I think the following situation is plausible (which isn't to say it's what happened, as we don't know):
FBI determines the originating IP address of whatever their investigation is targeting (based on published information, it looks like a "scareware" operation).
FBI determines the IP address is "owned" by an overseas hosting provider, and that the physical servers are in a datacenter in the U.S.
FBI obtains a warrant for the seizure of all associated computing equipment (which may very well include the upstream devices used by the hosting provider).
FBI executes warrant at datacenter, sees that the servers are actually blades in a chassis; takes entire chassis (as reconstructing the data later on may require that the servers be bootable).
The very last forensic case I worked involved having to acquire evidence from a server which was hosting a web application by a hosting provider. This was a shared hosting scenario, so in addition to acquiring the targeted information, all other customers on the server were also effectively offline (as the server was being imaged, and later as the original hard drives were entered as evidence).
Now, obviously, that isn't the exact same situation as what is described here, but in the event that the servers were blades, I don't think it's outside the realm of possibility to think that the entire chassis would need to be retrieved.
If the FBI seized all the computer equipment in the entire building or even just the computers on the same floor as the targeted company but belonging to other companies who just happen to be physically adjacent to the targeted company, would it seem reasonable?
The picture painted here is that the FBI came in and hastily took a bunch of equipment without making sure they were taking the right stuff. If that is accurate, then it's likely they might have missed a server with data on it that they needed for their case. Moving quickly and causing collateral damage in a relatively safe environment where you actually have the time to triple check your work is inexcusable on all fronts.
if you're looking for a metaphor, think about a self-storage facility ([one of these places](http://www.moversandpackers.org/wp-content/uploads/2010/10/s...)). imagine you're renting one of those units, and somebody renting a unit on the other side of the yard is a drug dealer. the FBI comes in, and in the process of seizing the assets of the drug dealer across the yard, they also seize all the stuff in your storage unit. There is no way that is reasonable.
I didn’t own the hardware — I was leasing it from DigitalOne.
Why did they take an entire rack, instead of a few servers? I can think of a couple of potential reasons:
- VMs, which could potentially live on any physical server in a VM pool
- Insufficient information on which physical servers belong to their suspects
- They just don't trust the colo operators to not be involved, and thus don't want to limit the suspect data to the servers the operators provide.
While I wholly agree that it's unfortunate that Instapaper and Pinboard were affected, it's not an unexpected consequence of having your servers alongside (or on the same physical machines as) those of people you don't know.
If you want to say "tough luck that's just what it costs to collect evidence in 2011", fine, but it's probably not fair to say that the FBI should just naturally have that capability.
- Salted SHA-1 hashed passwords for Instapaper
- Encrypted passwords for linked Pinboard accounts (with the encryption key stored in the website code)
- OAuth tokens for linked Facebook/Twitter/Tumblr accounts (and presumably also the secret keys used by Instapaper to use those tokens).
That's (potentially) a lot of personal information.
Note that this isn't simply to keep prying eyes off my data; I live near an overdue earthquake fault line. When it does finally give, I should have a (slightly) better chance of the machine coming through intact.
How much data do Facebook's OAuth tokens contain? By looking at one, can you tell that it's linked to Pavel Lishin's account?
This could work by encrypting your database in a truecrypt volume that must be mounted by entering the password. Thus, the data is only ever saved on disk in encrypted form, and the key to access the data is not saved on the disk. Of course, it is still in principle possible for anyone to access that information if they have physical access to the computer while it's running, but at least this makes that much harder.
It's not feasible to run databases on encrypted block devices. Some databases let you encrypt certain tables or columns, though.
for example, my laptop disk's main partition is encrypted. i need to enter the password when i boot, but nothing terrible happens if i lose power or the system crashes or whatever.
I should note that I'm not disagreeing with you, I just think there are more important considerations to make before physical location of the data.
Furthermore, I'm not sure I'd want to host my data in a country where the police cannot pursue digital criminals.
Also, while I completely understand Instapaper's unwillingness to pursue this through the courts, that is the way our legal system is structured. If you believe you have been harmed in some way by a government action, the courts are the avenue through which you must obtain recourse.
(Not a lawyer, so if I'm wrong about any of the above please correct me.)
Again, though, the question isn't whether they had the right to seize the servers they had warrants for - they did, and you won't get that questioned by any court - but whether they did so properly, and it's not unheard of for a law enforcement agency to get slapped for overstepping their bounds. It's not common, but it's not unheard of, and it's not a 4th amendment issue either.
I have no idea whether I’ll ever see the server again
In this case the host probably doesn't know better than him. According to the NYTimes they are a Swiss company; they only rent space and connectivity from the data center. I see people jump up and down accusing their host of being a bad host when their websites go down for 10 minutes. The thing is, shit like this happens all the time. Some years ago even Rackspace was taken offline because a truck hit their data center. Bizarre, right? Yes, but it did happen.
Why haven't there been similar seizures of any larger corporate entities? Even if the current FBI practices are valid, should the application of those practices be a function of size/wealth/power? Which servers of Sony's were seized after distributing rootkits?
In my experience with our leased data center cages, we are expected to fly in to town if we ever need to physically manipulate the servers or even plug things in. The data center employees don't even go into the locked cages.
If the FBI forced open a locked cage, and did stuff in there, I would not expect anything to be addressed until DigitalOne showed up to fix it.
http://twitter.com/instapaper/status/84106275796946944
"As of 2 minutes ago, my DigitalOne server is back online. The logs indicate that it was off and not booted during the time it was missing."
I'm not sure I can deal with the possibility.
His Instapaper account was probably full of stories about Santa Monica.
Seems like Instapaper should change its private key for, say, Facebook.
When the user's password is verified, it could be used to unlock those tokens and store them in the active session structure in RAM. There'd still be some exposure, particularly in the case of being rooted, but an attacker couldn't just dump the database.
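A rough sketch of that idea, with loud caveats: it derives keys from the user's password with `hashlib.scrypt` and uses them to seal a stored token, so a database dump alone contains only ciphertext. To stay inside the standard library, the cipher here is an HMAC-SHA256 keystream with an encrypt-then-MAC tag, which is a stand-in I chose for illustration; a real system would use a vetted authenticated cipher from a crypto library instead. All names and the blob layout are hypothetical.

```python
import hashlib
import hmac
import os

def _derive_keys(password: str, salt: bytes):
    # One scrypt call yields two independent 32-byte keys (encryption, MAC).
    material = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1, dklen=64)
    return material[:32], material[32:]

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Illustrative stand-in cipher: HMAC-SHA256 run in counter mode.
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]

def lock_token(password: str, token: bytes) -> dict:
    """Seal a token so that only someone with the password can recover it."""
    salt, nonce = os.urandom(16), os.urandom(16)
    enc_key, mac_key = _derive_keys(password, salt)
    ct = bytes(a ^ b for a, b in zip(token, _keystream(enc_key, nonce, len(token))))
    tag = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    return {"salt": salt, "nonce": nonce, "ct": ct, "tag": tag}

def unlock_token(password: str, blob: dict):
    """Return the token, or None if the password is wrong or the blob was altered."""
    enc_key, mac_key = _derive_keys(password, blob["salt"])
    expected = hmac.new(mac_key, blob["nonce"] + blob["ct"], hashlib.sha256).digest()
    if not hmac.compare_digest(expected, blob["tag"]):
        return None
    stream = _keystream(enc_key, blob["nonce"], len(blob["ct"]))
    return bytes(a ^ b for a, b in zip(blob["ct"], stream))
```

At login the server would call `unlock_token` with the just-verified password and keep the result only in the session in RAM. The trade-off is that background jobs can't use the token while the user is logged out.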
Now I'm somewhat happy having done the extra work. At least the FBI doesn't have my "read later" bookmarks. (Which often consist of the words 'hack', 'malware' and 'reverse engineering'.)
I guess I will reinvent the wheel instead of using cloud services more often in the future.
If they are truly blade servers, then they were possibly sharing the same chassis, power supply and backplane. Could the FBI have pulled just the blades in question? Possibly. But I can very easily imagine the entire blade chassis being viewed as a monolithic component that they would want to be able to perform whatever forensic analysis they are planning. They could also have pulled whatever blades they were not after, and left them, but until you replace the chassis, you are dead in the water.
If you are a voting citizen of the US I recommend you write (not email, write a letter, put postage on it and everything) to your elected congressional representatives and ask that Congress immediately put curbs on the police powers of the FBI when it comes to infrastructure seizures.
it seems to me that, at least, it would make sense to have the db and web server physically separate in that case (although i guess someone stealing hardware is not normally a common scenario).
Why worry about this?