@samdk, thank you for your comments. This point has been raised in the discussion above and also in some other forums, where several people have felt that I should be recommending a slower hashing algorithm than SHA-1. It’s a valid point, but I think in practice, the gains are fairly small and there are also some drawbacks that the advocates of these algorithms don’t always consider. Here is some of my reasoning:
I absolutely agree that slower is better when it comes to passwords. But slow does not mean impossible. Take the following example. Say we have a moderate user database of 50,000 records, and that an attacker can perform a SHA-1 hash in 1ms. Taking a dictionary of 1000 common passwords, he can generate a custom rainbow table in one second, compare to the database and probably recover a fair number of passwords unless the application enforced strict complexity requirements. This is assuming that the attacker has access to the application code and the salt, a point which I’ll address in a moment. Alternatively, say we’re using bcrypt with a cost factor that requires 100ms for the attacker to process. Using the same 1000-word dictionary, it will now take the attacker 58 days to run the entire list. More difficult, yes, but not infeasible. And that’s assuming that he has to process the list serially. Parallelizing the operation would make it considerably faster, and this is easy with modern GPUs and services like EC2, exactly as you pointed out.
So the slower algorithms have a benefit, but they’re not going to render an attack impractical. Changing the numbers, of course, changes the outcome. A smaller user database will be run faster, and a larger one will take longer. A dictionary of 500 words might yield almost as many results as a larger one. And it’s worth noting that an attacker doesn’t have to exhaust the database to start using information maliciously. You could give bcrypt a higher cost factor, but you’re going to impact legitimate application performance if you take it very far.
Now to drawbacks. These functions can be complex to implement, for various reasons. Often they rely on libraries installed at the OS level. For example, PHP’s crypt() function did not have an internal implementation prior to v5.3. In earlier versions, it would simply rely on the OS, and if your OS didn’t support Blowfish, PHP would just fall back on DES, which as we know has serious problems. There are a *lot* of servers running versions of PHP older than 5.3. And crypt()-based functions can be especially problematic on Windows, since they’re generally rooted in a *nix library. The MySQL ENCRYPT() function simply doesn’t work on Windows, for example, so portability can be a problem. Web developers are not always in control of the operating systems on which their apps depend, and adding complexity makes people less likely to do anything at all. I’d rather have them use an algorithm that’s broadly supported even if it’s faster.
Finally, as I’ve pointed out before, these functions store the salts and algorithm identifier along with the passwords. That means that unless you *also* use an application-level salt, an attacker who accesses the database has all the information he needs to perform an attack. But a database may be compromised through SQL injection, or from a backup file, or because a management tool was compromised. It is far from given that an attacker who has the database owns the entire machine or even the application code. With a salt that’s stored separately at the application layer, the attacker who has only the database has zero chance of recovering any passwords. Not only does he not have the salt, he doesn’t know its length or how it’s combined with the passwords. Now we certainly shouldn’t be over-reliant on obscurity, but compartmentalizing the salt away from the result, and guarding it like you would guard a symmetric key, at least narrows the type of attack that’s required.
You can certainly do both, of course, and use a global salt along with individual salts—those individual salts will definitely increase the attacker’s time requirement. In the end, though, any hash will be crackable by an attacker who has complete system access, as long as the application allows dictionary words to be used.