Comments on Web Cryptography: Salted Hash and Other Tasty Dishes

30 Reader Comments

Back to the Article
  1. Your remarks on hashing passwords are incomplete.

    First, you should use a different salt value for each user. If you use the same salt each time, then an attacker can discover users who have the same password, which probably means that it’s a common one and they should concentrate on cracking that one.

    Second, you have to assume that an attacker who has compromised your system and has got your user database has also got your salt values, so they can use them to try a dictionary attack against your users’ passwords. To help to prevent this, you should use a hashing algorithm that has been deliberately designed to be slow and inefficient, such as bcrypt.

    Copy & paste the code below to embed this comment.
  2. I’m no expert, but doesn’t “This means that multiple inputs could theoretically produce the same result…” contradict the idea of collision resistance?

    Copy & paste the code below to embed this comment.
  3. No. Collision-resistance merely means that there’s no way of producing a collision that is simpler than just trying different inputs until by accident two hash to the same value.

    Since the output, the hash, is generally shorter than the input. (typically the input can be any length while the output is fixed-length), there *must* be collisions.

    Consider it with a theorethical (too small!) hash that is fixed-length at one hex-character.

    There’s only 16 possible hash-values, so you MUST get a collision if you hash 17 different strings, and likely you’ll get one much sooner.

    The same thing applies if the hash is larger, it’s just that you need to hash more stuff before you get a collision - and if the hash is large enough, then “more stuff” becomes “unreasonably much stuff”

    For example, if the hash is 128 bit, and truly collision-resistant, then you’d expect to have to try to hash aproximately 2^64 different strings before you’d get the first collision. (on the average, offcourse you could get lucky and get a collision after the first 2)

    Since 2^64 is equivalent to hashing a billion strings every second for slightly more than 200.000 years, this is unlikely to happen in practice. (and by the time computers are fast enough that it could be provoked, we’d need a bigger hash)

    Copy & paste the code below to embed this comment.
  4. It’s great that ALA decided to publish an article about security and password storage, because generally the awareness among web developers is rather low.

    However, the advice about salting is a little short and doesn’t really explain in-depth how to go about storing the salt. In that, I agree with commenter #1. But the real problem with this article is that it’s giving bad advice! There “have been”: “several”: “incidents”: in “recent news”: of sites that were using simple hashing mechanisms to store passwords where people stole the databases and retrieved a lot of passwords.

    Nowadays hashes can be calculated at amazing speeds and there are “tools that make this very easy”: to do. Combined with a dictionary-based attack, this allows attackers to retrieve a password rather quickly. There are “standardised functions for safely storing passwords like the UNIX crpyt() function”: that every web developer should know about.

    Why use crypt() and not roll your own? Well, crypt() has many advantages:

    * It is easy to make mistakes in rolling your own. Smart people have put a lot of thought into these functions. and they are more likely to be secure than whatever you might come up with, unless you’re an experienced security researcher.
    * Crypt() bindings are available in almost every programming language. This makes it a portable way to store passwords, and will allow you to easily share your database with programs written in different languages.
    * The crypt() format is forwards-compatible; you can upgrade to a stronger mechanism transparently and passwords encrypted with an older encryption algorithm continue to work. As people log in, you can rehash their password using the new mechanism.

    Copy & paste the code below to embed this comment.
  5. Lyle, thanks for your reply.

    The original DES implementation of crypt() is weak and broken indeed and it needs to be stressed that people shouldn’t rely on that algorithm anymore. However, all modern systems implement either OpenBSD’s blowfish-based algorithm or RedHat/Ulrich Drepper’s SHA-2-based algorithms, both of which are quite strong.

    I think avoiding crypt() just because an older implementation is broken is unneccessary. In fact, exactly because algorithms become weaker as time passes it’s a *good* idea to use crypt(), because of its modular design. Just upgrade the system’s crypt and voila, *all* your applications suddenly have stronger hashing capabilities, with complete backward compatibility for existing passwords.

    You mention that these password crackers are specifically for crypt() output. Indeed, John the Ripper is by default targeted at OS-specific algorithms, but it can be “easily modified to target custom salted SHA-1 hashes”:

    The above link shows that this person was able to get 2,900 guesses per second from their modified JtR on a plain salted SHA1.

    So, yes, JtR and other crackers are aimed specifically at standard operating system algorithms, but that just proves my point: these standard algorithms are constantly under close scrutiny. This means that their secure status is well-known and you can safely rely on using them. When they are broken it will be pretty widely known and new algorithms will be replaced, which you can easily upgrade to because modern crypt() is inherently modular.

    Upgrading your own system will be much harder unless you’ve given this serious thought. And once you’ve given it serious thought, you’ll probably end up with a similar design, so why not save you the hassle and just use crypt()?

    Finally, you advocate storing a *global* salt value, but that’s dangerous advice. If the database can be stolen, it’s very likely that your configuration files are compromised as well, and any global salt can easily be extracted from the configuration files.

    In fact, if you were to use *only* a global hash value, you can immediately see from your database when two people are using the same password (because the same salt is concatenated to the same password this will yield an identical hash value). Generating a custom “Rainbow Table” given a known salt is also pretty easily done, and the passwords in your database can be cracked in parallel. If the salt is custom per hash, like with crypt(), each password *must* be cracked separately.

    Copy & paste the code below to embed this comment.
  6. “I think we can all agree that hashing is better than not hashing, and salted hashing is better than straight hashing.”

    Definitely agreed. That said, if the salt is unknown as the password is, that also makes hacking significantly more difficult. For example, simply salting a password with a user’s name is a fairly standard protocol, even if it’s not the best; at least it makes the salt unique per user.

    A lot of people advocate using random salts, which generally fall into 2 categories. First is those that are completely random, and thus must be stored along with the password. This approach is very difficult to beat, provided the method of salting isn’t known to the database, but only to the application (for example, if the attacker sees the salt in the database, he will probably try something like simply concatenating the random salt and the password first).

    Another option is to use a salt generated in such a way that it’s unique per user, but isn’t the user name, and isn’t stored in the database. One example I’ve seen involves creating a random variable seeded based on something (like, say, the length of the user name) that always returns the same value, and concatenating that with the user name.

    All of these, though, are predicated on the hacker not being able to access the application code. If he can, then it doesn’t matter how you hash and salt.

    Good article. I think a lot of people will find it useful.

    Copy & paste the code below to embed this comment.
  7. Lyle,

    I agree that storing a separate global salt is a good additional layer of security if you’re concerned about SQL injection *or* when the database server is reachable through the internet, since those two scenarios are about the only ways someone could get to the database without also having access to the webserver’s files.

    You mention that if this happens, we should slow down the attacker as much as possible. This ties in to the very first comment here by drplokta; systems like crypt() (or bcrypt as drplotka mentions) don’t just *salt* the passwords, they also *stretch* them. This effectively lengthens the password in such a way that cracking one password takes longer. You can do this through applying the hashing function for thousands of iterations or more. This could reduce the JtR example above from 2,300 attempts per second to *several seconds per attempt*, if you want! I’m sure you agree this is preferable to hashing just once.

    In fact, that’s what crypt() does. The modern implementations I mentioned in fact even allow you to tweak how many iterations it should use. This allows a system administrator to compensate for Moore’s law; as CPUs get faster, the administrator can just tune the number of iterations. The cool thing is that the modular syntax of crypt() hashes allow you to do this also without having to rehash existing passwords. This allows you to implement your own policy for expiring passwords hashed with older algorithms and upgrade to a more secure system at your own pace.

    Copy & paste the code below to embed this comment.
  8. Lyle,

    Yep, that’s a classic tradeoff and everyone will make it differently, since different apps or businesses have different needs. The main point is that people should be *aware* that you can make this tradeoff, and that the software should have the accommodations to allow the user to make it.

    Copy & paste the code below to embed this comment.
  9. To get rid of all those rainbow tables.

    I just use a some functions

    I talked with some guys from Deloitte - they got me thinking a lot. It has nothing to do with the length of the password, but how many times you may try to login. Fx creditcard codes are normally on 4 numbers, but you only have 3 tries.

    So what I did was to make my login function show a captcha after 3 tries, and after 5 tries it will lock the username, and report multiple attacks to an administrator email.

    Usernames are twoway encrypted in my database, with a random function, where you will need the source code to get the username.

    Copy & paste the code below to embed this comment.
  10. @lsv20, Lyle is right; the techniques under discussion are about stopping attackers from obtaining passwords *off-line* from a compromised database containing the hashes, not about *on-line* attempts to bruteforce a successful login.

    Also, your approach, if PASSWORD is just the user’s plain password, is *just* as vulnerable as unsalted single-function hashing wrt rainbow tables: one only has to generate all hashes once for your set of functions. Rainbow tables are about *precomputation*. Your set of functions only make that precomputation a bit more expensive, but it still only needs to be done *once* for all deployments of your application or library if you don’t salt your hashes.

    Even if your password *is* salted, your homebrew scheme doesn’t buy you much additional security; the crypt()-based methods I mentioned run blowfish or SHA-2 (both more expensive than SHA-1) for literally *thousands* of iterations.  Your approach is computationally about as expensive as running SHA-1 for 4, maybe 5 iterations…

    Copy & paste the code below to embed this comment.
  11. Hi Lyle,

    The only thing I can see the article is missing is…Where can one go to learn more about how to implement these suggestions?

    Thanks, Jim

    Copy & paste the code below to embed this comment.
  12. Thank Lyle for the tips!

    Copy & paste the code below to embed this comment.
  13. Usually I like ALA articles a lot, but I felt the need to register to post a comment on this. The discussion of hashing here is dangerously incomplete.

    Most cryptographic hash methods are meant to be fast. That’s a good thing for most applications, but it’s terrible for password hashing, because people can generate rainbow tables relatively quickly. (_Especially_ now that Amazon EC2 Cluster GPU instances are available.)

    Using standard hash functions like SHA1 is bad security, and telling people that these hash functions are “good enough” is dangerous. If you want your passwords to be secure, you need to be using *bcrypt* or *scrypt*. They are specifically designed to be slow, and they do the work of salting for you.

    “Thomas Ptacek’s article”: is still the best I’ve seen on password hashing. Please read it before giving people advice on secure password storage. If you are using SHA1, and your database is compromised, your users’ passwords are not secure.

    Copy & paste the code below to embed this comment.
  14. Thank you for writing on this subject. This is interesting I have not thought about it from your point of view.I have been looking for something like this and your blog helps me a lot to understand the topic better. Waiting for your next post.

    Copy & paste the code below to embed this comment.
  15. Since the output, the hash, is generally shorter than the input. (typically the input can be any length while the output is fixed-length), there must be collisions.

    Copy & paste the code below to embed this comment.
  16. Your time estimates are off by orders of magnitude. GPUs can calculate hundreds of millions of SHA1 hashes in a second (, and powerful GPUs are available to anyone who wants them with an EC2 cluster GPU instance. That is, your 50,000,000 SHA1 iterations can be run in under a second. (Even a regular CPU core will take only a few seconds.)

    Certainly bcrypt/scrypt are not perfect, but giving your users days/weeks/months to change their passwords is a lot better than no time at all. If you can’t use bcrypt or scrypt for technical or other reasons, then running many (1000+) iterations of a hash function is another alternative.

    But really, the problem I have is not with the discussion here in the comments: it’s that none of this is discussed in the article, which is all most people are going to read.

    Copy & paste the code below to embed this comment.
  17. Yes, I do agree that this approach makes the attacker’s job harder, but as long as he has access to the entire application and all salts, it’s always a tractable problem. There’s also tension between security and usability. If you stretch the time required to perform the operation too much, then you impact the user experience at signup and login. You also increase the demand on server resources at scale, when many people are attempting to sign up or log in at the same time. On a small scale it’s less of an issue.

    Copy & paste the code below to embed this comment.
  18. One of the problems I’m dealing with right now is that, while lots of authentication systems use SHA-1 out of the box, they all use a different default salt algorithm.  That makes it really hard to merge accounts or switch from one system to another without loosing everyone’s password.

    I’m not sure it’s true that “The SHA-1 algorithm… is currently the best available hash function that enjoys broad support in most popular programming languages.” To my knowledge, bcrypt is at least as secure,* and there are implementations in all the major programming languages.  Our Ubuntu (Debian) Linux servers have it pre-installed.

    But the real win with bcrypt is that the salt is stored with the hash in a standardized format.  Bcrypt generates a different hash every time it’s called for the same password, though any one of those hashes can validate that password.  So you can move your accounts to and from any bcrypt-capable system without worrying about how OpenLDAP (for instance) decided to salt the password.

    What it comes down to is, encryption is best left to the experts.  Deciding how to salt your hashes isn’t something web developers should have to think about.  SHA-1 doesn’t do that for you, bcrypt does.

    *I’m not cryptographer, but bcrypt has been around for a while, and I don’t know of any security holes.  It’s also newer than SHA-1, and has a number of properties designed to make it harder to crack.  In particular, you can set the minimum amount of processing time required to verify a hash, so that it’s plenty fast for authentication, but expensive to brute force attack.

    Copy & paste the code below to embed this comment.
  19. This was an excellently written article, and covers a concern that I’ve tried time and again to communicate to other people. Hashing the raw password is not enough, and an additional check digit or check character is always a great addition as well. That way, the password is modified, the hash is concatenated to a fixed length with the position based on the user ID (not user name), and a check digit is added.

    Copy & paste the code below to embed this comment.
  20. Be careful to limit your expectations of what a salted hash actually does!

    It is fairly simple to extend a theoretical attack to prepend the salt onto a word to test, a small ammount of setup time is insignificant.  A salt only prevents a precomuted (rainbow) table being used, it doesn’t make it any more time consuming for an attacker to use a keyspace or dictionary attack on the cyphertext.

    On a 8600 GPU (fairly outdated) the search rate for SHA1 hashes is a little over 46M/s, varying slightly with the number of hashes to check against. A linux box with several more powerful dedicated GPU could possibly chew through a bullion a second. Most salting systems used in PHP applications offer little to no protection against these sort of attacks, sheer speed is enough to reverse most passwords.

    The term ‘custom rainbow table’ is an oxymoron, the point of a rainbow table is that it can be reused without additional calculation time. Generating tables would be a waste of time and power, a standard attack (finishing when a possible hash is found) would be far more efficient.

    I’ve always been interested to know how Facebook stores password fingerprints, has it been published anywhere?

    Copy & paste the code below to embed this comment.
  21. The fact that most users re-use passwords across multiple sites is the problem, as pointed out in the article.  But therefore what benefit is it if my site uses military level encryption if another site just stores plain passwords?

    My site’s security for that user (and probably most users) is still just as good as that plain password. It’s not a huge comfort to me that I was hacked because of someone else’s lax security.

    Hashing / encryption always sends us off into getting excited about the number of bits, or complexity or whatever - but distracts from all the human aspects where the real security risks are.

    Copy & paste the code below to embed this comment.
  22. Sorry, commenting is closed on this article.