Password-based Key Derivation↑

The aim of password-based key-derivation function (PBKDF) is to use a password to create a key in such a way that dictionary attacks (where the attacker just tries a range of possible passwords) are unfeasible.

Passwords or passphrases are often used as a convenient means of deriving cryptographic keys. However, to use a plain passphrase as a key is a security error - passwords are typically far easier to guess than random bitstrings since they come from a predictably non-uniform distribution of possible values (i.e. you can use a dictionary of common passwords to make an attack). For this reason, a password that is to be used as a key should first pass through a PBKDF. Additionally, PBKDFs can be used for storing passwords securely (by just storing the derived key).

To achieve their aim, a PBKDF applies a pseudorandom function (PRF) to the password many times. This means that an attacker making a guess at the password will also have to apply the function many times to his guess to see if it correct. Additionally, the function can be given a “salt” parameter. The idea of this is to make each key derivation operation unique, so that an attacker cannot guess one password and then look for matches against a large number of derived keys.

Parameter Choice↑

The only PBKDF currently approved by NIST is the so-called PBKDF2, standardized in RFC 2898 and PKCS#5.

A developer using PBKDF2 must choose parameter values for the salt, the PRF, and the number of iterations, i.e. the number of times the PRF will be applied to the password when deriving the key.

The specification suggests (in section 4.1) that the salt be (or contain) a 64 bit pseudorandom value. This makes collisions (i.e. occasions that two stored passwords use the same salt) unlikely. By the birthday paradox, we would expect a collision after 2³² passwords, i.e. a little more than 4 billion.

The PRF mentioned in the specification is SHA-1, and in many libraries this is the only choice. However, using SHA-256 or SHA-512 has the benefit of significantly increasing the memory requirements, which increases the cost for an attacker wishing to attack use hardware-based password crackers based on GPUs or ASICs.

The recommended iteration count in the RFC published in September 2000 was 1000. Computing performance has greatly increased since then. The current NIST draft guidelines suggest a minimum of 10000 iterations. Modern guides such as the OWASP password storage cheat sheet (2015) recommend 10000 iterations.

NIST’s previous guide (Appendix A.2.2) recommends that the iteration count be “as high as can be tolerated while still allowing acceptable server performance”. Note that execution speeds of PBKDF2 implementations vary widely. An attacker is of course free to choose the fastest available, so it makes sense to use an efficient implementation on the server to allow higher iteration counts without loss of performance.

PBKDF2 Cracking Benchmarks↑

Imagine we are restricted to using SHA-1 as our PRF, as is the case for example in PKCS#11 up to version v2.20. How long would it take a well-resourced attacker (i.e. with access to GPUs) to break an 8-character password?

First we have to estimate how much entropy or “randomness” there is in an 8-character password. An excellent paper by Kelley et al. from IEEE Security and Privacy 2012 found that when users are forced to choose a password following the “Comprehensive8” policy, “Password must have at least 8 characters including an uppercase and lowercase letter, a symbol, and a digit. It may not contain a dictionary word.”, the result is roughly 33 bits of entropy.

If, however, the password is a perfectly random combination of uppercase and lowercase letters, numbers and the 30 symbols on a US keyboard, we would expect 52 bits of entropy. Interestingly, the same result can be obtained by choosing 4 random words from the Diceware list.

Second, we need to know how fast GPUs can calculate PBKDF2. An article from April 2013 reports a rate of 3 million PBKDF2 guesses per second on a typical GPU setup. This includes calculating AES once for each guess (to see if the right key has been derived to decrypt a master key file), and it’s now November 2015, so suppose conservatively we can apply Moore’s law almost once since then (whether one can apply Moore’s “law” to GPUs is doubtful), giving a very rough rule-of-thumb ability of 5 million guesses per second on typical GPU hardware.

The table below shows how long an attacker would take to cover the whole password space of a single salted hashed password.

Password complexity	Entropy estimate (bits)	1000 iterations	10000 iterations
Comprehensive8	33	4 hours 46 minutes	47 hours
8 random lowercase letters	37	12 hours	5 days
8 random letters	45	123 days	3 years 5 months
8 letters + numbers + punctuation OR 4 random Diceware words	52	325 years	3250 years

Advice summary↑

If you use PBKDF2, you should:

Use an efficient implementation.
Use a unique 64-bit salt for each password.
Rather than SHA-1, use SHA-512 if you can. If not, SHA-256.
Use an iteration count of at least 10000. Use more if you can do it “while still allowing acceptable server performance”.