Re: A modification to scrypt to reduce side channel risk

On Thu, Dec 26, 2013 at 4:51 PM, Colin Percival <cperciva@tarsnap.com> wrote:

>> I propose replacing P, the password or passphrase, in step 1 with

>> the null string. In other words all the memory banging in step 2
>> would be determined solely by the salt, S. The password would only
>> affect the algorithm output in step 3, which should be sufficient for security.
>
> i believe this would severely reduce the hw cost, as you can share the
> prepared memory block between multiple cores executing the 3rd step.

Yes. This would severely weaken the algorithm.

I may be misinterpreting what you're saying, but I believe steps 1 and 3 take very little time, and being able to do them in parallel or not makes little difference in the strength of scrypt. Personally, instead of 1 round of SHA-256 in steps 1 and 3, I would use 2048, just to be able to claim scrypt starts as strong as the strongest KDF you can select in OpenSLL for your ssh private key, and just claim scrypt adds security from there. In hardware an attacker can likely compute 2048 SHA-256 rounds in 2 microseconds in about a cent worth of silicon. In the slowest modern _javascript_ implementations, it's still only 2 or 3 milliseconds. Steps 1 and 3 in scrypt are 2000 times faster than this. It's step 2 that we have to get right...

However, you are correct that writing salt-based hashing data to RAM that does not depend on the key lowers security in some ways, but maybe that's OK. Instead of having to fill memory and then read it for each password guess, depending on the salt only for the written data allows the attacker to make a guess twice as fast, since he only has to read memory once per guess. Personally, I'd rather just use 2048 rounds of SHA-256/PBKDF2 to initialize the RNG and use that all over memory, but the FUD-mongers seem to have latched onto some sacred rule about leaking info derived from the password to DRAM, even though we seem to be fine with weak password security in general stored to disk on insecure corporate servers.

The more I try to make improvements to scrypt, the more respect I gain for scrypt's current implementation. As I said, he should just add 2048 rounds of SHA-256 at the start and end to quiet critics, but other than that, it's pretty good now. The main improvement I see possible is to increase the amount of memory we use vs script, to increase the cost to an attacker. Scrypt runtime is dominated by it's hashing function for filling memory and cache misses. We could improve on both of those if the fear mongers let us.

The key is taking into account real hardware. The best DRAM for cracking memory-hard KDFs right now is probably GDDR5. The PlayStation4 has 16 512MB GDDR modules with 176GB of bandwidth, and that monster will beat just about every server around in memory bandwidth. If we use a memory hard KDF that hashes 4 GB with RNG data on our PCs in 1 second (sorry... not blake2, not chacha...), an attacker will have to spend 88 milliseconds simply to read that data from the world's fastest memory (4X slower than the PlayStation4 because it has 1/4 of the memory modules). We could easily double that read time with cache misses on the assumption that high tCAS latency will dominate cheap memory for many years to come. Those chips will cost the attacker, according to the best data I could find today, somewhere from $20 to $30, and I'll just throw out a WAG that the rest of the system in high volume will cost another $20 to $30. It's that cost and runtime that will defeat hardware based password crackers.

There are other funky issues I've run into. For example, to use more memory with the same number of cache misses as scrypt, just increase the read/write data size from 64 bytes to something larger. If you increase it too much, an attacker will just cache the state of your RNG at fixed intervals while generating the RAM data, and instead of having external RAM, they'll just regernerate RAM data from the cached state as needed. Scrypt is secure against this only because 64 bytes is comparable to the RNG state. If scrypt were to write out 16K bytes instead with each read/write, it would not be very secure. The author is a smart guy.