
Re: parallelism in a single instance of scrypt



On Thu, 8 Apr 2010 17:38:30 +0400
Solar Designer <solar@openwall.com> wrote:

> How much parallelism is there in a single instance of the scrypt key
> derivation function?  I notice that you have an implementation for SSE2,
> which means 4x parallelism (right?), but can it be extended further -
> say, to use multiple CPU cores for a single scrypt computation?  What is
> going to happen when x86-64 CPUs with AVX (256-bit vectors) hit the
> market - will it be possible to make optimal use of them?  And chances
> are that future CPUs and GPUs (or hybrids) will have even more cores and
> longer vectors.  If this silicon can't be used optimally for "defensive"
> uses of scrypt (enabling more extensive use of key stretching), then
> that's a major shortcoming of scrypt.  (bcrypt and all others are
> similarly problematic in this respect, which I complained about 10+
> years ago, and I was hoping that the replacement would address this - in
> fact, I did some work on this area at the time, but never finished that.)

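The SSE2 instructions are used only within the Salsa20 core function.
Longer vector instructions may make two simultaneous Salsa20
computations faster, but within one SMix instance each Salsa20 call
depends on the output of the previous one, so the two computations
would have to come from different parallel instances.  With the current
scrypt algorithm, that will only be useful if p is even.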
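
To make the dependency structure concrete, here is a rough pure-Python
sketch of scrypt's outer loop, following the published algorithm.  The
names scrypt_sketch, smix, block_mix and the lane_order parameter are
made up for illustration and are not taken from any real
implementation.  It is far too slow for real use; the point is only
that the p lanes never read each other's state, while the Salsa20/8
calls inside one lane form a chain.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF
# (a, b, c, d) index tuples for the Salsa20 quarter-rounds.
QUARTER_ROUNDS = [
    (0, 4, 8, 12), (5, 9, 13, 1), (10, 14, 2, 6), (15, 3, 7, 11),  # columns
    (0, 1, 2, 3), (5, 6, 7, 4), (10, 11, 8, 9), (15, 12, 13, 14),  # rows
]

def rotl(v, n):
    return ((v << n) & MASK) | (v >> (32 - n))

def salsa20_8(block):
    """Salsa20/8 core on one 64-byte block (the part SSE2 speeds up)."""
    x = list(struct.unpack("<16I", block))
    orig = list(x)
    for _ in range(4):  # 4 double rounds = 8 rounds
        for a, b, c, d in QUARTER_ROUNDS:
            x[b] ^= rotl((x[a] + x[d]) & MASK, 7)
            x[c] ^= rotl((x[b] + x[a]) & MASK, 9)
            x[d] ^= rotl((x[c] + x[b]) & MASK, 13)
            x[a] ^= rotl((x[d] + x[c]) & MASK, 18)
    return struct.pack("<16I", *((u + v) & MASK for u, v in zip(x, orig)))

def xor(a, b):
    return bytes(u ^ v for u, v in zip(a, b))

def block_mix(b, r):
    x = b[-64:]
    y = []
    for i in range(2 * r):
        # Sequential chain: each call needs the previous call's output.
        x = salsa20_8(xor(x, b[64 * i:64 * (i + 1)]))
        y.append(x)
    return b"".join(y[0::2] + y[1::2])

def smix(lane, n, r):
    """One lane: fill n blocks of memory, then read them back data-dependently."""
    v = []
    x = lane
    for _ in range(n):
        v.append(x)
        x = block_mix(x, r)
    for _ in range(n):
        j = int.from_bytes(x[-64:], "little") % n
        x = block_mix(xor(x, v[j]), r)
    return x

def scrypt_sketch(password, salt, n, r, p, dklen, lane_order=None):
    size = 128 * r
    b = hashlib.pbkdf2_hmac("sha256", password, salt, 1, p * size)
    lanes = [b[i * size:(i + 1) * size] for i in range(p)]
    out = [None] * p
    # The lanes are independent, so any order (or lockstep pairing) works.
    for i in (lane_order if lane_order is not None else range(p)):
        out[i] = smix(lanes[i], n, r)
    return hashlib.pbkdf2_hmac("sha256", password, b"".join(out), 1, dklen)
```

Calling scrypt_sketch(b"password", b"NaCl", 16, 1, 2, 32) with
lane_order=[1, 0] should give the same key as the default order, which
is the property a wider vector unit would exploit: it could run the
Salsa20 cores of lane 0 and lane 1 side by side, but it has nothing to
pair up inside a single lane.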

The scrypt idea could be used with a different mixing function, however.

Robert Ransom