[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: scrypt with "odd" r or N (Re: Canonical way to invoke the KDF?)

On Sat, Nov 16, 2013 at 01:48:27PM -0800, Colin Percival wrote:
> On 11/15/13 18:08, Solar Designer wrote:
> > On Fri, Nov 15, 2013 at 05:34:00PM -0800, Colin Percival wrote:
> >> The scrypt algorithm is specified as operating on arbitrary N; it's just
> >> this particular implementation which is limited in the values it can
> >> handle.
> > 
> > When N is not a power of 2, any definition of Integerify() would result
> > in a non-uniform distribution of the resulting indices.  While this may
> > be acceptable, it let's say does not feel nice.
> Having probabilities which are off by less than 2^-128 doesn't exactly matter.

I agree.  In practice, an optimized implementation would be likely to
process fewer bits of B_{2r-1}, resulting in the probabilities being off
e.g. by up to 2^-64 or even 2^-32, but this is still acceptable.

> > Which do you think is a better way to handle "odd" total memory sizes
> > (if optimal for a certain platform and certain throughput and latency
> > constraints):
> > 
> > 1. Use "odd" values of r.
> > 
> > -OR-
> > 
> > 2. Use "odd" values of N _and_ revise Integerify() accordingly
> > (arbitrary modulo division?)
> No revision necessary: The scrypt algorithm as defined in my paper sets
> Integerify = the 128-bit integer generated by interpreting B_{2r-1} as a
> little-endian integer, and ROMix explicitly reduces this modulo N.
> The "take the bottom n bits of the first word in B_{2r-1}" is just an
> optimization for the case where N = 2^n.

You're right indeed.

OK, no specification revision necessary (only a code revision) - but the
performance impact of modulo division by a non-power-of-2 is still
there.  A relevant question is whether the attacker will incur a
comparable impact too (but measured in area-time).  Perhaps they will,
if we assume that our CPU's implementation of integer division is fairly

Another aspect is that integer division would be an extra variable-time
component, on top of the caches and TLBs.  Given that scrypt is not
constant-time anyway (although its overall running time is close to
being constant), this is probably acceptable.  Are cache/TLB hits/misses
or integer division times more reliable to monitor from another process
or VM on the same system (possibly even running concurrently on the same
CPU core, with SMT)?  I don't have an obviously correct guess on this.
Yet adding integer division to the mix may potentially make some attacks
of this sort easier.

So do we prefer to use N or r to achieve weird total allocation sizes?