Re: crypto_scrypt-sse.c speedup



On Fri, Nov 16, 2012 at 02:21:14AM -0800, Colin Percival wrote:
> To be honest, I didn't spend a huge amount of time optimizing this code...

That's fine.  Your code is better optimized and cleaner than most other
code out there, and we can optimize it further now. :-)  Unfortunately,
some optimizations make it less readable (although some others make it
more readable), but that's why you also have a reference implementation.
So I think we're OK making the -sse source file slightly less readable.

> On 11/15/12 20:50, Solar Designer wrote:
> > I think having X as a local variable lets the compiler fully keep it in
> > registers, whereas having it passed into the function by reference may
> > result in unnecessary writes into the provided X array before the
> > function returns; it may also encourage the compiler to do such writes
> > inside the loop, especially since its iteration count is determined by r
> > and thus is not known at compile time (might be low).
> 
> Makes sense.

I ended up replacing the X array with two pointers, X and Y, which point
to Bin and Bout array elements.  This avoids having to save a copy of X.
(In your original code this was done with a blkcpy().  In my older
revision of the code, it was an extra assignment in salsa20_8_xor().
Now neither is needed.)
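
To illustrate the idea, here is a rough sketch, not the actual patch.
toy_mix_xor() is a hypothetical stand-in for salsa20_8_xor(), the
4-word block size is made up, and the even/odd output interleaving of
the real blockmix is left out.  The names chain_with_copy(),
chain_with_pointers() and chains_agree() are mine, for illustration
only.  It assumes nblocks >= 1 and that Bin and Bout do not overlap.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_WORDS 4	/* toy size; not the real Salsa20 block size */

/* Hypothetical stand-in for salsa20_8_xor(): out = mix(prev ^ in).
 * The temporary makes it safe for out to alias prev. */
static void
toy_mix_xor(const uint32_t *prev, const uint32_t *in, uint32_t *out)
{
	uint32_t t[BLOCK_WORDS];
	size_t i;

	for (i = 0; i < BLOCK_WORDS; i++)
		t[i] = prev[i] ^ in[i];
	for (i = 0; i < BLOCK_WORDS; i++) {
		uint32_t n = t[(i + 1) % BLOCK_WORDS];
		out[i] = t[i] + ((n << 7) | (n >> 25));
	}
}

/* Copy-based variant: X is a separate buffer that is copied out to
 * Bout after every step. */
void
chain_with_copy(const uint32_t *Bin, uint32_t *Bout, size_t nblocks)
{
	uint32_t X[BLOCK_WORDS];
	size_t i;

	memcpy(X, &Bin[(nblocks - 1) * BLOCK_WORDS], sizeof(X));
	for (i = 0; i < nblocks; i++) {
		toy_mix_xor(X, &Bin[i * BLOCK_WORDS], X);
		memcpy(&Bout[i * BLOCK_WORDS], X, sizeof(X));
	}
}

/* Pointer-based variant: X points at the previous result (initially
 * the last input block), Y at the output slot being written.  The
 * result is written in place, so no copy of X has to be saved. */
void
chain_with_pointers(const uint32_t *Bin, uint32_t *Bout, size_t nblocks)
{
	const uint32_t *X = &Bin[(nblocks - 1) * BLOCK_WORDS];
	size_t i;

	for (i = 0; i < nblocks; i++) {
		uint32_t *Y = &Bout[i * BLOCK_WORDS];

		toy_mix_xor(X, &Bin[i * BLOCK_WORDS], Y);
		X = Y;	/* the block just written is the next "previous" */
	}
}

/* Returns 1 if both variants produce identical output on test data. */
int
chains_agree(void)
{
	uint32_t Bin[3 * BLOCK_WORDS], A[3 * BLOCK_WORDS], B[3 * BLOCK_WORDS];
	size_t i;

	for (i = 0; i < 3 * BLOCK_WORDS; i++)
		Bin[i] = (uint32_t)(i * 2654435761u + 1);
	chain_with_copy(Bin, A, 3);
	chain_with_pointers(Bin, B, 3);
	return memcmp(A, B, sizeof(A)) == 0;
}
```

Both variants compute the same chain; the second one just never
materializes X as a separate buffer.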

Alexander