Re: crypto_scrypt-sse.c speedup
On 11/15/12 16:09, Solar Designer wrote:
> On Thu, Nov 15, 2012 at 06:47:16AM -0800, Colin Percival wrote:
>> I'm surprised that the "inline"
>> does anything, and it seems odd that the loop unrolling wouldn't happen
>> automatically too.
> I've tried compiling in two ways:
> 1. -march=native -O2 -fomit-frame-pointer
> 2. -march=native -O2 -fomit-frame-pointer -funroll-loops -finline-functions
> The 5% to 10% speedup on Intel is for #1. With #2, I've just measured a
> speedup of 4% on the same E5649.
> I think the compiler does some inlining and unrolling when that is
> requested with optimization flags, but doing it explicitly helps anyway.
> My use of the "inline" keyword is a bit selective, and in the unrolling
> I rely on knowledge that the block size is a multiple of 64. The
> compiler would only be able to make use of such knowledge along with
> inlining or by examining all calls to a non-inlined static function,
> which is trickier, and it'd need to use the fact that "128 * r" is a
> multiple of 64. Perhaps modern compilers are capable of all that, but
> I do see some performance difference even with -funroll-loops
> -finline-functions somehow.
Ah, I can imagine the compiler not anticipating that 128 * r is always a
multiple of 64.
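To make the point about block sizes concrete, here is a minimal sketch of the two loop shapes being compared. It is not the code from the patch: the function names and the use of plain 64-bit words instead of SSE2 vectors are assumptions made for illustration. The unrolled variant has no remainder loop, so it is correct only when len is a multiple of 64, which holds for len = 128 * r.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Generic loop: nothing tells the compiler that len has a special form. */
static void
blkxor_generic(uint64_t *dest, const uint64_t *src, size_t len)
{
	size_t i;

	for (i = 0; i < len / 8; i++)
		dest[i] ^= src[i];
}

/*
 * Hand-unrolled loop (hypothetical): handles 64 bytes, i.e. eight 64-bit
 * words, per iteration and has no remainder loop.  Correct only if len is
 * a multiple of 64, which holds for scrypt blocks since len = 128 * r.
 */
static inline void
blkxor_unrolled(uint64_t *dest, const uint64_t *src, size_t len)
{
	size_t i;

	assert(len % 64 == 0);
	for (i = 0; i < len / 8; i += 8) {
		dest[i] ^= src[i];
		dest[i + 1] ^= src[i + 1];
		dest[i + 2] ^= src[i + 2];
		dest[i + 3] ^= src[i + 3];
		dest[i + 4] ^= src[i + 4];
		dest[i + 5] ^= src[i + 5];
		dest[i + 6] ^= src[i + 6];
		dest[i + 7] ^= src[i + 7];
	}
}
```

A compiler can reach the second form on its own only if it inlines the call (or inspects every call site) and then proves that 128 * r is divisible by 64, which is the chain of reasoning described in the quoted text.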
> The 30% speedup on AMD Bulldozer is primarily due to the use of XOP bit
> rotate intrinsics, indeed. With only this one change and no other
> changes, the speedup was about 25%.
Sounds like those are worth having... at the expense of needing yet another
compile-time option (or run-time detection, ick).
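As a rough sketch of what the compile-time variant could look like (not the code from the patch): the macro ROTL32X4 and the function rotl7_x4 are hypothetical names, and the rotate count 7 is just one of the Salsa20 rotation amounts. The sketch assumes GCC or Clang, which define __XOP__ when building with -mxop (or a -march that implies it) and provide _mm_roti_epi32 via <x86intrin.h>. Without XOP it falls back to the usual two-shifts-plus-OR SSE2 sequence, and to scalar code on non-x86 targets.

```c
#include <stdint.h>

#if defined(__XOP__)
#include <x86intrin.h>
/* XOP: one instruction rotates all four 32-bit lanes. */
#define ROTL32X4(x, n) _mm_roti_epi32((x), (n))
#elif defined(__SSE2__)
#include <emmintrin.h>
/* Plain SSE2: emulate the rotate with two shifts and an OR. */
#define ROTL32X4(x, n) \
	_mm_or_si128(_mm_slli_epi32((x), (n)), _mm_srli_epi32((x), 32 - (n)))
#endif

/* Rotate each of four 32-bit words left by 7 bits (hypothetical helper). */
static void
rotl7_x4(uint32_t v[4])
{
#ifdef ROTL32X4
	__m128i x = _mm_loadu_si128((const __m128i *)v);

	x = ROTL32X4(x, 7);
	_mm_storeu_si128((__m128i *)v, x);
#else
	/* Portable scalar fallback for non-x86 builds. */
	int i;

	for (i = 0; i < 4; i++)
		v[i] = (v[i] << 7) | (v[i] >> 25);
#endif
}
```

The rotate count has to be a compile-time constant on the XOP path, which is why the sketch uses a macro rather than a function taking the count as an argument.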
>>> Please let me know if I should add a copyright statement, although maybe
>>> these changes are too minor to be subject to copyright.
>> Up to you -- either declare your changes to be public domain (and don't add
>> a copyright line in your name) or declare them to be 2-clause BSD licensed
>> (and add a copyright line). Either is fine with me, but I need you to pick
>> one. :-)
> Let's do the latter. I've been placing some of my works in the public
> domain until a few years ago, but some people expressed concern that it
> might not be legally possible in my jurisdiction - so I had to add
> fallbacks to permissive license to my public domain statements anyway.
> So can you add this copyright line if/when you merge the changes? -
> * Copyright 2012 Solar Designer
> or you may use my real name (Alexander Peslyak), or both.
I'll put your legal name in, just in case... my understanding is that pseudonyms
are fine for copyright purposes, but I've heard questions raised about whether
they are universally acceptable for *licensing* purposes.
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid