
Re: crypto_scrypt-sse.c speedup



On Fri, Nov 16, 2012 at 06:59:19AM +0400, Solar Designer wrote:
> On Fri, Nov 16, 2012 at 04:09:17AM +0400, Solar Designer wrote:
> > I've tried compiling in two ways:
> > 
> > 1. -march=native -O2 -fomit-frame-pointer
> > 2. -march=native -O2 -fomit-frame-pointer -funroll-loops -finline-functions
> > 
> > The 5% to 10% speedup on Intel is for #1.  With #2, I've just measured a
> > speedup of 4% on the same E5649.
> 
> With Salsa20 rounds count reduced from 8 to 2, I am getting a speedup of
> 10% to 15% (varies between invocations) on the E5649 for both #1 and #2.

...and with just 1 round (a "break;" inserted right before the
/* Operate on "rows". */ comment), the speedup is ~15% for #2 and ~20% for #1.

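To make the placement of that "break;" concrete, here is a minimal scalar
sketch. It is not the SSE2 code from crypto_scrypt-sse.c: it assumes the usual
double-round loop layout of the Salsa20 core (a column half followed by a row
half), and salsa20_rounds() with its "rounds" parameter is a hypothetical
helper made up for this illustration. An odd round count stops after the
column half, which is the effect the inserted "break;" has once the loop bound
is 2.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Rotate a 32-bit word left by b bits (0 < b < 32). */
#define R(a, b) (((a) << (b)) | ((a) >> (32 - (b))))

/*
 * Scalar Salsa20 core with a configurable round count (illustration only;
 * hypothetical helper, not taken from crypto_scrypt-sse.c).
 * rounds = 8 is Salsa20/8 as used by scrypt.  An odd count leaves the loop
 * after the column half of the last double-round.
 */
static void salsa20_rounds(uint32_t B[16], int rounds)
{
	uint32_t x[16];
	int i;

	assert(rounds >= 0);
	memcpy(x, B, sizeof(x));
	for (i = 0; i < rounds; i += 2) {
		/* Operate on columns. */
		x[ 4] ^= R(x[ 0]+x[12], 7);  x[ 8] ^= R(x[ 4]+x[ 0], 9);
		x[12] ^= R(x[ 8]+x[ 4],13);  x[ 0] ^= R(x[12]+x[ 8],18);

		x[ 9] ^= R(x[ 5]+x[ 1], 7);  x[13] ^= R(x[ 9]+x[ 5], 9);
		x[ 1] ^= R(x[13]+x[ 9],13);  x[ 5] ^= R(x[ 1]+x[13],18);

		x[14] ^= R(x[10]+x[ 6], 7);  x[ 2] ^= R(x[14]+x[10], 9);
		x[ 6] ^= R(x[ 2]+x[14],13);  x[10] ^= R(x[ 6]+x[ 2],18);

		x[ 3] ^= R(x[15]+x[11], 7);  x[ 7] ^= R(x[ 3]+x[15], 9);
		x[11] ^= R(x[ 7]+x[ 3],13);  x[15] ^= R(x[11]+x[ 7],18);

		if (i + 1 >= rounds)
			break;	/* odd round count: skip the row half */

		/* Operate on rows. */
		x[ 1] ^= R(x[ 0]+x[ 3], 7);  x[ 2] ^= R(x[ 1]+x[ 0], 9);
		x[ 3] ^= R(x[ 2]+x[ 1],13);  x[ 0] ^= R(x[ 3]+x[ 2],18);

		x[ 6] ^= R(x[ 5]+x[ 4], 7);  x[ 7] ^= R(x[ 6]+x[ 5], 9);
		x[ 4] ^= R(x[ 7]+x[ 6],13);  x[ 5] ^= R(x[ 4]+x[ 7],18);

		x[11] ^= R(x[10]+x[ 9], 7);  x[ 8] ^= R(x[11]+x[10], 9);
		x[ 9] ^= R(x[ 8]+x[11],13);  x[10] ^= R(x[ 9]+x[ 8],18);

		x[12] ^= R(x[15]+x[14], 7);  x[13] ^= R(x[12]+x[15], 9);
		x[14] ^= R(x[13]+x[12],13);  x[15] ^= R(x[14]+x[13],18);
	}

	/* Feed-forward: add the permuted state back into the input block. */
	for (i = 0; i < 16; i++)
		B[i] += x[i];
}
```

Calling salsa20_rounds(B, 8) gives the normal Salsa20/8 core, while 2 and 1
correspond to the reduced variants timed above. The reduced-round versions are
only useful for measuring where the time goes; they do not compute scrypt.
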
Alexander