Re: crypto_scrypt-sse.c speedup
On Fri, Nov 16, 2012 at 06:59:19AM +0400, Solar Designer wrote:
> On Fri, Nov 16, 2012 at 04:09:17AM +0400, Solar Designer wrote:
> > I've tried compiling in two ways:
> >
> > 1. -march=native -O2 -fomit-frame-pointer
> > 2. -march=native -O2 -fomit-frame-pointer -funroll-loops -finline-functions
> >
> > The 5% to 10% speedup on Intel is for #1. With #2, I've just measured a
> > speedup of 4% on the same E5649.
>
> With Salsa20 rounds count reduced from 8 to 2, I am getting a speedup of
> 10% to 15% (varies between invocations) on the E5649 for both #1 and #2.
...and with just 1 round (a "break;" inserted right before the
'Operate on "rows".' comment), the speedup is ~15% for #2 and ~20% for #1.
Alexander
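
For readers without the source at hand, here is a minimal scalar sketch of the kind of change described above. It is not the SSE2 code from crypto_scrypt-sse.c: the function name salsa20_reduced and its rounds parameter are hypothetical, introduced only for illustration. It assumes the usual Salsa20 loop that does one column round and one row round per iteration, so an early "break;" after the column half leaves an odd round count. Reduced-round output no longer matches real scrypt, so this is only suitable for timing experiments.

```c
#include <stdint.h>
#include <string.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static inline uint32_t rotl32(uint32_t v, unsigned n)
{
	return (v << n) | (v >> (32 - n));
}

/* One Salsa20 quarter-round on words a, b, c, d of x[]. */
#define QR(a, b, c, d) do {                      \
	x[b] ^= rotl32(x[a] + x[d], 7);          \
	x[c] ^= rotl32(x[b] + x[a], 9);          \
	x[d] ^= rotl32(x[c] + x[b], 13);         \
	x[a] ^= rotl32(x[d] + x[c], 18);         \
} while (0)

/*
 * Hypothetical helper: Salsa20 core with a configurable round count.
 * rounds = 8 is the full Salsa20/8 used by scrypt; 2 and 1 correspond
 * to the reduced-round timing experiments in this thread.
 */
static void salsa20_reduced(uint32_t B[16], unsigned rounds)
{
	uint32_t x[16];
	unsigned i;

	memcpy(x, B, sizeof(x));
	for (i = 0; i < rounds; i += 2) {
		/* Operate on columns. */
		QR(0, 4, 8, 12);
		QR(5, 9, 13, 1);
		QR(10, 14, 2, 6);
		QR(15, 3, 7, 11);

		/* For an odd round count, stop after the column half. */
		if (i + 1 >= rounds)
			break;

		/* Operate on rows. */
		QR(0, 1, 2, 3);
		QR(5, 6, 7, 4);
		QR(10, 11, 8, 9);
		QR(15, 12, 13, 14);
	}
	/* Feed-forward: add the permuted state back into the input. */
	for (i = 0; i < 16; i++)
		B[i] += x[i];
}
```

Calling salsa20_reduced(B, 8) gives the full core, while 2 and 1 give the weakened variants. Timing each variant shows how the speedup changes as less of the run time is spent in the Salsa20 rounds.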