[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Miscompilation under strict C aliasing rules



On Sun, Feb 27, 2022, at 22:00, Colin Percival wrote:
> Ooh, nasty.  Which version of the scrypt code are you using?  In
> particular, the latest version has a self-test which *should*
> catch issues like this.

The self-test does catch the issue.  I just have not been
running it in my application (nor in its inexistent test
suite).

> We should probably make it operate on uint32_t rather than bytes,
> since the data in question is used as arrays of uint32_t elsewhere?

That might work indeed.  But the loop on bytes is not as bad
as it looks with a recent compiler.  It will end up compiled
unrolled with SIMD instructions:

   0x0000000000062e10 <+112>:   movdqu (%r12),%xmm2
   0x0000000000062e16 <+118>:   movdqu (%rbx),%xmm0
   0x0000000000062e1a <+122>:   pxor   %xmm2,%xmm0
   0x0000000000062e1e <+126>:   movups %xmm0,(%rbx)
   0x0000000000062e21 <+129>:   movdqu 0x10(%rbx),%xmm0
   0x0000000000062e26 <+134>:   movdqu 0x10(%r12),%xmm3
   0x0000000000062e2d <+141>:   pxor   %xmm3,%xmm0
   0x0000000000062e31 <+145>:   movups %xmm0,0x10(%rbx)
   0x0000000000062e35 <+149>:   movdqu 0x20(%rbx),%xmm0
   0x0000000000062e3a <+154>:   movdqu 0x20(%r12),%xmm4
   0x0000000000062e41 <+161>:   pxor   %xmm4,%xmm0
   0x0000000000062e45 <+165>:   movups %xmm0,0x20(%rbx)
   0x0000000000062e49 <+169>:   movdqu 0x30(%rbx),%xmm0
   0x0000000000062e4e <+174>:   movdqu 0x30(%r12),%xmm5
   0x0000000000062e55 <+181>:   pxor   %xmm5,%xmm0
   0x0000000000062e59 <+185>:   movups %xmm0,0x30(%rbx)

One potentially simple improvement would be to have
the buffers 16-byte aligned (and make sure the compiler
understand that it is the case).  Then you would have
movdqa load/stores.