[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Miscompilation under strict C aliasing rules
On Sun, Feb 27, 2022, at 22:00, Colin Percival wrote:
> Ooh, nasty. Which version of the scrypt code are you using? In
> particular, the latest version has a self-test which *should*
> catch issues like this.
The self-test does catch the issue. I just have not been
running it in my application (nor in its inexistent test
suite).
> We should probably make it operate on uint32_t rather than bytes,
> since the data in question is used as arrays of uint32_t elsewhere?
That might work indeed. But the loop on bytes is not as bad
as it looks with a recent compiler. It will end up compiled
unrolled with SIMD instructions:
0x0000000000062e10 <+112>: movdqu (%r12),%xmm2
0x0000000000062e16 <+118>: movdqu (%rbx),%xmm0
0x0000000000062e1a <+122>: pxor %xmm2,%xmm0
0x0000000000062e1e <+126>: movups %xmm0,(%rbx)
0x0000000000062e21 <+129>: movdqu 0x10(%rbx),%xmm0
0x0000000000062e26 <+134>: movdqu 0x10(%r12),%xmm3
0x0000000000062e2d <+141>: pxor %xmm3,%xmm0
0x0000000000062e31 <+145>: movups %xmm0,0x10(%rbx)
0x0000000000062e35 <+149>: movdqu 0x20(%rbx),%xmm0
0x0000000000062e3a <+154>: movdqu 0x20(%r12),%xmm4
0x0000000000062e41 <+161>: pxor %xmm4,%xmm0
0x0000000000062e45 <+165>: movups %xmm0,0x20(%rbx)
0x0000000000062e49 <+169>: movdqu 0x30(%rbx),%xmm0
0x0000000000062e4e <+174>: movdqu 0x30(%r12),%xmm5
0x0000000000062e55 <+181>: pxor %xmm5,%xmm0
0x0000000000062e59 <+185>: movups %xmm0,0x30(%rbx)
One potentially simple improvement would be to have
the buffers 16-byte aligned (and make sure the compiler
understand that it is the case). Then you would have
movdqa load/stores.