Re: identical input, different output?

On Fri, Dec 05, 2014 at 03:25:33PM -0800, Robert Ransom wrote:
> I believe that *every* cryptographic function needs a run-time self
> test, and that the self-test code and data must be in a separately
> compiled source file to defend against moderately broken/malicious
> compilers.

Oh.  It's never clear how/where to limit one's paranoia, but adding a
runtime self-test to every invocation of every crypto primitive,
including fast ones, would (roughly) halve their performance.  Many
users would prefer an alternative that lacks the self-test then.  Run
the self-test on the first pass through the code only?  Possible, but
adds thread-safety bugs or concerns (are there bugs? what memory
coherence model is assumed?) or overhead and complexity (mutexes).
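
To make the "first pass only" option concrete, here is a minimal C11
sketch; it is not code from scrypt or any real library.  toy_hash
(32-bit FNV-1a) is a hypothetical stand-in for a real primitive, and
toy_hash_checked and selftest_state are made-up names.  It caches the
self-test result in an atomic flag with acquire/release ordering rather
than taking a mutex.  The assumed trade-off is that threads racing on
the first call may each run the (idempotent) self-test once.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a real primitive: 32-bit FNV-1a (not crypto). */
static uint32_t toy_hash(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t h = 2166136261u;           /* FNV offset basis */
    while (len--) {
        h ^= *p++;
        h *= 16777619u;                 /* FNV prime */
    }
    return h;
}

enum { ST_UNKNOWN = 0, ST_PASSED = 1, ST_FAILED = 2 };
static atomic_int selftest_state = ST_UNKNOWN;

/* Known-answer test: FNV-1a of the one-byte string "a". */
static int toy_hash_selftest(void)
{
    return toy_hash("a", 1) == 0xe40c292cu;
}

/* Returns 0 and writes the hash on success, -1 if the self-test failed. */
int toy_hash_checked(const void *data, size_t len, uint32_t *out)
{
    int state = atomic_load_explicit(&selftest_state, memory_order_acquire);
    if (state == ST_UNKNOWN) {
        /* Racing threads may all get here; each computes the same result. */
        state = toy_hash_selftest() ? ST_PASSED : ST_FAILED;
        atomic_store_explicit(&selftest_state, state, memory_order_release);
    }
    if (state != ST_PASSED)
        return -1;
    *out = toy_hash(data, len);
    return 0;
}
```

Even this small version adds an atomic load and a branch to every call,
and it forces the question of which memory model the target provides,
which is the overhead and complexity mentioned above.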

Password hashing and KDFs are in the rather unique position that we can
actually afford a runtime self-test with good code coverage (but not so
good "memory coverage") on every invocation while incurring only
negligible overhead.
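
A rough illustration of why the overhead is negligible, as a toy C
sketch only: toy_mix (one xorshift32 round) and toy_kdf are hypothetical
names, and neither is a real KDF or cryptographic primitive.  The
known-answer check costs one call to the mixing step, while the
deliberately slow main loop costs "rounds" of them, so the relative
overhead is about 1/rounds.

```c
#include <stdint.h>

/* Toy mixing step (one xorshift32 round); NOT a cryptographic primitive. */
static uint32_t toy_mix(uint32_t x)
{
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return x;
}

/* Returns 0 and writes the result on success, -1 if the self-test fails. */
int toy_kdf(uint32_t seed, uint32_t rounds, uint32_t *out)
{
    /*
     * Known-answer test on every call.  The volatile input only discourages
     * constant folding; it does not replace a separately compiled test.
     */
    volatile uint32_t kat_in = 1u;
    if (toy_mix(kat_in) != 0x42021u)
        return -1;

    uint32_t x = seed | 1u;             /* avoid the all-zero fixed point */
    for (uint32_t i = 0; i < rounds; i++)
        x = toy_mix(x);                 /* the intentionally slow part */
    *out = x;
    return 0;
}
```

Note that this exercises the code path but touches almost none of the
memory a real memory-hard function would use, which is the "memory
coverage" caveat above.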

> The way to detect lack of SSE/SSE2 support is to use the CPUID
> instruction

Oh, of course I should have listed that as an option too.

To me, simply crashing on an SSE2 instruction in code built for SSE2
feels better.  The code might (or might not) crash before reaching our
check anyway: as soon as we enable e.g. "gcc -msse2", the compiler
itself may generate SSE2 instructions, including those with no MMX
counterparts.  This is why the full scrypt file encryption program did
crash for me on Pentium 3, but the scrypt KDF from it did not (when put
into another program).

When using intrinsics, CPUID is a safer bet against compiler
optimizations, but we'd have to use #ifdef's to choose the intrinsic
that the current compiler supports (and what if we know of no CPUID
intrinsic for the current compiler?).  If we resort to inline asm, we
may as well put a suitable SSE2 instruction in there (to trigger a
crash on pre-SSE2),
which is simpler (just one instruction), safer (no dependency on the
caller's return value check), and more consistent (the code might have
SSE2 instructions before that point, depending on compiler and other
parts of the program, so a belated CPUID check feels a bit silly).
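
For comparison, the two approaches might look roughly like this in C.
This is a sketch assuming GCC or Clang on x86, not code from scrypt.
have_sse2 and require_sse2_or_crash are hypothetical names.  The first
uses __get_cpuid from the compiler's <cpuid.h>; the second uses inline
asm with paddq, chosen on the assumption that an instruction introduced
with SSE2 in both its MMX and XMM forms should fault on a pre-SSE2 CPU
rather than being decoded as an older instruction.

```c
#include <stdlib.h>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>

/* Returns 1 if CPUID leaf 1 reports SSE2 (EDX bit 26), otherwise 0. */
int have_sse2(void)
{
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;                       /* leaf 1 not available */
    return (int)((edx >> 26) & 1u);
}

/*
 * Executes a single SSE2 instruction.  On a CPU without SSE2 this is
 * expected to raise an invalid-opcode fault (SIGILL on Unix-like systems).
 */
void require_sse2_or_crash(void)
{
    __asm__ __volatile__("paddq %%xmm0, %%xmm0" : : : "xmm0");
}

#else

/* Fallback for other targets, where SSE2 does not exist at all. */
int have_sse2(void) { return 0; }
void require_sse2_or_crash(void) { abort(); }

#endif
```

The CPUID variant only helps if every caller checks the return value
before any SSE2 code runs; the one-instruction variant needs no
cooperation from the caller.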