[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Java implementation, couple questions and a possible problem with smix.



Hello,

I have written a compliant java implementation of scrypt you can see here:
http://svn.xwiki.org/svnroot/xwiki/contrib/sandbox/xwiki-crypto/src/main/java/org/xwiki/crypto/passwd/internal/scrypt/Scrypt.java
With tests here:
http://svn.xwiki.org/svnroot/xwiki/contrib/sandbox/xwiki-crypto/src/test/java/org/xwiki/crypto/passwd/scrypt/ScryptTest.java
At this stage no attempt has been made to improve performance, most arrays are byte[].


I found what appears to be a flaw in smix.
It seems that an adversary can substitute processor power for memory cost by replacing smix with the following:

(Floor() returns the largest whole integer less than it's input.)

1:  X <-- B
2:  for i = 0 to N - 1 do
3:    if i Mod 2 == 0
4:      V_(i/2) <-- X
5:    end if
6:    X <-- H(X)
7:  end for
8:  for i = 0 to N - 1 do
9:    j <-- Integerify(X) Mod N
11:   k <-- Floor(j / 2)
10:   if j Mod 2 != 0
12:     X <-- H(X ^ H(V_k))
13:   else
14:     X <-- H(X ^ V_k)
15:   end if
16: end for
16: B' <-- X

In this implementation the largest value of V ever referenced is half of N so memory requirements would
be approximately half. In the default smix implementation, hash H is called 2 * N times.
Assuming that the result of Integerify is an odd number about half of the time, H would be
called an additional N / 2 times. Obviously the memory requirement could be divided by any positive integer,
not just in half as with this pseudo code.

Assuming all processor expense is incurred in the H hash function, we have halved the memory expense at a
processor cost of 1/6th the total cost.

Did I make a mistake here?
If this is a real problem I have a few Ideas on how to solve it and would be happy to work together on it.


A few additional questions:

My understanding is BlockMix is intended only for consuming processor time, why was it written separately
rather than choosing a well known function such as a number of iterations of SHA-512 where it is known that
iterations cannot be skipped?

The reference implementation of Integerify is confusing. The paper makes reference to reading the last machine
length word but the implementation reads 4 bytes from (2r - 1) * 64 which is index 960 (out of 1024) when r = 8.
What was the motivation for this design?

Finally I am curious to know what the point is in interleaving the output from BlockMix. As far as i can tell,
BlockMix is intended only to consume processor power and operates on too small an array of data to constitute
any real memory consumption.


Thanks,

Caleb