开发者

Cryptographically secure additive hash function

开发者 https://www.devze.com 2023-01-21 17:58 出处:网络
I am working on a Fountain Code based file transfer system. In this system blocks of data are downloaded, combined with an xor function. I want to verify the blocks as they arrive.

I am working on a Fountain Code based file transfer system. In this system blocks of data are downloaded, combined with an xor function. I want to verify the blocks as they arrive.

开发者_如何学运维What I need is a cryptographically secure hash function which has the property:

Hash(A) ^ Hash(B) == Hash(A ^ B)

does such a thing exist?

Note: The data blocks must be combined with an xor function, the hashes can be combined with any function you like, so long as it's reasonably cheap to compute.


If the identity you are requesting is exactly

Hash(A) ^ Hash(B) == Hash(A ^ B)

then no, no such cryptographically secure hash function is possible. This is because your function would then be a linear map (over the field with two elements) from the space of possible blocks to the space of possible hashes.

What does this mean, in simpler terms?

Well, suppose your map takes blocks of length 6 and returns hashes of length 3, and that these are some of the hashes:

Hash(000001) = 010
Hash(000010) = 111
Hash(000100) = 001
Hash(001000) = 101
Hash(010000) = 110
Hash(100000) = 001

Then you can compute the hash of any given block by linear combinations of the above. For example,

Hash(101000) = Hash(100000) ^ Hash(001000) = 001 ^ 101 = 100.

This means that your hash function can be represented by a 6-by-3 matrix.

What does this imply?

Wikipedia defines the ideal cryptographic hash function as having four main or significant properties:

  • it is easy to compute the hash value for any given message,
  • it is infeasible to find a message that has a given hash,
  • it is infeasible to modify a message without changing its hash,
  • it is infeasible to find two different messages with the same hash.

The first property may be true, of course, but the rest will not be. Inverting the hash function is as simple as solving a system of linear equations, which is easy. I assume you've done this for linear maps over the real numbers, but the exact same approach works here.

If you find an element of the kernel of the hash function, that is, a message K such that Hash(K) is all zeros, then the last property fails also. Take any message M; then M and M^K will have the same hash, because Hash(M^K) = Hash(M)^Hash(K) = Hash(M)^0 = Hash(M). Finding elements of the kernel is easy, also.

The third property is a little more difficult, but can be broken also. (For example, suppose you're hashing a legal contract. Find a few places where some stray commas can be modified or something. Consider the effect that these changes have on the hash function, and then solve a system of linear equations.)


What you want is called a Homomorphic Hash. I'm not up with the latest developments, but the one I saw described is extremely - almost unfeasibly - slow to compute. The original paper is here, and a followup with some refinements to its use is here.

As for combining blocks, the hash typically requires you use addition in a prime field. If you're using fountain codes, you don't have to use xor, though - any reversible function is fine, and that includes addition. The hash described above works on addition and multiplication in a prime field, and is provably secure.

0

精彩评论

暂无评论...
验证码 换一张
取 消