I've carefully read about rainbow tables and can't get one thing. In order to build a hash chain a reduction function is used. It's a function that somehow maps hashes onto passwords. This article says that reduction 开发者_如何学JAVAfunction isn't an inverse of hash, it's just some mapping.
I don't get it - what's the use of a mapping that isn't even an inverse of the hash function? How should such mapping practically work and aid in deducing a password?
A rainbow table is "just" a smart compression method for a big table of precomputed hashes. The idea is that the table can "invert" a hash output if and only if a corresponding input was considered during the table construction.
Each table line ("chain") is a sequence of hash function invocations. The trick is that each input is computed deterministically from the previous output in the chain, so that:
- by storing the starting and ending points of the chain, you "morally" store the complete chain, which you can rebuild at will (this is where a rainbow table can be viewed as a compression method);
- you can start the chain rebuilding from a hash function output.
The reduction function is the glue which turns a hash function output into an appropriate input (for instance a character string which looks like a genuine password, consisting only of printable characters). Its role is mostly to be able to generate possible hash inputs with more or less uniform probability, given random data to work with (and the hash output will be acceptably random). The reduction function needs not have any specific structure, in particular with regards to how the hash function itself works; the reduction function must just allow keeping on building the chain without creating too many spurious collisions.
The reason the reduction function isn't the inverse of a hash is that the true inverse of a hash would not be a function (remember, the actual definition of "function" requires one output for one input).
Hash functions produce strings which are shorter than their corresponding inputs. By the pigeonhole principle, this means that two inputs can have the same output. If arbitrarily long strings can be hashed, an infinite number of strings can have the same output, in fact. Yet a rainbow table generally only keeps one output for each hash - so it can't be a true inverse.
The reduction function most rainbow tables use is "store the shortest string having this hash".
It doesn't matter if what it produces is the password: what you would get would also work as a password, and you could log in with it just as well as with the original password.
精彩评论