While reading about the pigeonhole principle on Wikipedia, I came across this: "collisions are inevitable in a hash table because the number of possible keys exceeds the number of indices in the array. No hashing algorithm, no matter how clever, can avoid these collisions". But isn't gperf doing exactly that, i.e. avoiding collisions?
Please enlighten me.
gperf creates the hash function and the hash table from a predefined list of strings. Because it knows every key up front, it can choose the function and the table size so that each key maps to a distinct slot, i.e. a perfect hash. That is only possible when you know every possible key in advance, an assumption that does not hold for the description in the Wikipedia entry, which apparently refers to a general hash table whose key set is not fixed.
From gperf's website: "For a given list of strings, it produces a hash function and hash table,..." - in other words, it must know all the strings beforehand in order to construct a function that works without collisions on exactly that set.
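To make this concrete, here is a heavily simplified sketch in Python of the idea behind a perfect hash over a fixed key set (not gperf's actual algorithm or output - gperf generates C and uses associated character values; the key list and the "first character" trick below are assumptions chosen for illustration). Since the keys are known in advance, we can pick a feature that already distinguishes them, here the first character, and assign each one its own slot:

```python
KEYS = ["if", "else", "while", "for", "return"]

# The key set is fixed, and every key happens to start with a
# different character, so a lookup table keyed on the first character
# acts as a collision-free ("perfect") hash for this set.
first_chars = [k[0] for k in KEYS]
assert len(set(first_chars)) == len(KEYS)  # the feature separates all keys

slot = {c: i for i, c in enumerate(sorted(set(first_chars)))}

def perfect_hash(key):
    # Maps each of the known KEYS to a distinct index in 0..len(KEYS)-1.
    return slot[key[0]]

table = [None] * len(KEYS)
for k in KEYS:
    table[perfect_hash(k)] = k

print(table)  # every key lands in its own slot, no collisions
```

Note that this function is only collision-free on `KEYS`; feed it an unknown key such as "int" and it would collide with "if", which is exactly why the guarantee depends on knowing the full key list up front.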
The usual hash functions in general-purpose programming languages must accept arbitrary strings one after another (the full list is never given at once), so by the pigeonhole principle they can produce collisions.
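The pigeonhole argument itself is easy to demonstrate: hash more keys than there are slots, and some bucket must receive at least two of them, no matter which hash function is used (the key names and slot count below are arbitrary choices for the demo):

```python
from collections import defaultdict

def bucket(keys, hash_fn, num_slots):
    # Group keys by the slot their hash maps them to.
    buckets = defaultdict(list)
    for k in keys:
        buckets[hash_fn(k) % num_slots].append(k)
    return buckets

keys = [f"key{i}" for i in range(9)]  # 9 keys...
buckets = bucket(keys, hash, 8)       # ...into 8 slots

# By the pigeonhole principle, at least one bucket holds >= 2 keys,
# regardless of how Python's built-in hash() scatters them.
print(max(len(b) for b in buckets.values()) >= 2)  # → True
```

Swapping in any other hash function for `hash` cannot change the outcome; only shrinking the key universe to fit the table (gperf's situation) can.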