开发者

How to create an efficient static hash table?

开发者 https://www.devze.com 2023-03-12 16:20 出处:网络
I need to create small-mid sized static hash tables from it. Typically, those will have 5-100 entries. When the hash table is created, all keys hashes are known up-front (i.e. the keys are already has

I need to create small-mid sized static hash tables from it. Typically, those will have 5-100 entries. When the hash table is created, all keys hashes are known up-front (i.e. the keys are already hashes.) Currently, I create a HashMap, which is I sort the keys so I get O(log n) lookup which 3-5 lookups on average for the sizes I care. Wikipedia claims that a simpl开发者_如何学Goe hash table with chaining will result in 3 lookups on average for a full table, so that's not yet worth the trouble for me (i.e. taking hash%n as the first entry and doing the chaining.) Given that I know all hashes up-front, it seems to be that there should be an easy way to get a fast, static perfect hash -- but I couldn't find a good pointer how. I.e. amortized O(1) access with no (little?) additional overhead. How should I implement such a static table?

Memory usage is important, so the less I need to store, the better.

Edit: Notice that it's fine if I have have to resolve one collision or so manually. I.e. if I could do some chaining which on average has direct access and worst-case 3 indirections for instance, that's fine. It's not that I need a perfect hash.


For c or c++ you can use gperf

GNU gperf is a perfect hash function generator. For a given list of strings, it produces a hash function and hash table, in form of C or C++ code, for looking up a value depending on the input string. The hash function is perfect, which means that the hash table has no collisions, and the hash table lookup needs a single string comparison only.

GNU gperf is highly customizable. There are options for generating C or C++ code, for emitting switch statements or nested ifs instead of a hash table, and for tuning the algorithm employed by gperf.


Small hashes are also possible in C without an external lib using the pre-processor, for example:

swich (hash_string(*p))
{
case HASH_S16("test"):
    ...
    break;
case HASH_S256("An example with a long text!!!!!!!!!!!!!!!!!"):
    ...
    break;
}

Have a look for the code @ http://www.heeden.nl/statichashc.htm


You can use Sux4j to generate a minimal perfect hash in Java or C++. (I'm not sure you are using Java, but you mentioned HashMap, so I'm assuming.) For C, you can use the cmph library.

0

精彩评论

暂无评论...
验证码 换一张
取 消