开发者

Hashed Array Tips in C Language

开发者 https://www.devze.com 2023-02-28 05:25 出处:网络
I need some ideas to develop a good hashing function for my assignment. I have a list of all the countries in the world (around 190) in total. The names of each country is开发者_如何学运维 the key for

I need some ideas to develop a good hashing function for my assignment. I have a list of all the countries in the world (around 190) in total. The names of each country is开发者_如何学运维 the key for the hashing function. Is there a specific kind of hashing function anyone would recommend to store this data in a hashing function without many collisions? Also, can you perhaps give an example of how to implement it?


Use GNU gperf. For inputs like yours, it will generate C code for you which implements a perfect hash function (for the given inputs). No collisions, no worries.


You can use generated perfect hash for that (GNU perf).

Of if the set of strings is dynamic then you can use ternary trie. For N unique strings it will give you unique number [1..N]. For your case it will be faster than with hash tables. Here is my implementation of such thing: http://code.google.com/p/tiscript/source/browse/trunk/tool/tl_ternary_tree.h


The simplest approach I can think of is for each country's name to compute the sum of the ASCII values in its representation and use this as the hash value:

int hash(const char *s)
{
    int h = 0;

    while (s && *s)
          h += *s++;

    return h;
}

If your hash map has size N, you store country names with map[hash(my_country) % N] = my_country. Conceptually.

Just try this approach and see whether the resulting hash values are sufficiently uniformly distributed. Note that the quality of the distribution may also depend on N.

0

精彩评论

暂无评论...
验证码 换一张
取 消