开发者

Anybody written a dictionary (hashmap) in ANSI C?

开发者 https://www.devze.com 2023-01-22 15:47 出处:网络
I just wondered if someone could give me some pointers (no pun intended) how to do this? I want to set aside 4GB of ram in order to map numbers to memory which saves me traversing a linked list check

I just wondered if someone could give me some pointers (no pun intended) how to do this?

I want to set aside 4GB of ram in order to map numbers to memory which saves me traversing a linked list checking if they are there.

So instead of having (1,2,3,4,8,34,543,2343) and trave开发者_StackOverflowrsing 8 elements to verify that '2343' is in the list, i want to be able to look up the key '2343' in O(1) time?

Thanks in advance


If you only need to check if the number exists in the list, the you can try to make a Bitmap.

If the numbers are going to be sparsely spread out over a large range like 100,000 values in the range 0-4billion then a Hashtable would be faster. For a C implementation of a Hashtable take a look at GLib's Hashtable.

A Bitmap could hold numbers 0-4,294,967,295 using only 512Mbytes of ram.

#include <stdlib.h>
#include <stdint.h>
#include <stdbool.h>
#include <assert.h>

#define BITMAP_TEST 1

#define BITMAP_32_WORD 1

typedef struct Bitmap Bitmap;
#if BITMAP_32_WORD
#define BITWORD_BITS_SHIFT 5
typedef uint32_t Bitword;
#else
#define BITWORD_BITS_SHIFT 6
typedef uint64_t Bitword;
#endif
#define BITWORD_BITS (sizeof(Bitword) * 8)
#define BITWORD_BITS_MASK (BITWORD_BITS - 1)
#define BITWORD_MULT(bit)  ((bit + (BITWORD_BITS_MASK)) & ~(BITWORD_BITS_MASK))
#define BITWORD_TEST(bword, bit) ((bword >> bit) & 1)

#define BITMAP_WORD_COUNT(bit) (BITWORD_MULT(bit) >> BITWORD_BITS_SHIFT)


struct Bitmap {
    size_t  length;
    Bitword *bitmap;
};

extern Bitmap *bitmap_new(size_t len) {
    Bitmap *bitmap = malloc(sizeof(Bitmap));
    bitmap->length = len;
    bitmap->bitmap = calloc(BITMAP_WORD_COUNT(len),sizeof(Bitword));
    return bitmap;
}

extern void bitmap_free(Bitmap *bitmap) {
    free(bitmap->bitmap);
    free(bitmap);
}

extern void bitmap_set(Bitmap *bitmap, size_t bit) {
    assert(bit < bitmap->length);
    bitmap->bitmap[(bit >> BITWORD_BITS_SHIFT)] |= ((Bitword)1 << (bit & BITWORD_BITS_MASK));
}

extern void bitmap_unset(Bitmap *bitmap, size_t bit) {
    assert(bit < bitmap->length);
    bitmap->bitmap[(bit >> BITWORD_BITS_SHIFT)] &= ~((Bitword)1 << (bit & BITWORD_BITS_MASK));
}

extern bool bitmap_test(Bitmap *bitmap, size_t bit) {
    assert(bit < bitmap->length);
    Bitword bword = bitmap->bitmap[(bit >> BITWORD_BITS_SHIFT)];
    return BITWORD_TEST(bword, (bit & BITWORD_BITS_MASK));
}

#ifdef BITMAP_TEST
#include <stdio.h>

#define MAX_VALUE (2343 + 1)
static const uint32_t test_values[] = { 1,2,3,4,8,34,543,2343 };
#define test_values_len (sizeof(test_values)/sizeof(uint32_t))

static void set_values(Bitmap *bitmap, const uint32_t *values, int len) {
    int i;
    for(i=0; i < len; i++) {
        bitmap_set(bitmap, values[i]);
    }
}

static void unset_values(Bitmap *bitmap, const uint32_t *values, int len) {
    int i;
    for(i=0; i < len; i++) {
        bitmap_unset(bitmap, values[i]);
    }
}

static void check_values(Bitmap *bitmap, const uint32_t *values, int len, bool is_set) {
    int i;
    for(i=0; i < len; i++) {
        assert(bitmap_test(bitmap, values[i]) == is_set);
    }
}

int main(int argc, char *argv[]) {
    Bitmap *bitmap = bitmap_new(MAX_VALUE);

    set_values(bitmap, test_values, test_values_len);

    check_values(bitmap, test_values, test_values_len, true);

    unset_values(bitmap, test_values, test_values_len);

    check_values(bitmap, test_values, test_values_len, false);

    bitmap_free(bitmap);
    return 0;
}

#endif


If the numbers are 32 bits you don't even need hashing, just use an array.


I would advise embedding Lua in your project. Easy to embed and completely ANSI C with one very flexible garbage collected data structure (a Lua table/aka hashmap). You can always strip out the bits that you don't need, but even if you don't Lua is tiny.

Lua has a stack based API which isn't too hard to follow:

  lua_State *L = luaL_newstate();  // make a new lua state
  lua_newtable(L);  // pushes a new table to the top of the stack (position 1)

  // storing values
  lua_pushinteger(2343); // key: 2343
  lua_pushboolean(1);    // value: true
  lua_settable(L, 1);   // pop key/value, store in table at position 1

  // retrieving values
  lua_pushinteger(2343); // key we're looking for
  lua_gettable(L, 1);   // get from table at top of stack - 2; pops key
  if (lua_toboolean(L, -1))  // is it a true value?
  {
    // executes; we know 2343 is true as we pushed it just above
  }
  lua_pop(L, 1);  // pop it off the stack; only our table remains

And you can iterate over the values as well, possibly doing away with the need of your linked list (but the order of the iteration is non-determinate). Full manual here.


A hashtable is actually only O(1) when there are no keys that have the same hash.

For an easy short version of a hashtable in C look here: http://pokristensson.com/strmap.html

0

精彩评论

暂无评论...
验证码 换一张
取 消