开发者

Best STL data structure to find unordered elements

开发者 https://www.devze.com 2023-01-26 16:41 出处:网络
I\'m currently trying to implement a hash table in C++ for a homework... I\'ve chosen to use internal linking as a solution for collisions in the table...

I'm currently trying to implement a hash table in C++ for a homework...

I've chosen to use internal linking as a solution for collisions in the table...

and I'm looking for a good STL container that will find a specific entry in开发者_高级运维 an unordered set of data.

I can't use an stl container that is based on trees (set, map, trees, etc...)

Right now I'm using a vector, is it a good choice? The search time will be linear, right? Can it be better?


As you're saying I assume the buckets can get big..., it's better to use std::list. Searching is linear in both cases, but adding elements is constant in std::list.

I guess they're all the same, since data isn't ordered - No, they are not. If they were, there would be just one container. Each container has it's own advantages and disadvantages, different containers are used for different situations.

A little information about vector:

  • std::vector has capacity, that's why it has capacity() and size() methods. They're both different. So, suppose the capacity is 4 and you have 2 elements, then size will be 2. So, adding another element will increment the size (will be 3) and it's all very fast.

  • But what happens when you have to add 5+ elements and the capacity is 4? Completely new memory is allocated, all old elements are copied in the new memory, all old elements are destroyed (their destructors are called, if user-defined types). Then the old memory has to be freed. These are expensive operations if you think that adding/removing elements will be more often.
    You can avoid this, using std::vector::reserve method to reserve some memory in advance and not reallocate new memory all the time and copy everything over and over again. But this is useful when you know the approximate size of these vectors. I suppose you don't in your situation( reserving much memory is't a good solution, too - you should not waste memory just like that ) So, again, I'd prefer std::list.

Or double hash.

Anyway, this allocating of new memory and copying of objects will not happen that often, as std::vector is "clever" and when allocate new space, it doesn't increase the capacity with only 1 element or something. I think it doubles it, but I'm not that sure about that. Argh, I don't know how exactly this is called in English.. Probably something like "accumulative time/memory" or "accumulative complexity" :? Don't know :/

NOTE: Whatever you choose, I'd suggest you to pay your attention at the hash-function. It's the most important here. A hash container should NOT have too many elements with the same hash. So, my advice is to search for a good hash-function and then this will not matter that much.

Hope that helped (:


EDIT: I'd recommend you this article - comparing std::vector and std::deque - it's perfect - compares memory usage (allocating, deallocating, growing), CPU usage, etc. I'd recommend the whole site for such articles - there aren't many, but are really well written.


std::tr1::unordered_set could be what you need.

0

精彩评论

暂无评论...
验证码 换一张
取 消