开发者

C++ STL list vs set

开发者 https://www.devze.com 2022-12-20 15:19 出处:网络
Which of those two is faster for random insertions and deletions? My guess is list. Though having the values as the keys as in the case of sets is attractive too.

Which of those two is faster for random insertions and deletions?

My guess is list.

Though having the values as the keys as in the case of sets is attractive too.

Is performance similar for iterating over开发者_运维知识库 the whole container?


List

  1. Searching (linear time).
  2. Inserting, deleting, moving (takes constant time).
  3. Elements may be ordered.
  4. Elements may be sorted.
  5. Elements may be duplicate.

Set

  1. Searching (logarithmic in size).
  2. Insert and delete (logarithimic in general).
  3. Elements are un-ordered.
  4. Elements are always sorted from lower to higher.
  5. Elements are unique.


std::list is O(1) for inserts and deletions. But you may well need O(n) to find the insertion or deletion point. std::set is O(log(n)) for inserts and deletions, it is usually implemented as a red-black tree.

Consider the effort to find the insert/delete point to make your choice.


First think about semantics, then about performance.

If you have a set of integers, and you insert to it the integers 6, 8, 13, 8, 20, 6 and 50, you'll end up with a set containing the following five elements: { 6, 8, 13, 20, 50 }.

If you do that with a list, you'll end up with a list containing the following seven elements: { 6, 8, 13, 8, 20, 6, 50 }.

So, what do you want? It doesn't make sense to compare the speed of containers with such different semantics.


In an std::list, the insertion and deletion itself take a time in O(1), which means very fast, and above all means at a speed that does not depend on the number of elements in the list.

In an std::set, the insertion and deletion take time in O(log(N)), which means a bit slower if a lot of elements are contained in the set. The N in the expression O(log(N)) means the number of elements. Grosso modo, it means that the time taken by the operation is kind of proportional to the logarithm (the base does not matter here, since it is equivalent to multiplying by a constant, which is ignored in theoretical algorithm analysis) of the number of elements in the set.

But it is vital to take into account the time taken for finding the element to remove. If you must search the container for the element to remove, which is most probably the case, then the std::list will take quite a long time for this search, which will be in O(N) (which means not fast, because the time is directly proportional to the number of elements, not the logarithm of it), while the std::set will take a time in O(log N) for the search.

Also note that those theoretical analyses become absolutely invalid for containers with very few elements, in which case the multiplying constants they hide become more important than the time function family it focuses on.

To make it short: std::list => Slower for searching the element to delete; faster to delete it. std::set => Faster for searching the element to delete; less fast to delete it.

But for the whole operation, and for big number of elements, the std::set is better.

You should also consider using hash tables. Good implementions of those are available in Boost, Qt, or C++0x. They do all these operations in time tending towards O(1) (which mean very very fast).


You should measure performance yourself with realistic usage on realistic data. Check both typical and worst case performance.

Although std::vector has O(N) time complexity for random insertion, std::set O(log(N)) and std::list O(1), std::vector performs best in many cases. Only if performance is not important enough to spend time measuring, go by the big-O complexity.

"If you're not measuring you're not engineering" (Rico Mariani)


If you care about speed then you should probably use std::vector. std::list performs one heap allocation every time an element is inserted and that's usually a bottleneck.

An exception is when individual items are very expensive to copy, or when you have an awful lot of them. In those cases a list is likely to perform better as it doesn't have to move items around when it's resized. A std::deque is also a good option here, but you'll need to profile your application in order to decide between the two.

Finally, use std::set only if you need to keep your items sorted (or if you don't want repeated items). Otherwise it will be significantly slower than a list or vector.

0

精彩评论

暂无评论...
验证码 换一张
取 消