开发者

Fastest data structure for inserting/sorting

开发者 https://www.devze.com 2023-01-14 02:50 出处:网络
I need a data structure that can insert elements and sort itself as quickly as possible. I will be inserting a lot more than sorting. Deleting is not much of a concern and nethier is space. My specifi

I need a data structure that can insert elements and sort itself as quickly as possible. I will be inserting a lot more than sorting. Deleting is not much of a concern and nethier is space. My specific implementation will additionally store no开发者_运维百科des in an array, so lookup will be O(1), i.e. you don't have to worry about it.


If you're inserting a lot more than sorting, then it may be best to use an unsorted list/vector, and quicksort it when you need it sorted. This keeps inserts very fast. The one1 drawback is that sorting is a comparatively lengthy operation, since it's not amortized over the many inserts. If you depend on relatively constant time, this can be bad.

1 Come to think of it, there's a second drawback. If you underestimate your sort frequency, this could quickly end up being overall slower than a tree or a sorted list. If you sort after every insert, for instance, then the insert+quicksort cycle would be a bad idea.


Just use one of the self-balanced binary search trees, such as red-black tree.


Use any of the Balanced binary trees like AVL trees. It should give O(lg N) time complexity for both of the operations you are looking for.


If you don't need random access into the array, you could use a Heap.

Worst and average time complexity:

  • O(log N) insertion
  • O(1) read largest value
  • O(log N) to remove the largest value

Can be reconfigured to give smallest value instead of largest. By repeatedly removing the largest/smallest value you get a sorted list in O(N log N).


If you can do a lot of inserts before each sort then obviously you should just append the items and sort no sooner than you need to. My favorite is merge sort. That is O(N*Log(N)), is well behaved, and has a minimum of storage manipulation (new, malloc, tree balancing, etc.)

HOWEVER, if the values in the collection are integers and reasonably dense, you can use an O(N) sort, where you just use each value as an index into a big-enough array, and set a boolean TRUE at that index. Then you just scan the whole array and collect the indices that are TRUE.

You say you're storing items in an array where lookup is O(1). Unless you're using a hash table, that suggests your items may be dense integers, so I'm not sure if you even have a problem.

Regardless, memory allocating/deleting is expensive, and you should avoid it by pre-allocating or pooling if you can.


I had some good experience for that kind of task using a Skip List

At least in my case it was about 5 times faster compared to adding everything to a list first and then running a sort over it at the end.

0

精彩评论

暂无评论...
验证码 换一张
取 消