I am thinking about different solutions for one problem. Assume we have K sorted linked lists that we are merging into one; all these lists together hold N elements.
The well-known solution is to use a priority queue, popping/pushing the first element of every list, and I can understand why it takes O(N log K) time.
But let's take a look at another approach. Suppose we have some MERGE_LISTS(LIST1, LIST2) procedure that merges two sorted lists in O(T1 + T2) time, where T1 and T2 stand for the sizes of LIST1 and LIST2.
What we do now generally means pairing these lists and merging them pair by pair (if the number is odd, the last list, for example, could be ignored in the first steps). This means we have to perform the following "tree" of merge operations, where N1, N2, N3, ... stand for the sizes of LIST1, LIST2, LIST3, ...:
O(N1 + N2) + O(N3 + N4) + O(N5 + N6) + ...
O(N1 + N2 + N3 + N4) + O(N5 + N6 + N7 + N8) + ...
O(N1 + N2 + N3 + N4 + .... + NK)
It looks obvious that there will be log(K) of these rows, each of them performing O(N) operations, so the time for the whole MERGE(LIST1, LIST2, ..., LISTK) operation would actually be O(N log K).
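A minimal sketch of this pairwise scheme, using plain Python lists in place of linked lists (the names `merge_two` and `merge_k` are my own; `merge_two` plays the role of MERGE_LISTS):

```python
def merge_two(a, b):
    """MERGE_LISTS: merge two sorted lists in O(len(a) + len(b))."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    out.extend(a[i:])   # at most one of these
    out.extend(b[j:])   # has anything left
    return out

def merge_k(lists):
    """Merge K sorted lists by pairwise rounds: log K rounds of O(N) work."""
    if not lists:
        return []
    while len(lists) > 1:
        nxt = [merge_two(lists[i], lists[i + 1])
               for i in range(0, len(lists) - 1, 2)]
        if len(lists) % 2:        # odd list out joins the next round
            nxt.append(lists[-1])
        lists = nxt
    return lists[0]
```

Each round halves the number of lists while touching every element once, which is exactly the O(N log K) argument above.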
My friend told me (two days ago) it would take O(K*N) time. So the question is: did I mess up somewhere, or is he actually wrong about this? And if I am right, why can't this divide-and-conquer approach be used instead of the priority queue approach?
If you have a small number of lists to merge, this pairwise scheme is likely to be faster than a priority queue method, because you do extremely little work per merge: basically just one comparison and two pointer reassignments per item (to splice it into the new singly-linked list). As you've shown, it is O(N log K) (log K rounds handling N items each).
But the best priority queue algorithms are, I believe, O(sqrt(log K)) or O(log log U) for insert and remove (where U is the number of possible distinct priorities), provided you can prioritize by a value instead of having to use a comparison. So if the items you are merging can be given, e.g., integer priorities, and K is large, then you're better off with a priority queue.
From your description, it does sound like your process is indeed O(N log K). It also will work, so you can use it.
I personally would use the first version, with a priority queue, since I suspect it will be faster. It's not faster in the coarse big-O sense, but if you actually work out the number of comparisons and stores performed by each, I think the second version will take several times more work.
This is O(2*log(K)*N), which is O(N*log(K)), and you can't have worse complexity, because you add to the priority queue only 2N times, each addition taking O(log(K)).
Or you can push all elements into a vector in O(2N) and sort it in O(2N*log(2N)). Then you have O(2N + 2N*log(2N)), which is O(N*log(N)), i.e. exactly your bound in the case K = N.
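This concatenate-and-sort fallback is a one-liner in Python (the name `merge_k_sort` is my own). Note that CPython's sort is Timsort, which detects the K presorted runs, so in practice it often does close to N log K comparisons rather than the worst-case N log N:

```python
def merge_k_sort(lists):
    """Concatenate everything in O(N), then sort in O(N log N) worst case."""
    out = [x for lst in lists for x in lst]
    out.sort()   # Timsort: exploits the existing sorted runs
    return out
```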
It does indeed run in O(N*log K), but don't forget that O(N*log K) is a subset of O(N*K). I.e. your friend is not wrong either.