开发者

external sort with k-way merging vs quick sort

开发者 https://www.devze.com 2023-01-16 17:02 出处:网络
Which one is better? Say 1GB memory and 100GB file to sort. One instance of 10-way merging needs: - 100 1GB loads followed by 10*10 + 10*100 100MB loads (for 10-way followed by 10-way merging)

Which one is better? Say 1GB memory and 100GB file to sort.

One instance of 10-way merging needs: - 100 1GB loads followed by 10*10 + 10*100 100MB loads (for 10-way followed by 10-way merging)

Quicksort needs 100开发者_运维问答*7*2 (nlogn) 1GB loads?


Merge sort is more IO efficient when processing large data.

The reason is because quick sort is a Top-bottom approach, which means you have to process the 100GB first, and than process 50GB * 2 ... it is impossible to fit whole data into memory when you have large data.

in other way, merge sort is a bottom-up approach , as you described, you can separate data into small batch which can fit into memory, and merge them in buffer.


The main bottleneck will actually be reading from and writing to the hard drive. We read each element from the hard drive twice and write each element from the hard drive twice. Once each for sorting the chunks and then once each again for the multi-way merge.

In contrast, quicksort will read/write each element to the harddrive on average O(log n) times.

0

精彩评论

暂无评论...
验证码 换一张
取 消