开发者

Sort at most 10 million 7-digit numbers. constraints: 1M RAM, high speed. several secs is good

开发者 https://www.devze.com 2023-02-12 01:24 出处:网络
Sort at most 10 million 7-digit numbers. constraints: 1M RAM, high speed. several secs is good. [Edit: from comment by questioner: the input values are distinct]

Sort at most 10 million 7-digit numbers. constraints: 1M RAM, high speed. several secs is good.

[Edit: from comment by questioner: the input values are distinct]

Using Bitmap Data Structure is a good solution for this problem.

That mea开发者_StackOverflow社区ns I need a string, which length is at most 10 Million???? So is the RAM enough for this? confused here. Thank you


So, there are ~8,000,000 bits in 1MB but if you have arbitrary 7 digit numbers (up to 9,999,999) using a bit vector to do the sort won't work. Similarly, it won't work if some numbers can be repeated because you can only store {0,1} in a bit vector.

But assuming, (what I think your problem is asking) that you have a sequence of integers between 0 and 8,000,000 with no duplicates, you can simply allocate a zeroed array of 8,000,000 bits and then for each number, tick off the corresponding bit in the array. Then outputting the sorted list is simply reading through that array in sequence and outputting the index for each 1 value.

If you are asking the more complex version of the question (0 - 10 million, repeats allowed), then you will need to to sort chunks that fit in ram, store them on disk, and then you can merge these chunks in linear time and streaming (so you don't ever have to store the whole thing in memory). Here is an implementation of a very similar thing in python: http://neopythonic.blogspot.com/2008/10/sorting-million-32-bit-integers-in-2mb.html


Start with a bit array representing the lowest 8 million possible values. Read through the input and set a bit for every value within range. Then output all the numbers for the turned-on bits in sequence. Next Clear the first 2 million bits of the array so that it can represent the highest 2 million possible values. Read through the input and set a bit for every value in the new range. Output all the values in this range. Done.

0

精彩评论

暂无评论...
验证码 换一张
取 消