Editing a 10gb file using limited main memory in C/C++

I need to sort a 10 GB file containing a list of numbers as fast as possible using only 100 MB of memory. I'm breaking the file into chunks, sorting each chunk, and then merging them.

I am currently using C FILE pointers because they are faster than C++ file I/O (at least on my system).

My code works fine for a 1 GB file, but it throws a segmentation fault on the first fscanf after opening the 10 GB file.

FILE *fin;
FILE *fout;
fin = fopen( filename, "r" );
while( 1 ) {
    // throws the error here
    for( i = 0; i < MAX && ( fscanf( fin, "%d", &temp ) != EOF ); i++ ) {
        v[i] = temp;
    }

What should I use instead?

Do you also have any suggestions on the best way to go about this?


There is a special class of algorithms for this called external sorting. There is a variant of merge sort that works as an external sorting algorithm (search for "external merge sort" or "merge sort tape").
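
As a rough illustration, here is a minimal sketch of the run-generation phase of an external merge sort in C. It assumes the input file holds whitespace-separated decimal integers; CHUNK_SIZE, the run-file naming scheme, and the file names are illustrative choices, not something from the question.

#include <stdio.h>
#include <stdlib.h>

/* Roughly 40 MB of ints per run, leaving headroom inside the 100 MB budget. */
#define CHUNK_SIZE (10 * 1024 * 1024)

static int cmp_int( const void *a, const void *b ) {
    int x = *(const int *)a, y = *(const int *)b;
    return ( x > y ) - ( x < y );
}

/* Phase 1: read up to CHUNK_SIZE numbers, sort them in memory, and write
 * each sorted batch to its own temporary "run" file. Returns the run count. */
static int make_runs( const char *input ) {
    FILE *fin = fopen( input, "r" );
    if ( fin == NULL ) { perror( "fopen" ); exit( 1 ); }

    int *buf = malloc( CHUNK_SIZE * sizeof *buf );
    if ( buf == NULL ) { perror( "malloc" ); exit( 1 ); }

    int nruns = 0;
    for ( ;; ) {
        long n = 0;
        while ( n < CHUNK_SIZE && fscanf( fin, "%d", &buf[n] ) == 1 )
            n++;
        if ( n == 0 )
            break;

        qsort( buf, n, sizeof *buf, cmp_int );

        char name[64];
        snprintf( name, sizeof name, "run%d.tmp", nruns++ );
        FILE *fout = fopen( name, "w" );
        if ( fout == NULL ) { perror( "fopen" ); exit( 1 ); }
        for ( long i = 0; i < n; i++ )
            fprintf( fout, "%d\n", buf[i] );
        fclose( fout );
    }

    free( buf );
    fclose( fin );
    return nruns;
}

/* Phase 2 (not shown): open all run files at once and do a k-way merge,
 * e.g. with a small min-heap keyed on each run's current value, writing
 * the smallest value to the final output on every step. */

The merge phase then reads one number at a time from each run, so memory use is bounded by the number of runs rather than the size of the file.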

But if you're on Unix, it's probably easier to run the sort command in a separate process.
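
For example, GNU sort already implements an external merge sort internally and spills to temporary files when its buffer is capped. A minimal sketch of delegating to it from C, assuming input.txt and sorted.txt as placeholder file names (-S to limit the buffer size is a GNU extension):

#include <stdio.h>
#include <stdlib.h>

int main( void ) {
    /* -n sorts numerically, -S caps sort's in-memory buffer, and -o names
     * the output file; sort handles the chunking and merging itself. */
    int rc = system( "sort -n -S 100M -o sorted.txt input.txt" );
    if ( rc != 0 ) {
        fprintf( stderr, "sort failed (system returned %d)\n", rc );
        return 1;
    }
    return 0;
}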

BTW, opening files bigger than 2 GB requires large-file support when off_t is only 32 bits. Depending on your operating system and C library, you need to define a macro or call different file-handling functions. If fopen fails for the 10 GB file it returns NULL, and calling fscanf on a NULL stream would explain the segmentation fault you're seeing.
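
As a sketch of the macro route on glibc-based systems: defining _FILE_OFFSET_BITS to 64 before any header is included (or compiling with -D_FILE_OFFSET_BITS=64) makes fopen and related calls use a 64-bit off_t; the file name below is a placeholder.

/* Must come before the first system header (or use -D_FILE_OFFSET_BITS=64). */
#define _FILE_OFFSET_BITS 64

#include <stdio.h>

int main( void ) {
    FILE *fin = fopen( "huge.txt", "r" );   /* placeholder file name */
    if ( fin == NULL ) {
        perror( "fopen" );   /* without large-file support this can fail, often with EOVERFLOW */
        return 1;
    }
    /* Use fseeko/ftello rather than fseek/ftell so offsets are 64-bit too. */
    fclose( fin );
    return 0;
}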
