Fast cross-platform algorithm for reading/writing files in C++

I want to pose a seemingly simple question that I can't find the answer to anywhere. Is there a FAST, modern algorithm for file input and/or output that can be compiled with all standard-compliant C++ compilers and works on all operating systems, without requiring external libraries?

  1. I have found that the fastest way is to use memory-mapped files, but that won't do, since we want the same piece of code working on all platforms
  2. We can't use APIs like the Win32 API, since that would make the code platform specific
  3. I don't want to use C; I want the algorithm to be pure C++ code with the STL where feasible, not some ugly C with intermixed asm hacks/tricks
  4. Frameworks or external libraries that don't belong to standard C++ shouldn't be used, like wxWidgets, Qt, MFC, etc.
  5. The big emphasis of this whole question is that the algorithm should be as FAST as possible, something along the lines of the speed of doing it with memory-mapped files; even faster would be great, but I know that is not possible

Have you ever seen something this crazy researched by anyone else? Is such an algorithm even possible?

Thanks for any recommendations.


With the following restrictions:

can be compiled with all standard-compliant C++ compilers and works on all operating systems, without requiring external libraries?

You've pretty much restricted yourself to the standard library file IO functions. Maybe POSIX functions too (depending on what subset of "all standard-compliant C++ compilers" you're considering).

If they aren't fast enough for you, you'll have to start dropping some restrictions.


This has nothing to do with an "algorithm".

When it comes to writing data to a file, you're at the mercy of the operating system - memory-mapped files are "fast" because you're just writing to memory, and the OS syncs it back out on its own time. If the OS doesn't support it, you're out of luck in that regard - unless you want to implement your own memory-mapping layer.

Incidentally, POSIX has mmap, so if you're limiting yourself to POSIX-compliant systems, you're fine.
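
For illustration, here is a minimal POSIX-only sketch of reading a file through mmap (assuming a POSIX system; the newline-counting task is just a placeholder for whatever processing you do):

#include <sys/mman.h>   // mmap, munmap (POSIX, not standard C++)
#include <sys/stat.h>   // fstat
#include <fcntl.h>      // open
#include <unistd.h>     // close
#include <cstddef>

// Map a whole file read-only and scan it, e.g. counting newlines.
std::size_t count_newlines(const char* path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) return 0;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) { close(fd); return 0; }

    void* p = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                              // the mapping outlives the descriptor
    if (p == MAP_FAILED) return 0;

    const char* data = static_cast<const char*>(p);
    std::size_t count = 0;
    for (off_t i = 0; i < st.st_size; ++i)
        if (data[i] == '\n') ++count;

    munmap(p, st.st_size);
    return count;
}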


To put another perspective on "mercy of the OS", most of the overhead in copying files lies with the operating system. A fragmented file will take more time to read than a defragmented file. There are no generic or standard C++ functions for detecting fragmented files.

The fastest method in C++:

#include <fstream>

std::ifstream in_file("source.bin", std::ios::binary);
std::ofstream out_file("dest.bin", std::ios::binary);

out_file << in_file.rdbuf();

You can find more details by searching the web for "copy file rdbuf". The fragment above leaves the copying up to the implementation, but is portable across all platforms. With the C++ I/O streams you can also set the size of the read buffer, or have them use your own buffer, as sketched below.
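
For instance, a sketch of the same copy with larger, user-supplied stream buffers. Note that pubsetbuf() must be called before the file is opened, and the standard leaves it implementation-defined how much effect it has:

#include <fstream>
#include <vector>

void copy_file(const char* src, const char* dst)
{
    std::vector<char> in_buf(1 << 20), out_buf(1 << 20);   // 1 MiB each

    std::ifstream in_file;
    std::ofstream out_file;
    // Install the buffers before open(); afterwards it is too late.
    in_file.rdbuf()->pubsetbuf(in_buf.data(), in_buf.size());
    out_file.rdbuf()->pubsetbuf(out_buf.data(), out_buf.size());

    in_file.open(src, std::ios::binary);
    out_file.open(dst, std::ios::binary);

    out_file << in_file.rdbuf();
}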

Faster file copying requires platform-specific functionality, such as DMA transfers. Using threads and multiple buffering could speed this up: one thread reads into buffers while another thread writes from them. (When this answer was written, standard C++ had no thread support, only the de facto POSIX standard; since C++11, std::thread is part of the language.) A sketch follows.
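
Here is a minimal sketch of that read-thread/write-thread idea using C++11 std::thread (the 1 MiB block size is an arbitrary choice, and a real version would bound the queue to cap memory use):

#include <condition_variable>
#include <fstream>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

void threaded_copy(const char* src, const char* dst)
{
    std::queue<std::vector<char>> full;      // blocks ready to be written
    std::mutex m;
    std::condition_variable cv;
    bool done = false;

    std::thread writer([&] {
        std::ofstream out(dst, std::ios::binary);
        for (;;) {
            std::unique_lock<std::mutex> lk(m);
            cv.wait(lk, [&] { return !full.empty() || done; });
            if (full.empty()) break;         // reader finished and queue drained
            std::vector<char> block = std::move(full.front());
            full.pop();
            lk.unlock();                     // write without holding the lock
            out.write(block.data(), block.size());
        }
    });

    std::ifstream in(src, std::ios::binary);
    std::vector<char> buf(1 << 20);          // 1 MiB read blocks
    while (in.read(buf.data(), buf.size()) || in.gcount() > 0) {
        std::vector<char> block(buf.begin(), buf.begin() + in.gcount());
        {
            std::lock_guard<std::mutex> lk(m);
            full.push(std::move(block));
        }
        cv.notify_one();
    }
    {
        std::lock_guard<std::mutex> lk(m);
        done = true;
    }
    cv.notify_one();
    writer.join();
}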


Some points:

  • This has nothing to do with algorithms.
  • Wanting to target ALL operating systems is not a very productive goal (and it's impossible - your code doesn't work on a particular platform until you've tested it). Instead, I'd focus on some set of operating systems that are feasible - say POSIX + Win32.
  • In that case, you can do memory mapping, for instance by implementing mmap() for Windows (on top of MapViewOfFile() etc. - the Git source code has an mmap() implementation for Windows if you need some inspiration)
  • If you can't use memory mapping, I'd recommend using the normal C file API instead of C++'s file streams if performance is a big deal. Even though C++'s streams have the potential for higher performance in some operations, in practice they are quite a bit slower (see the sketch after this list).
  • But to get good performance, it is often "good enough" just to make sure you're treating your data in a sane way: read data sequentially, don't re-read, etc. Perfect is the enemy of good ;)
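
As an illustration of the C-API route, a sketch of a chunked copy via <cstdio> (the 64 KiB buffer size is an arbitrary power-of-two choice):

#include <cstdio>
#include <cstddef>
#include <vector>

bool cstdio_copy(const char* src, const char* dst)
{
    std::FILE* in = std::fopen(src, "rb");
    if (!in) return false;
    std::FILE* out = std::fopen(dst, "wb");
    if (!out) { std::fclose(in); return false; }

    std::vector<char> buf(1 << 16);   // 64 KiB chunks
    std::size_t n;
    while ((n = std::fread(buf.data(), 1, buf.size(), in)) > 0)
        std::fwrite(buf.data(), 1, n, out);

    std::fclose(in);
    return std::fclose(out) == 0;     // flush failures surface here
}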


Reading sequentially in blocks that are multiples (or powers of two) of the file system block size may help. Then pick apart your data once the block is in memory. There is a white paper somewhere where they tested performance for various block sizes; I wish I could find it again.

You might also try having a dedicated thread for reading blocks from the file, and another that does the data manipulation in memory (with the proper synchronization, of course). This allows you to use the CPU to process data while you're blocking on your file read calls. Something along these lines, for example:
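
A sketch of that overlap using std::async to prefetch the next block while the current one is processed (C++11; process_block() here is a hypothetical stand-in for your data manipulation):

#include <cstddef>
#include <fstream>
#include <future>
#include <vector>

void process_block(const char* data, std::size_t n)
{
    (void)data; (void)n;   // hypothetical: your in-memory work goes here
}

void read_and_process(const char* path)
{
    std::ifstream in(path, std::ios::binary);
    const std::size_t block_size = 1 << 16;   // file-system-friendly block size

    auto read_next = [&in, block_size] {
        std::vector<char> buf(block_size);
        in.read(buf.data(), buf.size());
        buf.resize(static_cast<std::size_t>(in.gcount()));
        return buf;
    };

    std::vector<char> cur = read_next();
    while (!cur.empty()) {
        auto next = std::async(std::launch::async, read_next);   // read ahead
        process_block(cur.data(), cur.size());                   // work on current
        cur = next.get();
    }
}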

Anyway, if you give these ideas a try, please let us know whether you notice a difference. Actual results from your benchmark tests would be interesting.


Fast IO generally boils down to two things:

  1. Minimize data copying
  2. Minimize kernel/user context switching

Almost all IO techniques attempt to address one or the other. The fastest cross-platform IO code I know of is the Perl IO system. I'd suggest taking a look at its source; Perl hackers have spent decades getting their IO as fast as possible on as many platforms as possible.
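
To illustrate both points in plain standard C++: size the destination once and issue one large read, instead of many small reads that each pay a kernel round trip (a sketch, with error handling kept minimal):

#include <cstddef>
#include <fstream>
#include <string>

std::string load_whole_file(const char* path)
{
    std::ifstream in(path, std::ios::binary | std::ios::ate);   // open at end
    std::string data;
    if (!in) return data;
    data.resize(static_cast<std::size_t>(in.tellg()));          // size known up front
    if (data.empty()) return data;
    in.seekg(0);
    in.read(&data[0], static_cast<std::streamsize>(data.size()));   // one big read
    return data;
}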


The other posters are correct in that performance is always at odds with genericity (cross-platform).

However, in general, you'll get the best results by "buffering" your input - using fread() to read in relatively large chunks of data and processing those.

I know that's a pretty basic and general answer, but that's about as specific as you can get without either being more platform-specific or knowing more about the specific input you're processing.
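
A sketch of that pattern (process() is a hypothetical stand-in for whatever you do with each chunk):

#include <cstdio>
#include <cstddef>
#include <vector>

void process(const char* chunk, std::size_t n)
{
    (void)chunk; (void)n;   // hypothetical: per-chunk work goes here
}

void consume_file(const char* path)
{
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return;

    std::vector<char> chunk(1 << 20);   // 1 MiB chunks, tune to taste
    std::size_t n;
    while ((n = std::fread(chunk.data(), 1, chunk.size(), f)) > 0)
        process(chunk.data(), n);

    std::fclose(f);
}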
