- Given a root node that should eventually produce a tree with about 10^10 (approx. 2^34) nodes, is it appropriate to use a memory-mapped file that will eventually contain the whole tree?
- What operating-system-related problems may occur (file I/O, huge file support)?
- Do C, gcc and glibc have any implicit limits (e.g. pointer size)?
- Does Linux have any issues or limits with large files?
As yi_H mentioned in his comment, you'll want a 64-bit operating system and a file system that supports large files. Assuming each node contains on the order of 2^5 = 32 bytes of data, 2^40 nodes will result in 2^45 bytes = 32 terabytes. Now, assuming you're not running on a modern military fighter plane, most of that data will have to live on the hard disk rather than in RAM.
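To make the 64-bit requirement concrete, here is a minimal sketch in C of mapping a node array from a file on Linux. The `node_t` layout and the helper names `can_address` and `map_nodes` are hypothetical, chosen only to match the 32-byte estimate above; nothing here comes from the original question. Children are stored as array indices rather than pointers, on the assumption that the mapping may land at a different address on each run.

```c
#define _FILE_OFFSET_BITS 64 /* request a 64-bit off_t from glibc, even on 32-bit targets */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical 32-byte node. Links are indices into the mapped array,
 * not pointers, so they stay valid wherever the file gets mapped. */
typedef struct {
    uint64_t first_child;
    uint64_t next_sibling;
    uint64_t payload[2];
} node_t;

static_assert(sizeof(node_t) == 32, "node_t is expected to be 32 bytes");

/* Returns 1 if node_count nodes fit into the pointer, size_t and off_t
 * ranges of this build, 0 otherwise (including node_count == 0). */
int can_address(uint64_t node_count)
{
    if (sizeof(void *) < 8 || sizeof(size_t) < 8 || sizeof(off_t) < 8)
        return 0; /* 32-bit pointers or file offsets cannot cover the tree */
    if (node_count == 0)
        return 0; /* mmap rejects zero-length mappings */
    /* off_t is signed, so the byte count must stay below 2^63 */
    return node_count <= (uint64_t)INT64_MAX / sizeof(node_t);
}

/* Creates or extends the file at path to hold node_count nodes and maps it
 * read/write. Returns NULL on any failure. */
node_t *map_nodes(const char *path, uint64_t node_count)
{
    if (!can_address(node_count))
        return NULL;

    size_t bytes = (size_t)node_count * sizeof(node_t);
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return NULL;

    /* Grow the file to its final size before mapping it. */
    if (ftruncate(fd, (off_t)bytes) != 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */
    return p == MAP_FAILED ? NULL : (node_t *)p;
}
```

A caller would use something like `node_t *nodes = map_nodes("tree.bin", node_count);`, index into `nodes` as an ordinary array, and release it with `munmap`. On file systems that support sparse files, `ftruncate` typically does not allocate disk blocks until pages are actually written, so the file only consumes space as the tree grows.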
Once the data is on your disk and the file system is properly configured, I don't think you will run into any system limitations. However, read/write speed will definitely be an issue. Given an average I/O speed of 100 MB/s on your hard drive, it would take roughly 4 days just to traverse the entire tree once.
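The size and time estimates above are easy to recompute for other assumptions. The following sketch does the arithmetic in C; the function names `tree_bytes` and `traversal_days` are hypothetical, and the model assumes one purely sequential pass at a constant throughput, which is optimistic for a tree that is accessed randomly.

```c
#include <stdint.h>

/* Total size in bytes of node_count nodes of node_bytes each.
 * The caller must make sure the product fits in 64 bits. */
uint64_t tree_bytes(uint64_t node_count, uint64_t node_bytes)
{
    return node_count * node_bytes;
}

/* Days needed to read total_bytes once, sequentially,
 * at a constant throughput of bytes_per_sec. */
double traversal_days(uint64_t total_bytes, double bytes_per_sec)
{
    const double seconds_per_day = 86400.0;
    return (double)total_bytes / bytes_per_sec / seconds_per_day;
}
```

With 2^40 nodes of 32 bytes, `tree_bytes` gives 2^45 bytes, and `traversal_days` at 100e6 bytes per second comes out at a little over 4 days. With the 2^34 nodes mentioned in the question, the same functions give 2^39 bytes (512 GiB) and roughly an hour and a half for one sequential pass.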
It would be better to divide the data up onto multiple computers and parallelize your operations.