
Ways of Efficiently Seeking in Custom File Formats


I've been wondering how seeking is implemented across different file formats, and what would be a good way to construct a file that contains a lot of data so it can be seeked efficiently. Some approaches I've considered are using equal-sized packets, which allow quick skipping since you know the layout of every data chunk, and building an index whenever a file is loaded.
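For illustration, here is a minimal Python sketch of the equal-sized-packet idea, assuming a hypothetical format with fixed 64-byte records (the file name and record size are made up): seeking to record n is just a multiplication.

```python
RECORD_SIZE = 64  # hypothetical fixed record size for this example

def read_record(f, index):
    """Jump straight to record `index`: every record occupies exactly
    RECORD_SIZE bytes, so its offset can be computed without scanning."""
    f.seek(index * RECORD_SIZE)
    data = f.read(RECORD_SIZE)
    if len(data) < RECORD_SIZE:
        raise IndexError(f"record {index} is past the end of the file")
    return data

# usage (assumed file name):
# with open("records.bin", "rb") as f:
#     rec = read_record(f, 1000)   # O(1) seek to the 1001st record
```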


This entirely depends on the kind of data, and what you're trying to seek to.

If you're trying to seek by record index, then sure: fixed-size fields make life easier, but waste space. If you're trying to seek by anything else, keeping an index of key:location works well. If you want to be able to build the file up sequentially, you can put the index at the end but reserve the first four bytes of the file (after the magic number or whatever) to hold the location of the index itself (assuming you can rewrite those first four bytes).
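As a rough illustration of that layout, here is a Python sketch (the magic number, the 4-byte little-endian pointer, the JSON-encoded index, and all names are my own assumptions, not part of any standard format): records are written sequentially, the key:offset index is appended at the end, and the reserved bytes near the front are then rewritten to point at it.

```python
import json
import struct

MAGIC = b"MYF1"          # hypothetical magic number
PTR_FMT = "<I"           # 4-byte little-endian offset of the index

def write_file(path, records):
    """records: iterable of (key, payload_bytes) pairs; keys are strings."""
    index = {}
    with open(path, "wb") as f:
        f.write(MAGIC)
        f.write(struct.pack(PTR_FMT, 0))           # placeholder for index offset
        for key, payload in records:
            index[key] = f.tell()                  # remember where this record starts
            f.write(struct.pack("<I", len(payload)))
            f.write(payload)
        index_offset = f.tell()
        f.write(json.dumps(index).encode())        # index goes at the very end
        f.seek(len(MAGIC))
        f.write(struct.pack(PTR_FMT, index_offset))  # patch the reserved pointer

def lookup_record(path, key):
    with open(path, "rb") as f:
        f.seek(len(MAGIC))
        (index_offset,) = struct.unpack(PTR_FMT, f.read(4))
        f.seek(index_offset)
        index = json.loads(f.read())               # index is the last thing in the file
        f.seek(index[key])
        (size,) = struct.unpack("<I", f.read(4))
        return f.read(size)

# usage (hypothetical):
# write_file("data.bin", [("alpha", b"first payload"), ("beta", b"second payload")])
# lookup_record("data.bin", "beta")   # -> b"second payload"
```

A real format would probably use a more compact index encoding than JSON and add a checksum, but the rewrite-the-pointer trick is the essential part.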

If you want to be able to perform a sort of binary chop on variable length blocks, then having a reasonably efficient way of detecting the start of a block helps - as does having next/previous pointers, as mentioned by Alexander.
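To make the binary-chop idea concrete, here is a hedged Python sketch under simplifying assumptions of my own: every block starts with a sync marker, carries a sorted 8-byte key and a payload length, blocks are smaller than the 64 KiB probe window, and the marker never appears inside payload data.

```python
import struct

BLOCK_MARKER = b"\xDE\xAD\xBE\xEF"   # assumed sync marker at the start of every block
HEADER_FMT = "<qI"                   # assumed header after the marker: 8-byte sorted key, 4-byte payload size
HEADER_SIZE = struct.calcsize(HEADER_FMT)
WINDOW = 1 << 16                     # probe window; assumes no block is larger than this

def find_block(f, target_key, file_size):
    """Binary chop over variable-length blocks: each probe lands at an arbitrary
    byte, so scan forward to the next marker before reading that block's key."""
    lo, hi = 0, file_size
    while lo < hi:
        mid = (lo + hi) // 2
        f.seek(mid)
        pos = f.read(WINDOW).find(BLOCK_MARKER)
        if pos < 0 or mid + pos >= hi:
            hi = mid                              # no block starts in [mid, hi)
            continue
        block_start = mid + pos
        f.seek(block_start + len(BLOCK_MARKER))
        key, size = struct.unpack(HEADER_FMT, f.read(HEADER_SIZE))
        if key == target_key:
            return f.read(size)                   # payload follows the header
        if key < target_key:
            lo = block_start + len(BLOCK_MARKER) + HEADER_SIZE + size
        else:
            hi = block_start
    return None                                   # key not present
```

Next/previous pointers in each block header would let you step to a neighbouring block without re-scanning for the marker.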

Basically it's all about metadata, really - but the right kind of metadata will depend on the kind of data, and the use cases for seeking in the first place.


Well, giving each chunk a size field (equivalently, an offset to the next chunk) is common and allows fast skipping of unknown data. Another approach is an index chunk at the beginning of the file, storing a table of all chunks in the file along with their offsets; programs would simply read the index chunk into memory.
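A minimal Python sketch of that chunk-walking pattern (the tag/length header layout and chunk names are assumptions for illustration, in the spirit of RIFF- or PNG-style containers): the size field lets the reader skip over any chunk it doesn't recognise.

```python
import struct

CHUNK_HEADER = "<4sI"                # assumed layout: 4-byte tag, 4-byte payload length
HEADER_SIZE = struct.calcsize(CHUNK_HEADER)

def find_chunk(f, wanted_tag):
    """Walk the file chunk by chunk, skipping payloads we don't care about."""
    f.seek(0)
    while True:
        header = f.read(HEADER_SIZE)
        if len(header) < HEADER_SIZE:
            return None                          # end of file, chunk not found
        tag, size = struct.unpack(CHUNK_HEADER, header)
        if tag == wanted_tag:
            return f.read(size)
        f.seek(size, 1)                          # skip the payload of an unwanted chunk

# usage (hypothetical tag):
# with open("data.bin", "rb") as f:
#     payload = find_chunk(f, b"INDX")
```

An index chunk would simply be one of these chunks (for example a hypothetical b"INDX" tag) whose payload is the table of tags and offsets.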
