binary file formats: need for error correction?

devze.com (开发者) https://www.devze.com 2023-01-01 20:10 Source: web
I need to serialize some data in a binary format for efficiency (datalog where 10-100MB files are typical), and I'm working out the formatting details. I'm wondering if realistically I need to worry about file corruption / error correction / etc.

Under what circumstances can file corruption happen? Should I be building robustness to corruption into my binary format? Or should I wrap my nonrobust-to-corruption stream of bytes with some kind of error correcting code? (Any suggestions? I'm using Java.) Or should I just not worry about this?

Edit: my preliminary binary format, as I have it right now, contains a bunch of variable-length segments, so I am slightly worried that if the data ever gets corrupted, then on reading it back I could get out of sync, be unable to recover, and lose the rest of the file.


You should at least add a checksum. The bit error rate (BER) of modern hard drives is low, but that is not true of other media. Power loss during a write usually corrupts the end of the file. If the data is important, you will need error correction codes, triple writes, unbuffered writes, etc. to commit transactions.

EXE files do not have error correction, even though a single-bit change can have drastic consequences.

If a file is to be transferred over TCP, you may assume zero errors.
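As a rough illustration of the "at least add a checksum" advice, here is a minimal sketch that appends a CRC-32 trailer to a block of bytes and verifies it on read. The class name `CrcTrailer`, the method names, and the 4-byte big-endian trailer layout are assumptions made for this example, not something the answer specifies. CRC-32 only detects corruption; it cannot repair it.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical helper: payload followed by a 4-byte big-endian CRC-32 trailer.
final class CrcTrailer {

    /** Returns the payload followed by the CRC-32 of the payload. */
    static byte[] seal(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return ByteBuffer.allocate(payload.length + 4)
                .put(payload)
                .putInt((int) crc.getValue()) // low 32 bits of the long
                .array();
    }

    /** Returns true if the trailing CRC-32 matches the bytes before it. */
    static boolean verify(byte[] sealed) {
        if (sealed.length < 4) {
            return false; // too short to even hold a trailer
        }
        int n = sealed.length - 4;
        CRC32 crc = new CRC32();
        crc.update(sealed, 0, n);
        int stored = ByteBuffer.wrap(sealed, n, 4).getInt();
        return stored == (int) crc.getValue();
    }
}
```

A writer would call `CrcTrailer.seal(bytes)` before writing, and a reader would reject any block for which `CrcTrailer.verify(bytes)` returns false.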


I have seen it happen once or twice that a file transferred over the Internet became corrupted. You can do error detection using a checksum or hash, such as SHA-256.
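For the SHA-256 suggestion, a minimal sketch using the JDK's `java.security.MessageDigest` could look like the following. The class `Sha256Check` and its method names are made up for this example. It hashes an in-memory byte array for brevity; for 10-100MB files you would more likely feed the digest in pieces with `MessageDigest.update` while reading.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper around the JDK's SHA-256 implementation.
final class Sha256Check {

    /** Computes the SHA-256 digest (32 bytes) of the given data. */
    static byte[] digest(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }

    /** Returns true if the data hashes to the expected digest. */
    static boolean matches(byte[] data, byte[] expectedDigest) {
        return MessageDigest.isEqual(digest(data), expectedDigest);
    }

    /** Lower-case hex form, convenient for storing next to the file. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }
}
```

The digest can be stored in a sidecar file or in a trailer and compared after a transfer. Like any hash, this tells you that something changed but not where, and it cannot repair the damage.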


You might be interested in the notes on error detecting codes in HDF5. Where to put a checksum and what kind to use depends on how you access and update the data, as well as on what size of chunk is useful to detect an error in.
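One way to act on the "useful chunk" idea, and on the question's worry about getting out of sync with variable-length segments, is to checksum each segment separately and start it with a sync marker. The sketch below is an illustration under assumptions, not HDF5's scheme: the class `ChunkedLog`, the marker value `0xA5C3F00D`, and the layout (marker, length, payload, CRC-32 of the payload) are all invented for this example.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical chunk layout: [MAGIC:4][length:4][payload][CRC-32 of payload:4]
final class ChunkedLog {
    static final int MAGIC = 0xA5C3F00D; // arbitrary sync marker (assumption)
    static final int OVERHEAD = 12;      // marker + length + CRC-32

    /** Wraps one variable-length payload in a self-checking chunk. */
    static byte[] frame(byte[] payload) {
        return ByteBuffer.allocate(payload.length + OVERHEAD)
                .putInt(MAGIC)
                .putInt(payload.length)
                .put(payload)
                .putInt(crcOf(payload, 0, payload.length))
                .array();
    }

    /** Returns every chunk that validates, skipping over damaged regions. */
    static List<byte[]> readAll(byte[] file) {
        List<byte[]> out = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.wrap(file);
        int pos = 0;
        while (pos + OVERHEAD <= file.length) {
            int len = buf.getInt(pos + 4);
            boolean ok = buf.getInt(pos) == MAGIC
                    && len >= 0
                    && len <= file.length - pos - OVERHEAD
                    && buf.getInt(pos + 8 + len) == crcOf(file, pos + 8, len);
            if (ok) {
                byte[] payload = new byte[len];
                System.arraycopy(file, pos + 8, payload, 0, len);
                out.add(payload);
                pos += OVERHEAD + len;
            } else {
                pos++; // damaged or misaligned: slide forward to the next marker
            }
        }
        return out;
    }

    private static int crcOf(byte[] data, int off, int len) {
        CRC32 crc = new CRC32();
        crc.update(data, off, len);
        return (int) crc.getValue();
    }
}
```

With this layout a corrupted segment costs only that segment: the reader slides forward one byte at a time until it finds a position where the marker, length, and CRC all agree again. The trade-off is 12 bytes of overhead per chunk and a slow scan through damaged regions.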


I went with a Reed-Solomon encoding system. There's a fairly easy-to-use Java implementation of it in the Google zxing library.
