Is there a fast way of adding or removing content in the middle of a very large file

Developer https://www.devze.com 2023-02-28 04:39 Source: web
Say I have a very large file (say > 1GB) and I want to add a single character in the middle of it. Is it possible to do this without reading and writing the whole file out? My current solution is this (in pseudocode):

x = 0
chunk = read 4KB chunk x of input file
if chunkToEdit = x, chunk = addCharacter(chunk)
append chunk to the output file 
x = x + 1
repeat last 4 steps until input file is fully read
delete input file
move output file to input file
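
For concreteness, the pseudocode above could look roughly like this in Python. This is only a sketch of the copy-everything approach: the function name `insert_by_rewrite`, its parameters, and the choice to write to a temporary file in the same directory are assumptions made for illustration, not part of the original question.

```python
import os
import tempfile

CHUNK_SIZE = 4096  # 4KB chunks, matching the pseudocode above


def insert_by_rewrite(path, offset, data):
    """Insert `data` at byte `offset` by copying the whole file (hypothetical helper)."""
    directory = os.path.dirname(os.path.abspath(path))
    # Put the output file next to the input so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as dst, open(path, "rb") as src:
            remaining = offset
            # Copy everything before the insertion point.
            while remaining > 0:
                chunk = src.read(min(CHUNK_SIZE, remaining))
                if not chunk:
                    raise ValueError("offset is past the end of the file")
                dst.write(chunk)
                remaining -= len(chunk)
            # Write the new bytes, then copy the rest of the input unchanged.
            dst.write(data)
            while True:
                chunk = src.read(CHUNK_SIZE)
                if not chunk:
                    break
                dst.write(chunk)
        # "delete input file, move output file to input file" in one step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial output file if anything went wrong.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Calling `insert_by_rewrite("big.bin", 500_000_000, b"x")` would read and write the entire file once, whatever the offset.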

While that works, it results in 1GB of reading and 1GB of writing to make a single character change. It also requires a spare 1GB of disk space. What I would rather do is modify the part of the file that needs to be changed in place, so I only have to read and write one part of the file (i.e. 4KB of reading and 4KB of writing). Is this possible (or is there a better solution than mine)?

I thought a solution for this could be possible by the OS fragmenting the file and making a new fragment for the changed section, but I don't know if this capability has been written and exposed to developers.


No. Files don't work like that. If an edit changes the size of the file, then you need to rewrite everything from the modification point to the end.

Unless you're using a file format that can handle insertions/deletions cleanly, but it sounds like you aren't.


Adding a single character in the middle necessarily requires shifting everything after this one character by one character. This necessarily requires that you read and write everything from the point of insertion to the end of the file. A way that uses as little memory as possible to do so would be:

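  • i = 0
  • read the i-th chunk of n bytes, counting back from the end of the file
  • write it back to the file, shifted forward by 1 character
  • i++
  • repeat the last 3 steps until reaching the point of insertion
  • write the single character at the point of insertion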

In other words: shift everything in chunks of n bytes by one character, starting from the end and going backwards through the file to the point of insertion, then insert the character. The closer the insertion point is to the end of the file, the less data has to be moved and the faster this will be. If you often want to insert near the beginning of the file, this may not be the best solution.
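
The backward shift described above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name `insert_in_place`, its parameters, and the default chunk size of 4096 bytes are assumptions, and it works on raw bytes, so "one character" means one byte unless you pass more.

```python
import os


def insert_in_place(path, offset, data, chunk_size=4096):
    """Insert `data` at byte `offset`, shifting the tail of the file in place (hypothetical helper)."""
    with open(path, "r+b") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        if not 0 <= offset <= size:
            raise ValueError("offset is outside the file")
        shift = len(data)
        if shift == 0:
            return
        # `pos` marks the end of the region that still has to be moved.
        pos = size
        while pos > offset:
            # Take the last not-yet-moved chunk, never reaching before the insertion point.
            start = max(offset, pos - chunk_size)
            f.seek(start)
            chunk = f.read(pos - start)
            # Write it back `shift` bytes later. Everything after `pos` has already
            # been moved, so no unread data is overwritten.
            f.seek(start + shift)
            f.write(chunk)
            pos = start
        # The gap at the insertion point is now free to overwrite.
        f.seek(offset)
        f.write(data)
```

Only one chunk is held in memory at a time and no spare copy of the file is needed, but everything from `offset` to the end is still read and written once. Note that if the process is interrupted partway through, the file is left partially shifted, which the copy-to-a-new-file approach avoids.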
