Situation:
I have a C# program which does the following:
- Generate many files (replacing the ones generated the last time the program ran).
- Read those files and perform a time-consuming computation.
Problem:
I only want to perform the time-consuming computation on files which have actually changed since the last time I ran the program.
Solution 1:
- Rename the old file.
- Write the new file.
- Read and compare both files.
This involves writing one file and reading two, which seems like more disk access than necessary.
Solution 2:
- Write to a string instead of a file.
- Read the old file and compare to the string.
- If they are different, overwrite the old file.
This would involve reading one file and possibly writing one, which seems like a big improvement over my first idea.
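For example, in rough C#, Solution 2 would look something like the sketch below (GenerateContent, PerformExpensiveComputation and the file path are stand-ins for my real code):

    using System;
    using System.IO;

    class Program
    {
        static void Main()
        {
            string newContent = GenerateContent();   // stand-in for the real generator
            string path = "output.dat";              // stand-in for the real file path

            // Read the old file (if it exists) and compare it to the new contents.
            string oldContent = File.Exists(path) ? File.ReadAllText(path) : null;

            if (!string.Equals(oldContent, newContent, StringComparison.Ordinal))
            {
                // Contents differ (or the file is new): overwrite and redo the computation.
                File.WriteAllText(path, newContent);
                PerformExpensiveComputation(path);
            }
        }

        static string GenerateContent() => "...";
        static void PerformExpensiveComputation(string path) { }
    }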
Question:
Can you describe a better way to solve my problem? (and explain why it is better?)
One solution could be to generate some sort of checksum from the contents of the file. Then, when you generate the new contents, you only need to compare the checksum values to see whether the file has changed.
Store the checksum as the first record in the file (or at least fairly near the start of the file) to minimise the amount of data you have to read.
If you could somehow store the checksum as an attribute of the file (rather than in the file itself) you wouldn't even need to open the old file. Another alternative would be to store the checksum and the file it refers to in a separate central file or database, but there is the danger that this could get out of step.
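As a rough sketch, assuming the checksum is stored on the first line of each file (SHA256 is just one choice of hash, and the helper names are illustrative):

    using System;
    using System.IO;
    using System.Security.Cryptography;
    using System.Text;

    class ChecksumExample
    {
        static string Hash(string content)
        {
            using (var sha = SHA256.Create())
            {
                byte[] bytes = sha.ComputeHash(Encoding.UTF8.GetBytes(content));
                return BitConverter.ToString(bytes).Replace("-", "");
            }
        }

        static bool HasChanged(string path, string newContent)
        {
            if (!File.Exists(path))
                return true;

            // The checksum sits on the first line, so only that line needs to be read.
            string oldHash;
            using (var reader = new StreamReader(path))
            {
                oldHash = reader.ReadLine();
            }

            return !string.Equals(oldHash, Hash(newContent), StringComparison.OrdinalIgnoreCase);
        }

        static void WriteWithChecksum(string path, string newContent)
        {
            // Prepend the checksum so the next run can read it cheaply.
            File.WriteAllText(path, Hash(newContent) + Environment.NewLine + newContent);
        }
    }

You would only call the expensive computation when HasChanged returns true, and always write new files through WriteWithChecksum so the stored hash stays in step with the contents.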
At the end of each run, save the execution time to a file.
During the next run, after you've created all the new files, use DirectoryInfo to iterate through the files in the directory and check each file's LastWriteTime (http://msdn.microsoft.com/en-us/library/system.io.file.getlastwritetime.aspx) against the stored execution time. If the LastWriteTime is after the saved time, that file was modified by the current execution, so you have to process it.
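Something along these lines, assuming the previous run's time is saved to a small marker file (the paths here are only illustrative):

    using System;
    using System.Globalization;
    using System.IO;

    class TimestampExample
    {
        const string MarkerFile = "lastrun.txt"; // illustrative location of the saved time

        static void Main()
        {
            // Load the execution time saved at the end of the previous run (round-trip format).
            DateTime lastRun = File.Exists(MarkerFile)
                ? DateTime.Parse(File.ReadAllText(MarkerFile), null, DateTimeStyles.RoundtripKind)
                : DateTime.MinValue;

            var dir = new DirectoryInfo("output"); // illustrative output directory
            foreach (FileInfo file in dir.GetFiles())
            {
                // Only files written after the previous run need the expensive computation.
                if (file.LastWriteTimeUtc > lastRun)
                {
                    Process(file.FullName);
                }
            }

            // Record this run's time for the next execution.
            File.WriteAllText(MarkerFile, DateTime.UtcNow.ToString("o"));
        }

        static void Process(string path) { /* time-consuming computation */ }
    }

Comparing UTC times sidesteps daylight-saving edge cases; the File.GetLastWriteTime overload from the linked page works just as well provided the saved time is kept in the same (local) kind.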