Hashing large files on background thread

Developer https://www.devze.com 2023-04-03 04:27 Source: Internet

I have a Windows Forms application that hashes files asynchronously using a BackgroundWorker. I've implemented cancellation by checking for CancellationPending between each file being hashed. The hashing itself is essentially this:

var sha1 = new SHA1CryptoServiceProvider();
byte[] hash = sha1.ComputeHash(
    new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite));

The only problem is that for particularly large files - hundreds of megabytes or gigabytes in size - the hashing operation blocks cancellation until that file has been hashed completely.

What would be the best way to modify this so that cancellation could be checked while the file is being hashed - for example every N milliseconds or every N bytes?


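You can create your own cancellable stream and provide it as the input to the hashing function. Returning 0 from Read signals end of stream, so ComputeHash returns early with a hash of only the data read so far; check CancellationPending afterwards and discard that result. Something along these lines: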

class CancellableFileStream : FileStream {

  readonly BackgroundWorker backgroundWorker;

  public CancellableFileStream(BackgroundWorker backgroundWorker, String path, FileMode mode, FileAccess access, FileShare share)
    : base(path, mode, access, share) {
    this.backgroundWorker = backgroundWorker;
  }

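  public override Int32 Read(Byte[] array, Int32 offset, Int32 count) {
    // Pretend the stream has ended once cancellation is requested,
    // so that the hashing call stops reading and returns promptly.
    if (this.backgroundWorker.CancellationPending)
      return 0;
    return base.Read(array, offset, count);
  }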

}


Use TransformBlock and TransformFinalBlock instead of ComputeHash, pumping data from your stream to the hash algorithm manually - then insert a cancellation check in your loop.


SHA1 is chunk-friendly. Read the file in chunks, call TransformBlock() for each chunk, then call TransformFinalBlock() when the file ends.
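
The chunked approach described in the last two answers could look roughly like the sketch below. The helper name CancellableHasher.ComputeSha1, the isCancelled delegate and the 1 MB default buffer size are all assumptions made for illustration and do not come from the original answers. It uses SHA1.Create() rather than SHA1CryptoServiceProvider; either should work here.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

static class CancellableHasher
{
    // Hypothetical helper: hashes the stream chunk by chunk and returns
    // the SHA-1 hash, or null if isCancelled() reported cancellation.
    public static byte[] ComputeSha1(Stream input, Func<bool> isCancelled, int bufferSize = 1 << 20)
    {
        using (var sha1 = SHA1.Create())
        {
            var buffer = new byte[bufferSize];
            int read;
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Cancellation is checked once per chunk, i.e. every bufferSize bytes at most.
                if (isCancelled())
                    return null;

                // Feed this chunk into the running hash; no output buffer is needed.
                sha1.TransformBlock(buffer, 0, read, null, 0);
            }

            // Finish the hash with an empty final block, then read the result.
            sha1.TransformFinalBlock(buffer, 0, 0);
            return sha1.Hash;
        }
    }
}
```

Inside the DoWork handler this might be called as CancellableHasher.ComputeSha1(stream, () => worker.CancellationPending), where stream is the FileStream from the question and worker is the BackgroundWorker; a null result would then mean the operation was cancelled and e.Cancel should be set. A smaller buffer makes cancellation more responsive at the cost of more Read calls.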
