md5 output in python different to command line (even in binary mode)

Developer https://www.devze.com 2023-02-13 16:56 Source: Internet
I'm writing a script that needs to check the md5 sum of a file on OSX and Windows, and as a sanity check I compared the results with that of the command line md5 tool, but I get different results. Here's the code:

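import hashlib

def MD5File(f, block_size=2**20):
  # Hash an already-open binary file object in block_size chunks
  md5 = hashlib.md5()
  while True:
    data = f.read(block_size)
    if not data:
      break
    md5.update(data)
  return md5.hexdigest()

with open(path, 'rb') as f:
  # Pass the open file object, not the path string
  print(MD5File(f))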

I did the obvious thing of opening the file in binary mode, but it still gives different results. I've tried different ways of buffering the data, including just reading it all in one go, and the python script consistently returns the same thing, but that's different to the md5 command.

So is there something else really obvious I'm doing wrong, or is it a case that running md5 filename doesn't actually do what you expect? As I'm reading the binary of the file directly there shouldn't be any newline issues. If I run cat filename | md5 then I get a different result again.
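The buffering point above can be checked in isolation. The following is only a minimal sketch, not the asker's script: `md5_of_file`, `demo_block_sizes`, and the sample payload are hypothetical names and data introduced for illustration. It hashes the same temporary file with several block sizes and compares each result with hashing the whole payload in one go.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, block_size=2**20):
    """Hash a file's raw bytes in fixed-size chunks (binary mode)."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns b"" at end of file
        for chunk in iter(lambda: f.read(block_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def demo_block_sizes():
    # Hypothetical sample data that mixes both newline conventions
    payload = b"line one\r\nline two\n" * 1000
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    try:
        one_shot = hashlib.md5(payload).hexdigest()
        chunked = {size: md5_of_file(path, size) for size in (1, 7, 4096, 2**20)}
    finally:
        os.remove(path)
    return one_shot, chunked


if __name__ == "__main__":
    one_shot, chunked = demo_block_sizes()
    for size, value in chunked.items():
        print(size, value == one_shot)
```

If every comparison comes back equal, the chunking strategy is not the source of a mismatch, which points toward the tool being compared against rather than the Python side.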


The following works correctly for me:

In [1]: with open("play.py", "rb") as f:
   ...:     data = f.read()
   ...:     from hashlib import md5
   ...:     print(md5(data).hexdigest())
   ...: 
07030b37de71f3ad9ef2398b4f0c3a3e

In [2]: 
bensonk@angua ~ $ md5 play.py
MD5 (play.py) = 07030b37de71f3ad9ef2398b4f0c3a3e

Please try my code and see if it works for you. If it doesn't, will you upload a gist of your python script and a sample file for me to try?


Oops, this was a case of user error. I had overridden the md5 command in the shell so that it just returned the hash of a string rather than of a file.
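To illustrate this kind of mix-up, here is a rough sketch under stated assumptions. It assumes the overriding command hashed its argument text as UTF-8 with no trailing newline, which the post does not specify. The helper names `md5_of_text`, `md5_of_bytes`, and `real_md5_tool_output` are made up for this example. The last helper looks up the executable with `shutil.which` and runs it without a shell, so aliases and shell functions are not involved; a replacement script placed on `PATH` would still be picked up.

```python
import hashlib
import shutil
import subprocess


def md5_of_text(text):
    # What a command that hashes its argument string might compute
    # (assumption: UTF-8 encoding, no trailing newline)
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def md5_of_bytes(data):
    # What hashing the file's actual contents computes
    return hashlib.md5(data).hexdigest()


def real_md5_tool_output(path, tool="md5"):
    """Run the executable named `tool` found on PATH, without a shell.

    Returns None if no such executable exists on this machine.
    """
    exe = shutil.which(tool)
    if exe is None:
        return None
    result = subprocess.run(
        [exe, path], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Hypothetical file name and contents, for illustration only
    print(md5_of_text("play.py"))
    print(md5_of_bytes(b"print('hi')\n"))
```

The two printed digests differ because one hashes the name and the other hashes the contents. In bash or zsh, `type md5` reports whether the name resolves to an alias, a function, or a file, which is a quick way to spot an override like this one.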
