I am using os.walk to compare two folders and see if they contain the exact same files. However, this only checks the file names. I want to ensure the file sizes are the same, and if they're different, report back. Can you get the file size from os.walk?
The same way you get the file size without using os.walk: with os.stat. You just need to remember to join the file name with the root:
    import os

    for root, dirs, files in os.walk(some_directory):
        for fn in files:
            path = os.path.join(root, fn)   # fn alone is relative to root
            size = os.stat(path).st_size    # in bytes
            # ...
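For the comparison described in the question, one option is to build a map from relative path to size for each folder and then diff the two maps. This is only a minimal sketch; the helper names `collect_sizes` and `compare_sizes` are made up for illustration and are not part of the standard library.

```python
import os

def collect_sizes(top):
    """Map each file's path (relative to top) to its size in bytes."""
    sizes = {}
    for root, dirs, files in os.walk(top):
        for fn in files:
            path = os.path.join(root, fn)
            sizes[os.path.relpath(path, top)] = os.stat(path).st_size
    return sizes

def compare_sizes(dir_a, dir_b):
    """Return (only_in_a, only_in_b, size_mismatches) for two folders."""
    a, b = collect_sizes(dir_a), collect_sizes(dir_b)
    only_a = sorted(a.keys() - b.keys())
    only_b = sorted(b.keys() - a.keys())
    # Files present on both sides whose sizes differ: path -> (size_a, size_b)
    mismatches = {p: (a[p], b[p]) for p in a.keys() & b.keys() if a[p] != b[p]}
    return only_a, only_b, mismatches
```

Keying on the relative path means a file in a subfolder is only matched against the file at the same location in the other tree.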
os.path.getsize(path) gives you the size of the file in bytes, but two files having the same size does not mean they are identical. To be sure, you could read the content of each file and compare a hash of it (MD5, SHA-256, or similar).
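A hash comparison along those lines might look like the sketch below. The function names `file_digest` and `same_content` are hypothetical, and SHA-256 is just one reasonable choice of algorithm; the size check comes first because it is cheap and rules out most mismatches without reading any data.

```python
import hashlib
import os

def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large files are not loaded into memory at once."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def same_content(path_a, path_b):
    """True if both files have the same size and the same digest."""
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False  # different sizes can never be identical
    return file_digest(path_a) == file_digest(path_b)
```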
As others have said, you can get the size with os.stat. However, for comparing directories you can use filecmp.dircmp from the standard library.
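As a rough sketch of how dircmp could be used for this, the snippet below walks the comparison recursively and collects relative paths. The function name `report_differences` is invented for illustration. Note that dircmp only looks at one directory level per object, so subfolders are reached through its `subdirs` attribute.

```python
import filecmp
import os

def report_differences(left, right):
    """Return (only_left, only_right, differing) lists of relative paths."""
    only_left, only_right, differing = [], [], []

    def visit(cmp, rel):
        # Names that exist on one side only at this level
        only_left.extend(os.path.join(rel, n) for n in cmp.left_only)
        only_right.extend(os.path.join(rel, n) for n in cmp.right_only)
        # Files present on both sides that compare as different
        differing.extend(os.path.join(rel, n) for n in cmp.diff_files)
        # Recurse into directories that exist on both sides
        for name, sub in cmp.subdirs.items():
            visit(sub, os.path.join(rel, name))

    visit(filecmp.dircmp(left, right), "")
    return sorted(only_left), sorted(only_right), sorted(differing)
```

Files whose sizes differ always end up in `differing`, which covers the case asked about in the question.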
FYI, there is a more efficient solution in Python 3.5 and later, os.scandir (using it in a with statement, as below, requires Python 3.6):
    import os

    with os.scandir(rootdir) as it:
        for entry in it:
            if entry.is_file():
                filepath = entry.path  # rootdir joined with the name; absolute only if rootdir is
                filesize = entry.stat().st_size  # in bytes
See os.DirEntry for more details about the variable entry.
Note that the above is not recursive (subfolders will not be explored). To get os.walk-like behaviour, you might want to use the following:
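    import os
    from collections import namedtuple
    from os.path import normpath, realpath
    from os.path import join as pathjoin

    _wrap_entry = namedtuple('DirEntryWrapper', 'name path islink size')

    def scantree(rootdir, follow_links=False, reldir='', _visited=None):
        # Share one set of visited directories across the whole recursion,
        # so that symlink cycles are detected when follow_links=True.
        visited = {realpath(rootdir)} if _visited is None else _visited
        rootdir = normpath(rootdir)
        with os.scandir(rootdir) as it:
            for entry in it:
                if entry.is_dir():
                    # Symlinked directories are skipped unless follow_links is set
                    if not entry.is_symlink() or follow_links:
                        absdir = realpath(entry.path)
                        if absdir in visited:
                            continue
                        visited.add(absdir)
                        yield from scantree(entry.path, follow_links,
                                            pathjoin(reldir, entry.name), visited)
                else:
                    yield _wrap_entry(
                        pathjoin(reldir, entry.name),  # path relative to the top-level root
                        entry.path,                    # path including rootdir
                        entry.is_symlink(),
                        entry.stat().st_size)          # size in bytes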
and use it as
    for entry in scantree(rootdir, follow_links=False):
        filepath = entry.path
        filesize = entry.size