I'm writing a script to copy compiled files from one location to another.
What I have at the moment is something like this:
import os
import shutil
shutil.copy2 (src, dst)
#... many more shutil.copy commands
#src is a filename string
#dst is the directory where the file is to be copied
My problem is that many of the files being copied are large, and not all of them are re-compiled in every compile cycle. Ideally, I would like this script to copy only the files that have changed. Is there any way I can do this?
You can do a smart copy using distutils.file_util.copy_file by setting the optional argument update=1. With it, the file is copied only if the destination does not exist or is older than the source.
There is also a version that copies entire directories: distutils.dir_util.copy_tree.
You can then verify that either of those is actually working and only copying required files by enabling logging:
import distutils.log
import distutils.dir_util
distutils.log.set_verbosity(distutils.log.DEBUG)
distutils.dir_util.copy_tree(
src_dir,
dst_dir,
update=1,
verbose=1,
)
which prints which files were copied.
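Note that distutils was deprecated in Python 3.10 and removed in Python 3.12, so the calls above only work on older interpreters. A minimal sketch of a similar "update only" tree copy using just os and shutil might look like the following. The function name copy_tree_updated is made up for illustration, and the comparison is a plain modification-time check, which is an assumption about what "changed" should mean here.

```python
import os
import shutil


def copy_tree_updated(src_dir, dst_dir):
    """Copy files under src_dir into dst_dir, skipping up-to-date ones.

    Hypothetical helper: returns the destination paths actually written.
    """
    copied = []
    for root, _dirs, files in os.walk(src_dir):
        # Mirror the directory layout below dst_dir.
        rel = os.path.relpath(root, src_dir)
        target_root = os.path.normpath(os.path.join(dst_dir, rel))
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(target_root, name)
            # Copy when the target is missing or older than the source.
            if (not os.path.exists(dst)
                    or os.path.getmtime(src) > os.path.getmtime(dst)):
                shutil.copy2(src, dst)
                copied.append(dst)
    return copied
```

Because shutil.copy2 preserves the modification time, a second run over an unchanged tree should return an empty list.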
You could make use of the file modification time, if that's enough for you:
# Copy only if src is more than 1 second newer than dst
if os.stat(src).st_mtime - os.stat(dst).st_mtime > 1:
    shutil.copy2(src, dst)
Or call a synchronization tool like rsync.
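If rsync is installed, calling it from the script could be sketched as below. The helper names build_rsync_command and sync_with_rsync are hypothetical; the only rsync option used is -a (archive mode), which preserves timestamps so that rsync's default size-and-mtime check can skip unchanged files on later runs.

```python
import shutil
import subprocess


def build_rsync_command(src_dir, dst_dir):
    """Return the argument list for syncing the contents of src_dir into dst_dir."""
    # A trailing slash on the source tells rsync to copy the directory's
    # contents rather than the directory itself.
    return ["rsync", "-a", src_dir.rstrip("/") + "/", dst_dir]


def sync_with_rsync(src_dir, dst_dir):
    """Run rsync; raises if rsync is missing or exits with an error."""
    if shutil.which("rsync") is None:
        raise RuntimeError("rsync not found on PATH")
    subprocess.run(build_rsync_command(src_dir, dst_dir), check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.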
You could give this Python implementation of rsync a try:
http://freshmeat.net/projects/pysync/
If you don't have a definite reason for needing to code this yourself in Python, I'd suggest using rsync. From its man page:
Rsync is a fast and extraordinarily versatile file copying tool. It is famous for its delta-transfer algorithm, which reduces the amount of data sent over the network by sending only the differences between the source files and the existing files in the destination.
If you do wish to code this in Python, however, then the place to begin would be to study filecmp.cmp.
How would you like to detect changed files? You can call os.path.getmtime(path) on the source and check whether it is newer than some stored timestamp (the last time you copied, for instance), or use filecmp.cmp(f1, f2[, shallow]) to check whether the two files differ.
Watch out with filecmp.cmp: because copy2 also copies the stat information, you have to check whether a shallow compare is good enough for you.
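As a rough illustration of the filecmp route, the sketch below forces a full content comparison with shallow=False, so the result does not depend on the stat information that copy2 carries over. The function name copy_if_content_differs is invented for this example, and dst is assumed to be a full file path rather than a directory.

```python
import filecmp
import os
import shutil


def copy_if_content_differs(src, dst):
    """Copy src to the file path dst unless both files hold identical bytes.

    Hypothetical helper: returns True when a copy was made.
    """
    # shallow=False compares file contents instead of trusting
    # the os.stat() signatures (type, size, modification time).
    if os.path.exists(dst) and filecmp.cmp(src, dst, shallow=False):
        return False
    shutil.copy2(src, dst)
    return True
```

A content comparison has to read both files, so for very large files the modification-time check is likely to be cheaper.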
To build on AndiDog's answer, if you have files that might not exist in the destination folder:
# Copy the file if the destination does not exist or is older by more than a second
if not os.path.exists(dest) or os.stat(src).st_mtime - os.stat(dest).st_mtime > 1:
    shutil.copy2(src, dest)
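One caveat: in the question, dst is a directory, so calling os.stat on it would report the directory's modification time rather than that of the copied file. A sketch that first builds the target file path is shown below; the name copy_if_newer and the tolerance parameter are assumptions made for illustration.

```python
import os
import shutil


def copy_if_newer(src, dst_dir, tolerance=1.0):
    """Copy src into dst_dir when the existing copy is missing or stale.

    Hypothetical helper: returns True when a copy was made.
    """
    # Compare against the file inside dst_dir, not the directory itself.
    target = os.path.join(dst_dir, os.path.basename(src))
    if (not os.path.exists(target)
            or os.path.getmtime(src) - os.path.getmtime(target) > tolerance):
        shutil.copy2(src, target)
        return True
    return False
```

Each shutil.copy2(src, dst) line in the original script could then become copy_if_newer(src, dst).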
From AndiDog's answer:
os.stat(dst).st_mtime - os.stat(src).st_mtime
is negative if the src file is newer, so it should be:
if os.stat(src).st_mtime - os.stat(dst).st_mtime > 1: