I have two directories of images with mismatching names, but mostly matching images.
Dir 1           Size    | Dir 2                    Size
----------------------------------------------------------
img1.jpg        508960  | a_image_name.jpg         1038644
img2.jpg        811430  | another_image_name.jpg   396240
...             ...     | ...                      ...
img1000.jpg     602583  | image_name.jpg           811430
...             ...     |
img2000.jpg     396240  |
The first directory has more images, but they are misnamed. The second directory has the correct names, but its files are not in the same order as those in the first directory.
I'd like to rename files in Dir 1 by comparing file size (or some other way) to Dir 2. In the above example img2.jpg would be renamed to image_name.jpg because both have the same file size.
Can you point me in the right direction?
Preferably by way of a Mac app, shell, or PHP.
Maybe it would be wiser to use hashes of the files instead of using the filesize?
In short: use glob() to get a list of the files in dir1, iterate over it, create an MD5 hash of each file (md5() + file_get_contents()), and store the results in an array, using the hash as key and the file name as value. Do the same for dir2.
Then iterate over array1; if an entry with the same hash exists in array2, rename the file.
The code will be something like this (untested, unoptimized):
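$dir1 = array();
$dir2 = array();
// get hashes for dir1 (hash => full path)
foreach( glob( '/path/to/dir1/*.jpg' ) as $file ) {
    $hash = md5( file_get_contents( $file ) );
    $dir1[ $hash ] = $file;
}
// get hashes for dir2 in the same way
foreach( glob( '/path/to/dir2/*.jpg' ) as $file ) {
    $hash = md5( file_get_contents( $file ) );
    $dir2[ $hash ] = $file;
}
// rename each dir1 file whose content also exists in dir2
foreach( $dir1 as $hash => $file1 ) {
    if( array_key_exists( $hash, $dir2 ) ) {
        // keep the file in dir1, but give it the file name used in dir2;
        // passing $dir2[ $hash ] directly would overwrite the dir2 file
        $target = dirname( $file1 ) . '/' . basename( $dir2[ $hash ] );
        rename( $file1, $target );
    }
}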
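Since the question also asks for a shell option, here is a rough bash sketch of the same hash-based idea. It is not part of the answer above; the function names rename_by_hash and hash_of and the temporary index file are my own assumptions, and it assumes either md5sum (Linux) or md5 -q (macOS) is available. It skips a rename when the target name is already taken in dir1.

```shell
#!/usr/bin/env bash
# Sketch only: rename files in one directory to the names of
# content-identical files in another directory, matched by MD5.

# Print the MD5 of a file (md5sum on Linux, md5 -q on macOS).
hash_of() {
    if command -v md5sum >/dev/null 2>&1; then
        md5sum < "$1" | cut -d' ' -f1
    else
        md5 -q "$1"
    fi
}

# rename_by_hash SRC_DIR REF_DIR  (hypothetical helper)
rename_by_hash() {
    local src=$1 ref=$2 index f h new
    index=$(mktemp)

    # Build an index of "hash name" lines for the correctly named files.
    for f in "$ref"/*.jpg; do
        [ -f "$f" ] || continue
        printf '%s %s\n' "$(hash_of "$f")" "$(basename "$f")"
    done > "$index"

    # Look up each misnamed file by hash and rename it inside SRC_DIR.
    for f in "$src"/*.jpg; do
        [ -f "$f" ] || continue
        h=$(hash_of "$f")
        new=$(sed -n "s/^$h //p" "$index" | head -n 1)
        if [ -n "$new" ] && [ ! -e "$src/$new" ]; then
            echo "mv '$f' '$src/$new'"
            mv -- "$f" "$src/$new"
        fi
    done

    rm -f "$index"
}
```

Usage would be something like rename_by_hash /path/to/dir1 /path/to/dir2, ideally on a copy of dir1 first. File names containing newlines are not handled.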
Here is my solution, which renames files in dir1 based on file size.
Contents of dir1:
-rw-r--r-- 1 haiv staff 10 Aug 16 13:18 file1.txt
-rw-r--r-- 1 haiv staff 20 Aug 16 13:18 file2.txt
-rw-r--r-- 1 haiv staff 30 Aug 16 13:18 file3.txt
-rw-r--r-- 1 haiv staff 205 Aug 16 13:18 file4.txt
(Note the fifth column stores the file sizes.) And the contents of dir2:
-rw-r--r-- 1 haiv staff 30 Aug 16 13:18 doc.txt
-rw-r--r-- 1 haiv staff 205 Aug 16 13:18 dopey.txt
-rw-r--r-- 1 haiv staff 20 Aug 16 13:18 grumpy.txt
-rw-r--r-- 1 haiv staff 10 Aug 16 13:18 happy.txt
Create a file called ~/rename.awk (yes, in the home directory, to avoid polluting either dir1 or dir2):
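/^total/ {next} # Skip the "total" line that ls -l prints first
{
    if (name[$5] == "") {
        # Size not seen yet (dir2 listing): remember which name goes with it
        name[$5] = $NF
        print "# File of size", $5, "should be named", $NF
    } else {
        # Size already seen (dir1 listing): emit the rename command
        printf "mv '%s' '%s'\n", $NF, name[$5]
    }
}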
Now, cd into dir1 (if you want to rename files in dir1), and issue the following command:
$ awk -f ~/rename.awk <(ls -l ../dir2) <(ls -l)
Output:
# File of size 30 should be named doc.txt
# File of size 205 should be named dopey.txt
# File of size 20 should be named grumpy.txt
# File of size 10 should be named happy.txt
mv 'file1.txt' 'happy.txt'
mv 'file2.txt' 'grumpy.txt'
mv 'file3.txt' 'doc.txt'
mv 'file4.txt' 'dopey.txt'
Once you are happy with the result, pipe the above command to sh to execute the changes:
$ awk -f ~/rename.awk <(ls -l ../dir2) <(ls -l) | sh
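Before piping to sh, it may be worth checking whether any file size occurs more than once, since the size-based matching cannot tell such files apart. The following is only a sketch; duplicate_sizes and size_of are names I made up, and sizes are read with wc -c so that it should behave the same on macOS and Linux.

```shell
#!/usr/bin/env bash
# Sketch only: list file sizes that occur more than once in a directory.

# Print the size of a file in bytes, without surrounding whitespace.
size_of() {
    wc -c < "$1" | tr -d '[:space:]'
}

# duplicate_sizes DIR  (hypothetical helper)
# Prints each size shared by two or more regular files in DIR.
duplicate_sizes() {
    local dir=$1 f
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        size_of "$f"
        echo
    done | sort -n | uniq -d
}
```

Running duplicate_sizes ../dir2 and duplicate_sizes . from inside dir1 should print nothing if every size is unique; any size it does print marks files that the awk script could mismatch.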
Notes:
- There is no safeguard against several files having the same size. For that case, the MD5 solution that wonk0 offered works better.
- Please examine the output before you commit. Changes are permanent.
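One possible way to add a safeguard while keeping the size-based approach is to treat equal size only as a first filter and confirm each candidate with a byte-by-byte comparison using cmp -s. This is a sketch under my own assumptions, not part of the solution above: rename_verified and size_of are hypothetical names, and a rename is skipped when the target name already exists in dir1.

```shell
#!/usr/bin/env bash
# Sketch only: match files by size, then confirm with cmp before renaming.

# Print the size of a file in bytes, without surrounding whitespace.
size_of() {
    wc -c < "$1" | tr -d '[:space:]'
}

# rename_verified SRC_DIR REF_DIR  (hypothetical helper)
rename_verified() {
    local src=$1 ref=$2 f r fsize target
    for f in "$src"/*; do
        [ -f "$f" ] || continue
        fsize=$(size_of "$f")
        for r in "$ref"/*; do
            [ -f "$r" ] || continue
            # Cheap check first: different sizes can never match.
            [ "$fsize" = "$(size_of "$r")" ] || continue
            # Same size: only rename if the contents are identical.
            if cmp -s "$f" "$r"; then
                target=$src/$(basename "$r")
                [ -e "$target" ] || mv -- "$f" "$target"
                break
            fi
        done
    done
}
```

With the example directories above, rename_verified dir1 dir2 would only rename a file such as file1.txt to happy.txt if the two files really have the same bytes, not merely the same length. It compares every pair of files, so it may be slow for large directories.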