I have a list of URLs which I would like to feed into wget using --input-file.
However I can't work out how to control the --output-document value at the same time, which is simple if you issue the commands one by one. I would like to save each document as the MD5 of its URL.
cat url-list.txt | xargs -P 4 wget
xargs is there because I also want to use its max-procs (-P) feature for parallel downloads.
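For a single URL the per-file command is easy enough, along these lines (assuming GNU coreutils md5sum; the cut drops the filename field that md5sum appends to the hash):

wget "$url" --output-document "$(echo "$url" | md5sum | cut -d' ' -f1)"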
Don't use cat. You can have xargs read from a file. From the man page:
--arg-file=file, -a file
       Read items from file instead of standard input. If you use this option, stdin remains unchanged when commands are run. Otherwise, stdin is redirected from /dev/null.
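Combined with -P and a small sh -c wrapper, that gives one way to name each file by the MD5 of its URL (a sketch, assuming GNU xargs and md5sum; -I{} makes xargs run one wget per URL):

xargs -a url-list.txt -P 4 -I{} sh -c 'wget "$1" --output-document "$(echo "$1" | md5sum | cut -d" " -f1)"' _ {}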
how about using a loop?
while read -r line
do
    # cut keeps only the hash; md5sum also prints a filename field
    md5=$(echo "$line" | md5sum | cut -d' ' -f1)
    wget ... "$line" ... --output-document "$md5" ...
done < url-list.txt
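If you still want the parallelism you were getting from -P 4, the loop can background each wget and throttle itself (a sketch, assuming bash 4.3+ for wait -n):

while read -r line
do
    md5=$(echo "$line" | md5sum | cut -d' ' -f1)
    wget "$line" --output-document "$md5" &
    # keep at most 4 downloads running at once
    while [ "$(jobs -rp | wc -l)" -ge 4 ]; do wait -n; done
done < url-list.txt
wait    # let the last batch finish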
In your question you use -P 4, which suggests you want the downloads to run in parallel. GNU Parallel (http://www.gnu.org/software/parallel/) may help you:
cat url-list.txt | parallel 'wget {} --output-document "$(echo {} | md5sum | cut -d" " -f1)"'
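By default GNU Parallel runs one job per CPU core; to mirror the -P 4 behaviour you can cap the job count explicitly (a sketch, reading the file directly instead of using cat):

parallel -j 4 'wget {} --output-document "$(echo {} | md5sum | cut -d" " -f1)"' < url-list.txt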
You can do that like this:
cat url-list.txt | while read url; do wget "$url" -O "$(echo "$url" | md5)"; done
(md5 is the BSD/macOS equivalent of md5sum; it prints only the hash, so no extra trimming is needed.)
good luck