I am trying to download all the wmv files that have the word 'high' in their name from a website, using wget with the following command:
wget -A "*high*.wmv" -r -H -l1 -nd -np -erobots=off http://mywebsite.com -O yl-`date +%H%M%S`.wmv
The file starts and finishes downloading but just after it downloads I get
Removing yl-120058.wmv since it should be rejected.
- Why is that, and how can I avoid it?
- How can I make the command spider the whole website for those types of files automatically?
It's because the accept list is being checked twice, once before downloading and once after saving. The latter is the behavior you see here ("it's not a bug, it's a feature"):
Your saved file yl-120058.wmv (the name you forced with -O) does not match your specified pattern -A "*high*.wmv", so it is rejected and deleted.
Quote from wget manual:
Finally, it's worth noting that the accept/reject lists are matched twice against downloaded files: [..] the local file's name is also checked against the accept/reject lists to see if it should be removed. [..] However, this can lead to unexpected results.
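The second check can be approximated with an ordinary shell glob. This is only a sketch: accept_check is a hypothetical helper, not part of wget, and it assumes wget's wildcard matching behaves like shell pattern matching for this simple pattern.

```shell
# Hypothetical helper (not part of wget): mimics the post-download
# accept-list check by matching a local file name against a glob.
# $1 = local file name, $2 = accept pattern
accept_check() {
    case "$1" in
        $2) echo "kept" ;;     # name matches the accept pattern
        *)  echo "removed" ;;  # name does not match, so it would be deleted
    esac
}

# The name forced by -O contains no "high", so it fails the check.
accept_check "yl-120058.wmv" "*high*.wmv"    # prints "removed"

# A made-up server-side name that does contain "high" passes.
accept_check "intro-high.wmv" "*high*.wmv"   # prints "kept"
```

If that reading is right, dropping -O so that each file keeps its original name (which already contains 'high') should avoid the deletion. That suggestion is an inference from the quoted manual text, not something tested against your site.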