Is there something special about wget?

Developer https://www.devze.com 2023-04-05 03:38 Source: Internet
I was trying to download this web page, http://maps.googleapis.com/maps/api/geocode/xml?address=Coimbatore+&sensor=true (Google Maps API), using the bash command wget. But the response I got was a page informing me that the request was denied. I then downloaded the same URL using Python's urllib functions, which succeeded. So what is so special about wget? Or am I missing something?


Bash has a special meaning for the & character. You either need to precede it with a backslash ( \ ) or wrap the entire URL in single quotes ( ' ).
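
To make the difference visible without touching the network, here is a minimal sketch in which printf stands in for wget (that substitution, and the variable names, are my own and purely illustrative). It captures what the command actually receives in each case:

```shell
#!/usr/bin/env bash
# Illustration only: printf stands in for wget, so nothing is downloaded.
# The variable names (url, quoted, escaped, truncated) are hypothetical.
url='http://maps.googleapis.com/maps/api/geocode/xml?address=Coimbatore+&sensor=true'

# Single quotes: the whole URL reaches the command as one argument.
quoted=$(printf '%s' 'http://maps.googleapis.com/maps/api/geocode/xml?address=Coimbatore+&sensor=true')

# Backslash before &: the & is passed through literally as well.
escaped=$(printf '%s' http://maps.googleapis.com/maps/api/geocode/xml?address=Coimbatore+\&sensor=true)

# Unquoted: bash ends the command at &, runs it in the background, and then
# treats sensor=true as a separate variable assignment. The command only
# ever sees the part of the URL before the &.
truncated=$(printf '%s' http://maps.googleapis.com/maps/api/geocode/xml?address=Coimbatore+&sensor=true)

printf 'quoted:    %s\n' "$quoted"
printf 'escaped:   %s\n' "$escaped"
printf 'truncated: %s\n' "$truncated"
```

In the unquoted case the sensor parameter never reaches the server, which is presumably why the request came back as denied when run through wget from the shell, while the Python call (which passes the URL as a plain string) worked.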


This is because special characters are interpreted by bash (? for instance, and & which runs the process in the background). Just wrap the URL in single quotes, and it should work.


I can imagine this is because of robots.txt.

You could tweak the User-Agent to (potentially) get past it.

If you have permission (!!!) of the website owner, you could ignore robots.txt by passing -erobots=off to wget, so:

wget -erobots=off \
    'http://maps.googleapis.com/maps/api/geocode/xml?address=Coimbatore+&sensor=true'