robots.txt disallow: spider

Developer (开发者) https://www.devze.com 2023-03-31 01:48 Source: Internet
I'm looking at the robots.txt file of a site I would like to do a one-off scrape of, and it contains these lines:

User-agent: spider
Disallow: /

Does this mean they don't want any spiders? I was under the impression that * was used for all spiders. If true, this would of course stop spiders such as Google.


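This just asks agents that call themselves "spider" to be polite enough not to browse the site.

The name "spider" has no special meaning; it is matched like any other user-agent name, so crawlers that identify themselves differently (Google's, for example) are not covered by this rule.

robots.txt files are read only by robots, so the way to exclude all robots is to use a *: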

User-Agent: *
Disallow: /
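
To see the difference between the two rule sets, here is a minimal sketch using Python's standard-library urllib.robotparser. The helper name allowed, the example.com URL, and the agent names passed in are assumptions made for illustration; they are not from the site in question. Python's parser has its own matching rules, so other crawlers may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# The two rule sets discussed above, as they would appear in robots.txt.
SPIDER_ONLY = "User-agent: spider\nDisallow: /\n"
EVERYONE = "User-Agent: *\nDisallow: /\n"


def allowed(robots_txt: str, agent: str,
            url: str = "https://example.com/page") -> bool:
    """Hypothetical helper: parse robots.txt text and ask whether
    `agent` may fetch `url` (example.com is a placeholder)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for label, rules in (("spider only", SPIDER_ONLY), ("wildcard", EVERYONE)):
        for agent in ("spider", "Googlebot"):
            print(f"{label:12} {agent:10} allowed={allowed(rules, agent)}")
```

With the first rule set only the agent named spider is refused; with the wildcard every agent is. Either way, robots.txt is a request that well-behaved robots honour, not an access control.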