How do I disallow a folder in robots.txt, except for certain files?

Source: https://www.devze.com 2023-01-12 10:23
I have a situation where I want to disallow the crawling of certain pages within a directory. The directory contains a large number of files, but there are a few that still need to be indexed. If I have to disallow each page individually, I will end up with a very large robots.txt file. Is there a way to disallow a folder in robots.txt except for certain files?


There is a non-standard extension to the robots.txt format for specifying "Allow" rules. Not every bot honors it, and some bots process it differently than others.

You can read more about it in this Wikipedia article: http://en.wikipedia.org/wiki/Robots_exclusion_standard#Allow_directive
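As a minimal sketch of how such a rule set might look, the snippet below builds a robots.txt with an Allow exception and sanity-checks it with Python's standard-library parser. The folder and file names (`/private/`, `keep-me.html`) are hypothetical; note that some crawlers match rules in order of appearance while others (e.g. Google) use the most specific match, so listing Allow before Disallow is the more portable form.

```python
# Sketch: an Allow exception inside a disallowed folder, checked with the
# stdlib robots.txt parser. Folder and file names are made-up examples.
import urllib.robotparser

robots_txt = """\
User-agent: *
Allow: /private/keep-me.html
Disallow: /private/
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(robots_txt.splitlines())

# The Allow rule takes precedence for the excepted file...
print(rp.can_fetch("*", "https://example.com/private/keep-me.html"))  # True
# ...while every other page in the folder stays disallowed.
print(rp.can_fetch("*", "https://example.com/private/other.html"))    # False
```

Keep in mind this only demonstrates how a compliant parser reads the rules; whether a given crawler actually respects Allow is up to that crawler.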


To get that sort of fine-grained control, you might be better off using a robots meta tag in your HTML. That is assuming the files in question are all HTML.

<meta name="robots" content="noindex" />

This tag should be placed in the <head> of your document.

I also find these tags easier to maintain than robots.txt.
