I've set up robots.txt to stop Google from crawling my site, since it's still under development:
# robots.txt
User-agent: *
Disallow: /
However, the log shows that Googlebot still visits seemingly random parts of my site, one page at a time. Why does it do this, and how can I prevent it?
Log extract:
66.249.72.174 - - [07/May/2011:08:12:11 -0700] "GET /?page=2&atype=new&filter=h HTTP/1.1" 200 10156 - "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html),gzip(gfe),gzip(gfe),gzip(gfe)"
I do want Google to crawl my site eventually, just not yet.
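
For reference, here is a minimal sketch of how the rules above can be checked locally with Python's standard `urllib.robotparser`. The helper name `is_blocked` is made up for illustration, and the check only shows how a standards-following parser reads the rules; it says nothing about what Googlebot actually does or which copy of robots.txt it has cached.

```python
from urllib.robotparser import RobotFileParser

# The rules from the question, as they appear in robots.txt.
ROBOTS_TXT = """\
# robots.txt
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules forbid user_agent from fetching url.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The URL from the log extract.
    print(is_blocked(ROBOTS_TXT, "Googlebot", "/?page=2&atype=new&filter=h"))
```

When the site is ready, an empty `Disallow:` line under `User-agent: *` allows everything again, and the same helper can be used to confirm that.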