Yahoo Web Scrapes: What are the limits?

开发者 https://www.devze.com 2022-12-27 19:31 Source: the web

We are using a web scraper whose sleep function uses a random delay (so that the time between scrapes isn't always the same), but we are still getting blocked by Yahoo after 20-30 requests.

Does anyone know if there is a limit (e.g. 20 requests per minute, 200 per hour)? Right now the average gap between requests is around 3-6 seconds. Thanks for any help.


One request every 3-6 seconds is quite a low rate, so perhaps there is another problem with your crawler.

A few ideas:

  • set the User-Agent to something non-suspicious
  • set the Referer header to the same domain
  • try running your crawler from a different IP in case your current IP is blacklisted
  • try maintaining cookies

This will all be easier if you use a higher level library like Mechanize.
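As a rough illustration of the header and cookie ideas above, here is a minimal sketch using only the Python standard library instead of Mechanize. The User-Agent string, the Referer URL, and the function names `build_opener`, `polite_delay`, and `fetch_all` are all hypothetical choices made for this example, not values from the thread. Switching to a different IP is not shown, since that depends on your network setup.

```python
import http.cookiejar
import random
import time
import urllib.request

# Hypothetical values for illustration only; substitute your own.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"
REFERER = "https://search.yahoo.com/"


def build_opener():
    """Return an opener that keeps cookies and sends browser-like headers."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # Setting addheaders replaces the default "Python-urllib/x.y" User-Agent.
    opener.addheaders = [("User-Agent", USER_AGENT), ("Referer", REFERER)]
    return opener, jar


def polite_delay(low=3.0, high=6.0):
    """Pick a random pause in seconds (the 3-6 s range from the question)."""
    return random.uniform(low, high)


def fetch_all(urls, opener):
    """Fetch each URL in turn, sleeping a random interval between requests."""
    pages = []
    for url in urls:
        with opener.open(url, timeout=30) as response:
            pages.append(response.read())
        time.sleep(polite_delay())
    return pages
```

Calling `opener, jar = build_opener()` once and reusing the same opener for every request is what lets cookies set by one response be sent back on the next.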


So the answer is 5000 queries. Taken from:

http://forums.digitalpoint.com/showthread.php?t=736784

http://developer.yahoo.com/search/rate.html
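To put the 5000 figure next to the 3-6 second gap from the question, here is a small back-of-the-envelope sketch. It assumes the quota applies per 24-hour window, which the answer above does not state, so treat the window as a placeholder. The function names are made up for this example.

```python
# Assumption: the 5000-query limit is counted over a 24-hour window.
DAILY_QUOTA = 5000
SECONDS_PER_DAY = 24 * 60 * 60


def min_interval(quota=DAILY_QUOTA, window=SECONDS_PER_DAY):
    """Smallest average gap in seconds that stays within the quota."""
    return window / quota


def hours_until_exhausted(avg_gap, quota=DAILY_QUOTA):
    """Hours of continuous crawling before the quota is used up."""
    return quota * avg_gap / 3600


print(min_interval())            # 17.28 seconds between requests
print(hours_until_exhausted(3))  # about 4.2 hours at a 3 s gap
print(hours_until_exhausted(6))  # about 8.3 hours at a 6 s gap
```

Under that assumption, a crawler averaging 3-6 seconds per request would need to slow to roughly one request every 17 seconds to run around the clock without exceeding the quota.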
