Will Nutch, the spider, index webpages it already has in its index?

Developer https://www.devze.com 2023-02-16 21:38 Source: Internet
Does Nutch index pages again if they're already in the index? If so, how do I change this?


Yes and no. By default, Nutch will re-index a page only after a certain period, 1 month (from memory). If the page hasn't changed, it will increase the re-indexing interval, up to a maximum of 3 months by default. All of these settings are configurable in nutch-site.xml.
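
As a rough sketch of what such an override might look like, assuming the Nutch 1.x property names db.fetch.interval.default, db.fetch.interval.max and db.fetch.schedule.class (check nutch-default.xml in your own version, since names and defaults can differ between releases). The values are in seconds and are purely illustrative, not recommendations:

```xml
<?xml version="1.0"?>
<!-- conf/nutch-site.xml: overrides for values defined in nutch-default.xml -->
<configuration>

  <!-- Assumed property name: how long to wait before re-fetching a page.
       604800 s = 7 days (illustrative; the stock default is about 30 days). -->
  <property>
    <name>db.fetch.interval.default</name>
    <value>604800</value>
  </property>

  <!-- Assumed property name: upper bound on the re-fetch interval.
       2592000 s = 30 days (illustrative; the stock default is about 90 days). -->
  <property>
    <name>db.fetch.interval.max</name>
    <value>2592000</value>
  </property>

  <!-- Assumption: growing the interval for unchanged pages depends on the
       adaptive schedule class being selected here. -->
  <property>
    <name>db.fetch.schedule.class</name>
    <value>org.apache.nutch.crawl.AdaptiveFetchSchedule</value>
  </property>

</configuration>
```

A page is only re-indexed after it has been re-fetched, so shortening these intervals is what makes already-indexed pages come up for refresh sooner.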
