Web crawling help required

Developer https://www.devze.com 2022-12-21 06:20 Source: Internet
Hi, I am completing a little hobby project of mine to create a small-scale search engine.

I was wondering if anyone knows of a decent, robust open-source web crawler that they have actually used? It should be easy for a noob to set up and use.

Thank you for not googling web crawlers and pasting a list.


crawler4j is a pretty decent crawler, multi-threaded and easy to configure and use. It's written in Java.

You can find a list of open-source crawlers on Wikipedia's web crawler page.
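
For a feel of what a crawler like this automates for you, here is a minimal sketch of the core crawl loop in plain Java. This is not crawler4j's API. The class name MiniCrawler, the regex-based link extraction, and the pluggable fetch function are all assumptions made for illustration; a real crawler would also handle robots.txt, politeness delays, relative URLs, and multiple threads.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of crawler4j.
class MiniCrawler {
    // Naive matcher for double-quoted href values; a real crawler would use an HTML parser.
    private static final Pattern HREF = Pattern.compile("href=\"([^\"]+)\"");

    /** Returns every double-quoted href value found in the HTML, in document order. */
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    /**
     * Breadth-first crawl starting at seed, attempting at most maxPages URLs.
     * fetch maps a URL to its HTML, or returns null when the page is unavailable.
     * Returns the URLs attempted, in the order they were taken from the frontier.
     */
    static Set<String> crawl(String seed, int maxPages, Function<String, String> fetch) {
        Set<String> visited = new LinkedHashSet<>();
        Queue<String> frontier = new ArrayDeque<>();
        frontier.add(seed);
        while (!frontier.isEmpty() && visited.size() < maxPages) {
            String url = frontier.poll();
            if (!visited.add(url)) {
                continue; // already attempted this URL
            }
            String html = fetch.apply(url);
            if (html == null) {
                continue; // fetch failed, so there is nothing to parse
            }
            for (String link : extractLinks(html)) {
                if (!visited.contains(link)) {
                    frontier.add(link);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        // In-memory stand-in for the web so the sketch runs offline.
        Map<String, String> site = Map.of(
            "http://example.test/", "<a href=\"http://example.test/a\">A</a> <a href=\"http://example.test/b\">B</a>",
            "http://example.test/a", "<a href=\"http://example.test/\">home</a>",
            "http://example.test/b", "no links here");
        System.out.println(crawl("http://example.test/", 10, site::get));
    }
}
```

Passing the fetch step in as a function keeps the loop testable without network access; swapping in a real HTTP client is where most of the robustness work a library does for you would begin.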


I think you should read about a similar experience, the paper describing the original Google prototype:

http://infolab.stanford.edu/~backrub/google.html
