What can I do with Hadoop and Nutch used together as a search engine? I know that Nutch is used to build a web crawler, but I can't see the full picture. Can I use MapReduce with Nutch and run my own MapReduce jobs? Any ideas are welcome, and a few links would be greatly appreciated. Thanks.
If you only want to run Map/Reduce jobs, you don't need Nutch, just Hadoop. Hadoop gives you a distributed file system (HDFS) and a scheduler that runs map/reduce jobs over the data stored in it.
Because Nutch is built on top of Hadoop, you can write your own map/reduce jobs over Nutch data, as long as you understand its data structures and what the crawler is doing.
However, if all you want is to run some map/reduce jobs, just install Hadoop and off you go.
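To give a feel for what such a job looks like, here is a minimal sketch of a map step and a reduce step that count crawled URLs per host. It assumes, purely for illustration, that the input is plain text with one URL per line; the real Nutch data structures are different and would need their own parsing. The function names (`map_line`, `reduce_pairs`, `run_local`) are hypothetical, and the sort in `run_local` only imitates in one process the shuffle that Hadoop performs between the two steps.

```python
#!/usr/bin/env python3
"""Count URLs per host, written as a map step and a reduce step."""
import sys
from itertools import groupby
from urllib.parse import urlparse


def map_line(line):
    """Map step: yield (host, 1) for one URL; skip blank or host-less lines."""
    host = urlparse(line.strip()).netloc
    if host:
        yield host, 1


def reduce_pairs(pairs):
    """Reduce step: sum the counts per host; pairs must be sorted by host."""
    for host, group in groupby(pairs, key=lambda kv: kv[0]):
        yield host, sum(count for _, count in group)


def run_local(lines):
    """Imitate map -> sort (shuffle) -> reduce inside a single process."""
    mapped = [pair for line in lines for pair in map_line(line)]
    return dict(reduce_pairs(sorted(mapped)))


if __name__ == "__main__":
    # Assumed input: one URL per line on standard input.
    for host, total in run_local(sys.stdin).items():
        print(f"{host}\t{total}")
```

You can try it locally by piping a file of URLs into the script. On a real cluster the map and reduce logic would typically live in separate mapper and reducer programs (for example via Hadoop Streaming, or as a Java job), with Hadoop doing the sorting and distribution in between.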