I just went through the Solr wiki page on clustering, but I don't understand the benefit of using it. Can anyone tell me what clustering actually is and what its use is in indexing and searching?
Please reply
Clustering is a statistical technique for sorting data into groups of items that belong together. In Solr specifically, this means that it will try to group the results of a given query and label those groups.
This can give you additional information about the nature of the results returned. Example: if you search for 'Python' on a very broad set of documents, the clustering component might create groups for 'The Python programming language', 'Python the snake', etc.
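To make the 'Python' example concrete, here is a toy sketch of the idea. It is not what Carrot2 actually does internally (its algorithms are considerably more sophisticated); it simply groups result titles by the most widely shared term that is not part of the query. The function name `toy_cluster` and the stopword list are made up for illustration.

```python
from collections import Counter, defaultdict

# Tiny, hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "of", "in", "for", "and", "to"}


def toy_cluster(query, titles):
    """Group titles by their most widely shared non-query term.

    Returns a dict mapping a label (a term) to the list of title indexes
    assigned to it. This is an illustration only, not Carrot2's algorithm.
    """
    ignore = STOPWORDS | set(query.lower().split())
    # Terms of each title, minus stopwords and the query terms themselves.
    doc_terms = [set(t.lower().split()) - ignore for t in titles]
    # Document frequency: in how many titles does each term occur?
    df = Counter(term for terms in doc_terms for term in terms)

    clusters = defaultdict(list)
    for i, terms in enumerate(doc_terms):
        if not terms:
            clusters["(other)"].append(i)
            continue
        # Pick the most common term; break ties alphabetically.
        label = min(terms, key=lambda t: (-df[t], t))
        clusters[label].append(i)
    return dict(clusters)
```

Given titles such as "Python programming language tutorial" and "Feeding a pet python snake", the programming results and the snake results land in separate groups, each labelled by a shared term.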
Have a look at the Carrot2 demo site (Carrot2 is the clustering engine shipped with Solr):
http://search.carrot2.org/stable/search
Solr's clustering component (Carrot2) clusters the documents using the text fields that Solr returns in the result list. (The fields used are configurable.) It uses the terms in those text fields to build the clusters and label them.
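As a rough sketch of how the fields are chosen per request: the snippet below only builds a query URL and does not contact any server. The parameter names (`clustering`, `clustering.results`, `carrot.title`, `carrot.snippet`) are the ones I remember from the clustering component in older Solr releases, and the `/clustering` handler path follows the example configuration; both may differ in your version, so check the wiki page. The base URL and the field names `title` and `body` are placeholders.

```python
from urllib.parse import urlencode


def build_clustering_url(base_url, query, title_field, snippet_field, rows=100):
    """Build a Solr URL that asks for clustered results (no request is sent).

    Assumes a request handler named /clustering with the clustering
    component enabled, as in Solr's example configuration.
    """
    params = {
        "q": query,
        "rows": rows,                     # clustering works on the returned rows
        "clustering": "true",             # turn the component on
        "clustering.results": "true",     # cluster the search results
        "carrot.title": title_field,      # field used as the document title
        "carrot.snippet": snippet_field,  # field used as the document text
        "wt": "json",
    }
    return base_url.rstrip("/") + "/clustering?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical host and field names.
    print(build_clustering_url("http://localhost:8983/solr", "python", "title", "body"))
```

If it works as I expect, the response then carries a list of clusters alongside the normal result list, each with its labels and the ids of the documents in it.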
There is a very interesting presentation on the Carrot2 website:
http://project.carrot2.org/publications/carrot2-dresden-2007.pdf