Is it possible to configure Solr so that the document similarity score falls in a fixed range, for example from 0 (no match) to 1 (complete match between document and query)?
Thanks!
No, tf-idf doesn't work like that, and conceptually search doesn't really work like that. How would one define a 'complete match'?
You need this for some kind of UI meter? Maybe you should look at cosine similarity between documents, http://en.wikipedia.org/wiki/Cosine_similarity , where the first document is the query.
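To make the cosine idea concrete, here is a minimal sketch that treats the query as a tiny document and compares raw term-count vectors. It is not how Solr or Lucene computes scores; the function name `cosine_similarity` and the whitespace tokenization are assumptions made for illustration only. Because term counts are non-negative, the result always lies between 0 and 1.

```python
import math
from collections import Counter


def cosine_similarity(query: str, document: str) -> float:
    """Cosine of the angle between two term-count vectors (hypothetical helper)."""
    # Naive tokenization: lowercase and split on whitespace (an assumption,
    # a real analyzer would also handle stemming, stop words, punctuation).
    q = Counter(query.lower().split())
    d = Counter(document.lower().split())

    # Dot product over the query terms; terms missing from d count as 0.
    dot = sum(count * d[term] for term, count in q.items())

    norm_q = math.sqrt(sum(c * c for c in q.values()))
    norm_d = math.sqrt(sum(c * c for c in d.values()))

    # An empty query or document has no direction, so report no match.
    if norm_q == 0 or norm_d == 0:
        return 0.0
    return dot / (norm_q * norm_d)


if __name__ == "__main__":
    print(cosine_similarity("solr score", "solr score"))        # identical text
    print(cosine_similarity("solr score", "solr ranking score"))  # partial overlap
    print(cosine_similarity("solr", "lucene"))                    # no shared terms
```

Identical texts score (up to floating-point rounding) 1, texts with no terms in common score 0, and everything else falls in between, which is the kind of value a UI meter could display.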
It should be possible; you need to change Lucene's ranking function (Solr uses Lucene internally). You can replace the default implementation. I do not know how much time it would take to get it running, but if you really need a boolean retrieval engine, you can do it. You should start your investigation from this document.
I am not sure why you need such functionality, but I suppose you may want to use Solr as a key-value store. In that case, you need to change your indexing configuration: your analyzer should not tokenize the input text. The text will then be placed in the index without modification (the same analyzer is used for processing queries). Thus, if you provide a key in the query ("1234" for the field "MY_KEY"), you will get the corresponding document for this key.
No, I am not really talking about boolean queries, but thank you for the resource on Lucene Similarity & Scoring.
Well, I'm thinking along the lines of language models for information retrieval and wondering if anyone knows whether there is an implementation of this in Lucene/Solr:
http://nlp.stanford.edu/IR-book/html/htmledition/language-models-for-information-retrieval-1.html
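For anyone unfamiliar with the approach, the query likelihood model described in that chapter can be sketched roughly as follows. This is a toy illustration with linear interpolation (Jelinek-Mercer) smoothing, not a Lucene or Solr implementation; the function name `query_log_likelihood`, the token lists, and the default `lam=0.5` are all assumptions chosen for the example.

```python
import math
from collections import Counter


def query_log_likelihood(query, document, collection, lam=0.5):
    """log P(query | document) under a smoothed unigram language model.

    query, document, collection are lists of tokens; `collection` is the
    concatenation of all documents. `lam` weights the document model
    against the collection model (hypothetical default).
    """
    doc_counts = Counter(document)
    coll_counts = Counter(collection)
    doc_len = len(document)
    coll_len = len(collection)

    score = 0.0
    for term in query:
        # Maximum-likelihood estimates from the document and the collection.
        p_doc = doc_counts[term] / doc_len if doc_len else 0.0
        p_coll = coll_counts[term] / coll_len if coll_len else 0.0
        # Linear interpolation of the two models.
        p = lam * p_doc + (1.0 - lam) * p_coll
        if p == 0.0:
            # Term never seen anywhere: the query has zero probability.
            return float("-inf")
        score += math.log(p)
    return score


if __name__ == "__main__":
    doc_a = ["solr", "ranking", "function"]
    doc_b = ["cooking", "recipes"]
    collection = doc_a + doc_b
    query = ["solr", "ranking"]
    print(query_log_likelihood(query, doc_a, collection))
    print(query_log_likelihood(query, doc_b, collection))
```

Documents are ranked by this log-probability, so `doc_a` ranks above `doc_b` for the sample query. Note that the raw values are negative log-probabilities rather than a 0 to 1 match score, so they would still need some normalization for display.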