I assign a custom "popularity" score for each document in my Solr database. I want search results to be ordered by this custom "score" field rather than the built-in relevancy score that is the default.
First I define my score field:
<fieldType name="sint" class="solr.SortableIntField" sortMissingLast="true" omitNorms="true"/>
<field name="score" type="sint" stored="true" multiValued="false" />
Then I rebuild the index, inserting a score for each document. To run a query, I use something like this:
(text:hello)+_val_:"score"
Now I would expect the documents to come back sorted by the "score" field, but what I get instead is:
<doc>
<int name="score">566</int>
<str name="text">SF - You lost me at hello...</str>
</doc>
<doc>
<int name="score">41</int>
<str name="text">hello</str>
</doc>
<doc>
<int name="score">77</int>
<str name="text">
CAGE PAGE-SAY HELLO (MIKE GOLDEN's Life Is Bass Remix)-VIM
</str>
</doc>
<doc>
<int name="score">0</int>
<str name="text">Hello Hello Hello</str>
</doc>
Notice that the scores come back out of order: 566, 41, 77, 0. The weird thing is that it only sorts this way with certain queries. I'm not sure what the pattern is, but so far I've only seen the bad sorting when scores of "0" come back in the search results.
I've tried IntField instead of SortableIntField, and I've tried putting "sort=score desc" as a query parameter, with no change in behavior.
Am I doing something wrong, or just misunderstanding the meaning of using _val_:"score" in my query?
EDIT: I tried renaming the "score" field to "popularity" and got the same result.
The name "score" is used by Solr internally for the relevancy score, so it is probably not good practice to define a field with the same name.
Try defining the field under a different name; both of the options you mentioned should then work fine.
Edit - This is what I have, and it works fine (Solr 3.3)
Schema -
Field Type -
<fieldType name="sint" class="solr.SortableIntField" sortMissingLast="true" omitNorms="true"/>
Field -
<field name="popularity" type="int" indexed="true" stored="true" />
Data -
<add>
<doc>
<field name="id">1007WFP</field>
<field name="popularity">566</field>
<field name="text">SF - You lost me at hello...</field>
</doc>
<doc>
<field name="id">2007WFP</field>
<field name="popularity">41</field>
<field name="text">hello</field>
</doc>
<doc>
<field name="id">3007WFP</field>
<field name="popularity">77</field>
<field name="text">
CAGE PAGE-SAY HELLO (MIKE GOLDEN's Life Is Bass Remix)-VIM
</field>
</doc>
<doc>
<field name="id">4007WFP</field>
<field name="popularity">0</field>
<field name="text">Hello Hello Hello</field>
</doc>
</add>
Query -
http://localhost:8983/solr/select?q=*:*&sort=popularity%20desc
Results :-
<result name="response" numFound="4" start="0">
<doc>
<str name="id">1007WFP</str>
<int name="popularity">566</int>
</doc>
<doc>
<str name="id">3007WFP</str>
<int name="popularity">77</int>
</doc>
<doc>
<str name="id">2007WFP</str>
<int name="popularity">41</int>
</doc>
<doc>
<str name="id">4007WFP</str>
<int name="popularity">0</int>
</doc>
</result>
The _val_ hack actually ADDS the value of the "popularity" field to the relevancy score that Solr normally computes.
So, if you have popularity=41 on document A and popularity=77 on document B, but document A scores more than 36 points better than B for the keyword "hello", then A will be sorted before B.
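To make the arithmetic concrete, here is a minimal Python sketch of the additive behaviour described above. The relevancy numbers are made up purely for illustration (real Solr scores depend on the index and the query), and the names docs and combined_score are hypothetical, not part of Solr.

```python
# Hypothetical relevancy values chosen only to illustrate the effect;
# they are not taken from a real Solr response.
docs = [
    {"text": "SF - You lost me at hello...", "relevancy": 0.5, "popularity": 566},
    {"text": "hello", "relevancy": 40.0, "popularity": 41},
    {"text": "CAGE PAGE-SAY HELLO ...", "relevancy": 0.3, "popularity": 77},
    {"text": "Hello Hello Hello", "relevancy": 1.2, "popularity": 0},
]


def combined_score(doc):
    """Mimic q=text:hello _val_:"popularity": relevancy plus the field value."""
    return doc["relevancy"] + doc["popularity"]


# Ordering by the summed score (what the _val_ query does in this model).
by_combined = sorted(docs, key=combined_score, reverse=True)
# Ordering by the field alone (what sort=popularity desc does).
by_popularity = sorted(docs, key=lambda d: d["popularity"], reverse=True)

combined_order = [d["popularity"] for d in by_combined]
popularity_order = [d["popularity"] for d in by_popularity]

print(combined_order)
print(popularity_order)
```

With these invented numbers, the document with popularity 41 has a relevancy lead of more than 36 points over the one with popularity 77, so it moves ahead in the combined ordering even though the pure popularity ordering stays descending.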
Use the "sort" parameter (as you tried), which completely overrides the default ordering by score.
An alternative would be to move the keyword match into a filter query (parameter fq instead of q), which restricts the matching documents without contributing anything to the score, and then use _val_ in q to define your scoring formula. Since the filter query adds nothing to the score, the value from _val_ is unaffected and the ordering behaves as you originally expected.
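As a rough sketch of what the two requests might look like, the Python snippet below builds both URLs. It assumes the field is named popularity and that Solr runs at the same localhost address used in the example query above; the helper build_url is hypothetical and only wraps the standard library's urlencode.

```python
from urllib.parse import urlencode

# Assumed endpoint, copied from the example query earlier in the thread.
BASE = "http://localhost:8983/solr/select"


def build_url(params):
    """Return a select URL with the parameters percent-encoded."""
    return BASE + "?" + urlencode(params)


# Option 1: keep the keyword query in q and override ordering with sort.
sort_url = build_url({"q": "text:hello", "sort": "popularity desc"})

# Option 2: filter on the keyword with fq (no score contribution) and
# let the function query in q supply the score.
fq_url = build_url({"q": '_val_:"popularity"', "fq": "text:hello"})

print(sort_url)
print(fq_url)
```

The first form is usually the simpler choice when the goal is a plain ordering by the field; the second is more useful if the function in _val_ is later replaced by a richer formula.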