I have a Solr schema with a kind of versioning. Each ID contains the version number, so existing docs remain when new versions are indexed. Sample contents:
id = foo1
name = foo
version = 1
data = x
id = foo2
name = foo
version = 2
data = y
id = bar1
name = bar
version = 1
data = x
There are two distinct search scenarios: search all versions, or search only the latest. The first is trivial, but how do I implement a search in the data field for only the latest version of each name? In the sample above I wish to search for "x" in the latest versions, and expect to hit only "bar1".
I was hoping for a solution using http://wiki.apache.org/solr/FieldCollapsing, but if I search for "x" with group.field=name, Solr groups after searching, giving me version 1 of both names above (foo1 and bar1). I would need it to work more like a filter query.
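To make the ordering problem concrete, here is a small plain-Python imitation of the two behaviours on the sample documents. It does not talk to Solr at all; the function names are made up for illustration, and matching is simplified to an exact comparison on the data field.

```python
# Plain-Python imitation of the two orderings; no Solr involved.
DOCS = [
    {"id": "foo1", "name": "foo", "version": 1, "data": "x"},
    {"id": "foo2", "name": "foo", "version": 2, "data": "y"},
    {"id": "bar1", "name": "bar", "version": 1, "data": "x"},
]


def newest_per_name(docs):
    """Return the highest-version doc for each name (hypothetical helper)."""
    newest = {}
    for doc in docs:
        best = newest.get(doc["name"])
        if best is None or doc["version"] > best["version"]:
            newest[doc["name"]] = doc
    return list(newest.values())


def search_then_group(docs, term):
    """What grouping does: match first, then keep one doc per name."""
    hits = [d for d in docs if d["data"] == term]
    return sorted(d["id"] for d in newest_per_name(hits))


def latest_then_search(docs, term):
    """What is wanted: restrict to the latest versions, then match."""
    return sorted(d["id"] for d in newest_per_name(docs) if d["data"] == term)


print(search_then_group(DOCS, "x"))   # ['bar1', 'foo1']
print(latest_then_search(DOCS, "x"))  # ['bar1']
```

In the first ordering foo2 has already been dropped by the match step, so foo1 looks like the newest "foo" inside its group.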
I don't think field collapsing will serve your purpose.
I can think of a couple of options:
- Generate the same unique id for every version of a document, so that adding the new current document overwrites the old one and you always have only one version of the document in the index.
- If possible, maintain an extra field on the documents that marks the status as CURRENT. Only the latest document would have this field value, and you would need to reset the value on all other versions of the document. This way you can easily restrict a search to the latest documents with a filter query, and still search through all versions by leaving the filter query out.
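A rough sketch of the second option, assuming a string field named `status` (a made-up name) and that atomic updates are usable on your schema, which in Solr generally needs the update log enabled and the fields stored or backed by docValues. The helper names are hypothetical; the code only builds the update payload and the query string and does not send anything.

```python
import json
from urllib.parse import urlencode


def build_update(new_doc, previous_ids):
    """Build a batch for Solr's JSON update handler.

    Older versions get an atomic update that clears `status`
    ("set" with null removes the value); the new doc is marked CURRENT.
    """
    batch = [{"id": old_id, "status": {"set": None}} for old_id in previous_ids]
    batch.append({**new_doc, "status": "CURRENT"})
    return batch


def latest_only_params(term):
    """Query string that searches `data` but only among CURRENT docs."""
    return urlencode({"q": f"data:{term}", "fq": "status:CURRENT"})


batch = build_update(
    {"id": "foo2", "name": "foo", "version": 2, "data": "y"},
    previous_ids=["foo1"],
)
print(json.dumps(batch, indent=2))
print(latest_only_params("x"))
```

Searching all versions is then the same query without the `fq` parameter. How you find the previous ids (for example by querying on name before indexing) is left open here.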