I'm using Solr's faceting and I've run into a problem that I was hoping I could get around using filters.
Basically, sometimes a town name will come through to Solr as
"CAMBRIDGE"
and sometimes it will come through as
"Cambridge"
I wanted to use a filter in Solr to get rid of the SCREAMING CAPS version of the town name. It seems there is a filter to make all the text lower case.
<!-- A text field that only sorts out casing for faceting -->
<fieldType name="text_facet" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
I was wondering whether anyone knows of a filter that leaves the first character of each word alone and lower-cases the rest of the characters. E.g.
- CAMBRIDGE >> Cambridge
- KingsTON Upon HULL >> Kingston Upon Hull
etc
Alternatively, if it's easy to write your own filters, some help on how to do that would be appreciated. I'm not a Java person.
Thanks
AFAIK there is no built-in filter like that. If you want to write it, see LowerCaseFilterFactory and LowerCaseFilter for reference; it doesn't seem to be very hard.
Or you could do this client-side, e.g. in SolrNet you could write an ISolrOperations decorator that does the necessary transformations after the real query, using ToTitleCase.
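To make the per-word transformation concrete, here is a minimal sketch in plain Java of what such a custom filter (or a client-side step) would do to each value. It is not Solr code: the class name TitleCaser and both method names are hypothetical, and wiring the logic into a Lucene TokenFilter is not shown. It assumes words are separated by single spaces and, as in the question, leaves the first character of each word untouched.

```java
import java.util.Locale;

// Hypothetical helper, for illustration only; not part of Solr or Lucene.
class TitleCaser {

    // Leaves the first character as it is and lower-cases the rest of the word.
    static String fixWord(String word) {
        if (word.isEmpty()) {
            return word;
        }
        return word.substring(0, 1) + word.substring(1).toLowerCase(Locale.ROOT);
    }

    // Applies fixWord to every space-separated word and keeps the spacing.
    static String fixText(String text) {
        String[] words = text.split(" ", -1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            if (i > 0) {
                out.append(' ');
            }
            out.append(fixWord(words[i]));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(fixText("CAMBRIDGE"));          // Cambridge
        System.out.println(fixText("KingsTON Upon HULL")); // Kingston Upon Hull
    }
}
```

Because the first character is left alone, an all-lower-case value such as "cambridge" stays as it is; upper-casing that character as well would be a one-line change in fixWord.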
Perhaps you could make use of the solr.PatternReplaceCharFilterFactory?
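<fieldType name="textCharNorm" class="solr.TextField">
  <analyzer>
    <!-- Char filters run on the raw text before the tokenizer, whatever the declaration order. -->
    <!-- \U and \L are Perl-style case operators; Java's regex replacement syntax does not define them. -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="([^\s]{1})([^\s]*)" replaceWith="\U$1\L$2"/>
    <!-- An analyzer needs a tokenizer; WhitespaceTokenizerFactory is assumed here to match the question's field type. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Token filters run last, so this lower-cases every token, first letter included. -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>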
Note that I haven't tested this code or solr.PatternReplaceCharFilterFactory, so I'm not sure whether it works. If you need to build your own filter, this guide might be useful:
http://robotlibrarian.billdueber.com/building-a-solr-text-filter-for-normalizing-data/
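Since the pattern-replace filter relies on java.util.regex, one quick way to check a replacement string before touching the schema is to run the same pattern in plain Java. The sketch below assumes the char filter hands the replacement string to Java's regex engine unchanged; the class name ReplacementCheck and its apply method are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical check, for illustration only; it does not load Solr.
class ReplacementCheck {

    // Same pattern as in the field type above: first non-space char, then the rest.
    static final Pattern WORD = Pattern.compile("([^\\s]{1})([^\\s]*)");

    // Runs the replacement over every match in the input.
    static String apply(String input, String replacement) {
        return WORD.matcher(input).replaceAll(replacement);
    }

    public static void main(String[] args) {
        // The Java string "\\U$1\\L$2" is the text \U$1\L$2 from the schema.
        System.out.println(apply("cambridge", "\\U$1\\L$2")); // UcLambridge
    }
}
```

In Java's replacement syntax a backslash only escapes the character that follows it, so \U and \L come out as the literal letters U and L instead of changing case. If the filter behaves the same way, a custom filter is the more likely route.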
// John