I'm using Solr 1.4 and Solr 4 for full-text search inside documents. At the moment I'm unable to search for whole phrases, such as "The dog runs" in the text block "The dog runs through the house." For this test case I use a simple Solr URL: http://plocalhost:8088/solr/select/?start=0&q="the dog runs"
I'm using a tokenized, stemmed text field with the following options:
<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.StopFilterFactory"
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
<filter class="solr.SnowballPorterFilterFactory" language="German" />
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms-de.txt" ignoreCase="true" expand="true"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
<filter class="solr.SnowballPorterFilterFactory" language="German" />
</analyzer>
</fieldType>
I have no idea why it's not working. :-( Thank you for any hint.
To answer my own question:
The analyzer at index time uses a stopword list, while the analyzer at query time does NOT. Stopwords were therefore removed from the indexed text but kept in the query, so the phrase in the index was not the same as the phrase at query time.
I only had to add the StopFilterFactory to the "query" analyzer.
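For anyone hitting the same problem, the corrected "query" analyzer might look roughly like the sketch below. The StopFilterFactory attributes are an assumption on my side: the file name stopwords-de.txt is only a placeholder, so use exactly the same stopword file and options as in your "index" analyzer. The position in the chain (after the synonym filter) is also just one reasonable choice.

```xml
<analyzer type="query">
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.SynonymFilterFactory" synonyms="synonyms-de.txt" ignoreCase="true" expand="true"/>
  <!-- Added: drop the same stopwords at query time as at index time.
       "stopwords-de.txt" is a placeholder name (assumption); it must be the
       same list that the index analyzer uses. ignoreCase="true" is set
       because this filter runs before LowerCaseFilterFactory here. -->
  <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords-de.txt"/>
  <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
  <filter class="solr.SnowballPorterFilterFactory" language="German" />
</analyzer>
```

Changing only the query analyzer should not require reindexing, since the indexed tokens stay the same. The Analysis page in the Solr admin UI is a quick way to check that the index and query chains now produce matching tokens for the phrase.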