
Solr 3.2 Carrot2 clustering returns nothing but "Other Topics"

Developer https://www.devze.com 2023-03-16 02:11 Source: Internet
It is said that the Carrot2 integration in Solr was improved with the release of Solr 3.2, but my experience has been different. I had an identically configured Solr 1.4.1 server running where Carrot2 worked great, while Solr 3.2 gives me nothing but "Other Topics". This is driving me crazy because I get no exceptions or anything else unusual. Even the result XML looks the same...

I didn't make many changes to the standard configuration of the clustering component:

 <searchComponent name="clustering" 
                   enable="${solr.clustering.enabled:true}"
                   class="solr.clustering.ClusteringComponent" >
    <lst name="engine">
      <str name="name">default</str>

      <str name="carrot.algorithm">org.carrot2.clustering.lingo.LingoClusteringAlgorithm</str>

      <str name="LingoClusteringAlgorithm.desiredClusterCountBase">20</str>
          <!--custom-->
      <str name="LingoClusteringAlgorithm.phraseLabelBoost">8.00</str>
      <str name="TermDocumentMatrixBuilder.titleWordsBoost">6.00</str>


      <str name="carrot.lexicalResourcesDir">clustering/carrot2</str>

      <str name="MultilingualClustering.defaultLanguage">ENGLISH</str>
    </lst>
    <lst name="engine">
      <str name="name">stc</str>
      <str name="carrot.algorithm">org.carrot2.clustering.stc.STCClusteringAlgorithm</str>
    </lst>
  </searchComponent>
  <requestHandler name="/clustering"
                  startup="lazy"
                  enable="${solr.clustering.enabled:true}"
                  class="solr.SearchHandler">
    <lst name="defaults">
      <bool name="clustering">true</bool>
      <str name="clustering.engine">default</str>
      <bool name="clustering.results">true</bool>
       <str name="carrot.title">autocomplete</str>
       <str name="carrot.url">autocomplete</str>
       <str name="carrot.snippet">autocomplete</str>
       <bool name="carrot.outputSubClusters">true</bool>

       <str name="defType">edismax</str>
       <str name="qf">
          text^0.5 autocomplete^1.2 ata^1.0 raum^1.0 system^1.0 assy^1.0 unit^1.0
       </str>
       <str name="q.alt">*:*</str>
       <str name="rows">10</str>
       <str name="fl">*,score</str>
    </lst>     
    <arr name="last-components">
      <str>clustering</str>
    </arr>
  </requestHandler>

My best guess was that Carrot2 is not working properly together with edismax (which wasn't implemented in Solr 1.4.1), but that might be misleading.

I already reindexed my data just to make sure that this is not the issue.

In the Carrot2 Workbench, clustering works well with Lingo as the algorithm. When I choose "by source", I get "Other Topics", just as in the XML. Could Lingo be misconfigured? Do I have to configure anything besides solrconfig.xml to fix that?

I'm thankful for any help.


This happens if the 'snippet' field you are clustering on never differs, or differs very little, from one document to the next. Try adding 'carrot.snippet=' with a more descriptive field to your request parameters. In your settings it defaults to a field called 'autocomplete'. Does this field contain any meaningful text?

Example that makes this behaviour go away for me:

http://localhost:8983/solr/clustering?q=peter&rows=200&carrot.snippet=summary
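
If the override helps, the same change could be made permanent in the request handler defaults instead of being passed on every request. A minimal sketch, assuming your index has a stored field named 'summary' with real text in it (the name is only borrowed from my example URL, so substitute whichever field fits your schema):

```xml
<lst name="defaults">
  <bool name="clustering">true</bool>
  <str name="clustering.engine">default</str>
  <bool name="clustering.results">true</bool>
  <!-- title and URL fields left as they are in the question's config -->
  <str name="carrot.title">autocomplete</str>
  <str name="carrot.url">autocomplete</str>
  <!-- assumption: "summary" is a stored field with enough varied text to cluster on -->
  <str name="carrot.snippet">summary</str>
</lst>
```

Only the carrot.snippet line differs from the configuration in the question; the remaining defaults (defType, qf, rows and so on) would stay as they are.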

Best regards,

/Peter W
