
Solr Search Across Multiple Cores

Source: https://www.devze.com, 2023-04-05 15:47 (reposted from the web)

I have two Solr cores.

Core0 imports data from an Oracle table called items. Each item has a unique id (item_id) and is either a video item or an audio item (item_type). Other fields contain searchable text (description, comments, etc.).

Core1 imports data from two tables (from a different database) called video_item_dates and audio_item_dates, which record the dates on which an item occurred in a specific market. The fields are item_id, item_market and dates. A single row would look like (item_001, 'Europe', '2011/08/15, 2011/08/17, 2011/08/20'). The unique key in these two database tables is the combination of item_id and item_market. I have flattened the data from both tables into a single index for Core1.

My problem now is searching both cores to produce a single result. A typical query would be 'Which items have the word Hurricane in the description field and ran in the North American market during August 2011?'. I could split this into two queries, run each against its own core, and then merge the results. But since each query may produce millions of rows, that approach is very inefficient.

I tried Solr Distributed Search. I created a third core (called Core2) with the fields from both Core0 and Core1, and added a request handler with a shards default to it like this:

<requestHandler name="shard" class="solr.SearchHandler">
   <lst name="defaults">
      <str name="shards">localhost/solr/core0/,localhost/solr/core1/</str>
    </lst>
</requestHandler>

If I run a query against this third core, it forwards the query to both Core0 and Core1, and since neither of them has all the fields, one of them reports "undefined field" and the response is a bad request error.

Any help would be greatly appreciated.

Please note I have no control over the structure of the database tables.


This does not seem to be a case for multiple cores. You should look into designing a single schema that supports the desired search.
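One way to read the single-schema suggestion is to join the two sources at index time, so that each document carries both the searchable text and the market dates. The following is only a rough sketch under assumptions: the function name `build_documents`, the composite `id` field, and the in-memory join are hypothetical illustrations, not part of the asker's setup or of any Solr API.

```python
def build_documents(items, item_dates):
    """Join item rows with their per-market date rows into flat documents.

    items:      iterable of dicts with item_id, item_type, description
    item_dates: iterable of (item_id, item_market, dates) tuples, where dates
                is a comma-separated string such as '2011/08/15, 2011/08/17'
    """
    items_by_id = {item["item_id"]: item for item in items}
    docs = []
    for item_id, market, dates in item_dates:
        item = items_by_id.get(item_id)
        if item is None:
            continue  # date row without a matching item; skipped in this sketch
        docs.append({
            # hypothetical composite key mirroring the (item_id, item_market) uniqueness
            "id": f"{item_id}_{market}",
            "item_id": item_id,
            "item_type": item["item_type"],
            "description": item["description"],
            "item_market": market,
            # one entry per date, as would suit a multi-valued field
            "dates": [d.strip() for d in dates.split(",") if d.strip()],
        })
    return docs


example_docs = build_documents(
    [{"item_id": "item_001", "item_type": "video", "description": "Hurricane report"}],
    [("item_001", "Europe", "2011/08/15, 2011/08/17,2011/08/20")],
)
```

With documents shaped like this, the example question (Hurricane in description, a given market, dates in August 2011) becomes a single query against one core. For millions of rows the join would more realistically happen in the import step rather than in memory.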


Sharding is used when a core gets huge and tough to handle as a single entity. The index is broken into smaller chunks, each in its own core, and a search then runs across all of them. Usually the shards share the same configuration.

You would need to define the same fields in both cores to keep their schemas in sync, so that neither reports an undefined field error. Fields that are irrelevant to a core would simply be left empty in its documents, so they should not affect the results.

Sharding doesn't require you to create a new core. You can work with core0 and core1 directly. More on it at http://wiki.apache.org/solr/DistributedSearch

Also check the limitations with distributed search.
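To illustrate querying an existing core with shards instead of adding a third one, here is a minimal sketch that builds a request URL carrying the `shards` parameter. The host, port 8983, the `/select` handler path, and the helper name `distributed_query_url` are assumptions for illustration; adjust them to the actual installation.

```python
from urllib.parse import urlencode


def distributed_query_url(core_url, shard_addresses, query):
    """Build a select URL that asks one core to fan the query out to shards."""
    params = {
        "q": query,
        # comma-separated shard addresses, written without the http:// scheme
        "shards": ",".join(shard_addresses),
    }
    return f"{core_url}/select?{urlencode(params)}"


url = distributed_query_url(
    "http://localhost:8983/solr/core0",  # assumed location of core0
    ["localhost:8983/solr/core0", "localhost:8983/solr/core1"],
    "description:Hurricane",
)
print(url)
```

This only works once both cores define every field the query mentions, as described above.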

If sharding performance is not satisfactory, you can create a single core with both datasets, or look at the merge option, which combines the cores into a single core.


You can merge the indexes from the different cores into a new index using CoreAdmin:

http://wiki.apache.org/solr/MergingSolrIndexes
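As a rough sketch, the CoreAdmin merge request could be assembled as below. The base URL, the target core name `core2`, and the helper name `merge_indexes_url` are assumptions; the `srcCore` parameter may not exist in older Solr releases, where the linked page describes `indexDir` instead, and the target core is assumed to exist already with a schema that covers the fields of both sources.

```python
from urllib.parse import urlencode


def merge_indexes_url(solr_base_url, target_core, source_cores):
    """Build a CoreAdmin URL that merges the source cores into the target core."""
    params = [("action", "mergeindexes"), ("core", target_core)]
    # one srcCore parameter per source core
    params += [("srcCore", name) for name in source_cores]
    return f"{solr_base_url}/admin/cores?{urlencode(params)}"


merge_url = merge_indexes_url("http://localhost:8983/solr", "core2", ["core0", "core1"])
print(merge_url)
```

Note that merging copies the documents of both cores into one index as they are; it does not join an item document with its date documents.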
