Lucene query returns things I'm not expecting

开发者 (Developer) https://www.devze.com 2023-01-23 21:04 Source: Internet
I'm querying a Lucene index whose structure I didn't build. The index contains Documents with fields structured like this:

[Screenshot of the index's Document fields; image not preserved]

As you can see, the 'type' field is always empty. However, the 'all' field contains data formatted so that it is searchable, and it contains a type=ta sort of syntax.

The weird thing is that when I query this index using type:ta, it actually returns results even though the 'type' field is always empty.

What's happening here?

EDIT

After googling a bit more I found a concept that is strange to me (coming from a SQL database background): field data can either be stored or not stored (Store.YES and Store.NO). See: Lucene indexing: Store and indexing modes explained

This is a very unusual concept for me, as I can't think of many reasons NOT to store data. What's the reason behind using Store.NO? I will most likely always want to have the data there, even if I'm not displaying it anywhere... I mean, if data is indexed it must be stored anyhow, right?


What's the reason behind using Store.NO?

Consider the queries:

  1. What documents contain the term 'foo'?
  2. What terms does document '1234' contain?

An index for the first query maps term -> document. An index for the second maps document -> term. Most people only want to use Lucene for the first type of query, so they only build the first type of index (Store.NO). If you want to run the second type of query as well, you need to build both types of index, which takes up more space. (In theory you could loop through all terms and work out which ones belong to a document without building the second index, but that is really slow.)

"Reverse index" might be a more appropriate name than "store."
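
To make the distinction concrete, here is a minimal sketch in plain Python. It is a toy model, not Lucene's API: the ToyIndex class and its method names are hypothetical, and the whitespace tokenizer is a simplification. It assumes the 'type' field in the question was indexed but not stored, which would explain why type:ta matches while the retrieved field looks empty.

```python
from collections import defaultdict


class ToyIndex:
    """Toy model of per-field 'indexed' vs. 'stored' data (not Lucene's API)."""

    def __init__(self):
        # term -> document mapping: (field, term) -> set of doc ids
        self.postings = defaultdict(set)
        # document -> original values: doc id -> {field: value}
        self.stored = defaultdict(dict)

    def add(self, doc_id, field, value, store):
        # Every field is indexed, so it is always searchable.
        for term in value.lower().split():
            self.postings[(field, term)].add(doc_id)
        # The original value is kept only when store=True.
        if store:
            self.stored[doc_id][field] = value

    def search(self, field, term):
        return sorted(self.postings.get((field, term.lower()), set()))

    def get_stored(self, doc_id, field):
        # Fields that were not stored come back empty.
        return self.stored.get(doc_id, {}).get(field, "")


index = ToyIndex()
index.add(1, "type", "ta", store=False)                  # searchable, not retrievable
index.add(1, "all", "type=ta name=example", store=True)  # searchable and retrievable
index.add(2, "type", "tb", store=False)

print(index.search("type", "ta"))         # [1]
print(repr(index.get_stored(1, "type")))  # ''
```

Document 1 is found by the type query, yet reading its 'type' field back yields an empty string, which matches the behaviour described in the question.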


Check the Lucene query syntax: there are a lot of special (control) characters.

Try

type:'ta'

(quoted, though).
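
If the concern is characters with special meaning in the query syntax, one option is to escape them before building the query string. The sketch below is a hypothetical helper in plain Python, not part of Lucene. The character list is an assumption based on the special characters documented for the classic query parser, so check it against the Lucene version in use.

```python
# Characters assumed to be special in the classic Lucene query syntax.
SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&/')


def escape_query_term(text):
    """Prefix every special character with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


print(escape_query_term("type=ta"))  # type=ta
print(escape_query_term("a:b"))      # a\:b
```

Note that '=' is not in the list, so a value such as type=ta passes through unchanged, while a literal ':' inside a value would need escaping.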
