
How do I set up Lucene so that I can search ignoring whitespace characters?

Developer (https://www.devze.com), 2023-01-19 13:06, source: the web

For example, a list of part numbers includes:

JRB-1000

JRB 1000

JRB1000

JRB100-0

-JRB1000

If a user searches on 'JRB1000' or 'JRB 1000', I would like to return a match for all the part numbers above.


Write a custom Analyzer that either splits these into several tokens (JRB, 1000; relatively easy and forgiving to users) or concatenates them into a single token (JRB1000; harder but precise). Implementing your own Analyzer amounts to overriding the tokenStream method of an existing one and perhaps writing a custom TokenFilter class.

Apply your new Analyzer to both the documents being indexed and the queries.

(Links are for the Java version, but .NET should be similar.)
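
To make the "single token" option concrete, here is a minimal sketch in plain Java of the normalization such a custom TokenFilter might perform. It deliberately does not use the Lucene API; the class PartNumberNormalizer and its methods are hypothetical names invented for illustration, and the small in-memory search only stands in for applying the same Analyzer at index time and at query time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: it illustrates the normalization
// a custom TokenFilter could apply to emit one concatenated token.
class PartNumberNormalizer {

    // Keep only letters and digits, so "JRB-1000", "JRB 1000",
    // "JRB100-0" and "-JRB1000" all collapse to "JRB1000".
    static String normalize(String raw) {
        StringBuilder sb = new StringBuilder(raw.length());
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Stand-in for index + query: the same normalization is applied to the
    // stored part numbers and to the user's query before comparing them.
    static List<String> search(List<String> partNumbers, String query) {
        String key = normalize(query);
        List<String> hits = new ArrayList<>();
        for (String partNumber : partNumbers) {
            if (normalize(partNumber).equals(key)) {
                hits.add(partNumber);
            }
        }
        return hits;
    }
}
```

Calling PartNumberNormalizer.search with the five part numbers from the question and the query "JRB 1000" or "JRB1000" returns all five. In a real setup the normalize step would live inside the Analyzer's token stream rather than in a standalone helper like this.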
