开发者

Strange results of full-text MySQL search, need an explanation

开发者 https://www.devze.com 2023-01-13 16:41 出处:网络
my SQL query SELECT keyword FROM table WHERE MATCH (keyword) AGAINST (\'eco*\' IN BOOLEAN MODE); matches cells with these words:economy,ecology,echoscopy(why?),echo(why?) etc.

my SQL query

SELECT keyword

FROM table

WHERE MATCH (keyword)

AGAINST ('eco*' IN BOOLEAN MODE);

matches cells with these words:economy,ecology,echoscopy(why?),echo(why?) etc.

other SQL query

SELECT keyword

FROM table

WHERE MATCH (keyword)

AGAINST ('eci*' IN BOOLEAN MODE);

matches cell with the word:echidna.

Yet both queries don't match word ectoplasm开发者_如何学Go.

Why does echo, echoscopy match 'eco*' and echidna matches 'eci*'?

I see key element in this problem being letter combination "ch".

Why does it work this way and how can I avoid this kind of matching?


The problem (feature?) was in the collation. "c" and "ch" were treated equal because of utf8_lithuanian_ci collation.

Edit:

Changing collation to utf8_unicode_ci fixes only certain issues.

The real solution is to use utf8_bin, which matches binary values of each character, meaning it's:

  • case sensitive
  • diacritics sensitive


The reason that it matches is that MATCH ... AGAINST is using regular expressions, and the * means, that the preceding char ("o") can be there from 0 to 9999999999999999999^ times. What you meant to match is

eco.*

Will match "eco" and "ecology" but not "echo".

eco.+

Will match "ecology" and "eco system" but not "eco" nor "echo".


Maybe you can try this

SELECT keyword FROM table WHERE keyword LIKE 'eco%';
0

精彩评论

暂无评论...
验证码 换一张
取 消