Search Japanese characters (UTF-8 encoded) using SQLite FTS

Posted on https://www.devze.com, 2023-03-17 20:04. Source: the web.
It seems that SQLite FTS doesn't support searching Japanese characters, according to my experiments and the discussion here.

-- select * from tblEvent_shortdes where short_des MATCH 'BSジャパンの見どころ'
-- returns nothing
select * from tblEvent_shortdes where short_des MATCH 'パンの見'

A custom tokenizer in FTS seems to be the way to accomplish this, but I did not find any promising open-source tokenizer for Japanese. Will the ICU tokenizer do?


You might take a look at ChaSen and MeCab. It has been several years since I used either - and it looks as though neither has been updated recently - but both proved adequate at Japanese tokenization.
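
To show why segmentation matters, here is a rough sketch rather than a tested solution. It uses Python's built-in sqlite3 module and assumes the underlying SQLite build has FTS4 enabled. The helper `segment` is hypothetical: it just puts a space between every character, standing in for what a real tokenizer such as MeCab would do with proper word boundaries. The table names `raw_idx` and `seg_idx` are made up for the illustration.

```python
import sqlite3

def segment(text):
    # Naive stand-in for a real Japanese tokenizer (hypothetical helper):
    # every non-space character becomes its own space-separated token.
    return " ".join(ch for ch in text if not ch.isspace())

def phrase(term):
    # Quote the segmented term so FTS matches the tokens as an adjacent phrase.
    return '"' + segment(term) + '"'

conn = sqlite3.connect(":memory:")
# Assumes the linked SQLite library was built with FTS4 support.
# raw_idx stores the text as-is; seg_idx stores the pre-segmented text.
conn.execute("CREATE VIRTUAL TABLE raw_idx USING fts4(short_des)")
conn.execute("CREATE VIRTUAL TABLE seg_idx USING fts4(short_des)")

text = "BSジャパンの見どころ"
conn.execute("INSERT INTO raw_idx(short_des) VALUES (?)", (text,))
conn.execute("INSERT INTO seg_idx(short_des) VALUES (?)", (segment(text),))

def search_raw(term):
    # The default tokenizer sees the unspaced Japanese run as a single token,
    # so only a query equal to that whole token can match.
    sql = "SELECT rowid FROM raw_idx WHERE short_des MATCH ?"
    return [row[0] for row in conn.execute(sql, (term,))]

def search_segmented(term):
    # The query is segmented the same way as the stored text.
    sql = "SELECT rowid FROM seg_idx WHERE short_des MATCH ?"
    return [row[0] for row in conn.execute(sql, (phrase(term),))]

print(search_raw("パンの見"))
print(search_segmented("パンの見"))
```

With per-character tokens the index grows and matches ignore word boundaries, so a morphological tokenizer is likely the better choice for real data; the point is only that indexed text and queries have to be split the same way. Search terms containing double quotes would need escaping before being passed to `phrase`.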
