开发者

Full Text Search in SQL Server 2008 shows wrong display_item for Thai language

开发者 https://www.devze.com 2022-12-25 20:40 出处:网络
I am working with SQL Server 2008. My task is to investigate the issue where FTS cannot find the right result for Thai.

I am working with SQL Server 2008. My task is to investigate the issue where FTS cannot find the right result for Thai.

First, I have the table which enables the FTS on the column 'ItemName' which is nvarchar. The Catalog is created with the Thai Language. Note that the Thai language is one of the languages that doesn't separate the word by spaces, so 'หลวง' 'พ่อ' 'โสธร' are written like this in a sentence: 'หลวงพ่อโสธร'

  • In the table, there are many rows that include the word (โสธร); for example row#1 (ItemName: 'หลวงพ่อโสธร')
  • On the webpage, I try to search for 'โสธร' but SQL Server cannot find it.

So I try to investigate it by trying the f开发者_如何学编程ollowing query in SQL Server:

select * from sys.dm_fts_parser(N'"หลวงพ่อโสธร"', 1054, 0, 0)

...to see how the words are broken. The first one is the text to be broken. The second parameter is to specify that we're using Thai (WorkBreaker, so on). Here is the result:

  • row#1 (display_item: 'ງลวง', source_item: 'หลวงพ่อโสธร')
  • row#2 (display_item: 'พຝโส', source_item: 'หลวงพ่อโสธร')
  • row#3 (display_item: 'ธร', source_item: 'หลวงพ่อโสธร')

Notice that the first and second row display the wrong display_item 'ງ' in the 'ງลวง' isn't even Thai characters. 'ຝ' in 'พຝโส' is not a Thai character either.

So the question is where did those alien characters come from? I guess this why I cannot search for 'โสธร' because the word breaker is broken and keeping the wrong character in the indexes.

Please help!


This should be due to the different Dialect of thai selected while the indexing was applied. From FTS properties check what is your selected language / culture

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号