开发者

MS SQL - WHERE substring matches are phonetic?

开发者 https://www.devze.com 2023-02-28 22:33 出处:网络
I\'d like to make a search feature that searches based on \"sounds like\" match. For instance, lets say I have a company list that looks like this (lets say we live in Bizzaro world too):

I'd like to make a search feature that searches based on "sounds like" match.

For instance, lets say I have a company list that looks like this (lets say we live in Bizzaro world too):

  • Acme
  • Already allusion cite LTD
  • All ready illusion site INC
  • Apart assent
  • Assent sight

(Or something simmilar with names... George or Jeorge ? "Yah-way", or "ye-hova" ?)

When someone searches for something that "sounds like" the soundex("site") == S230, they should see results for "Sight" also.

As most people who've used soudnex before already know, normal substring matches obviously don't do this.

I'm trying to work out in my head how to make a WHERE clause that can match based on this, so instead of a typical WHERE company LIKE input, I'd like to run a soundex. Obviously if I run soundex on the whole company name, I won't be able to do substring searching (for example, a user searching "ALL" will never match a soundex of "All ready"). Soundex split on each word might not be worthwhile either, so I'm not sure running all combinations of a soundex is a good idea... or even if that's going to be computationally feasible in a database with more than 1000 records.

Basically the interaction I want to have is when (in an office or something) Tom says to Sally "That name was something like Rebekkah Schwartzkopff" and it can be searched phonetically for a fuzzy match.

Obviously we're going to run into issues with non-English named companies because of soudnex, but I'm will to compromise on this one.

I'd like to do this without adding anything to 开发者_StackOverflow社区the database, or a stored procedure.


If SOUNDEX is a good beginning for what you are doing, you can use DIFFERENCE.

eg:

SELECT *
FROM Person
WHERE DIFFERENCE(Person.FirstName, 'George') >= 3

Note that the DIFFERENCE function returns the difference between the SOUNDEX values of two strings using a value of 0-4; 4 meaning the strings are pretty close to the same and 0 meaning they are completely different (kind of a backwards scale to me, but I suppose it works).


Very interesting question. I did a little poking around and found this:

http://www.codeproject.com/KB/database/dmetaphone4.aspx

I haven't tested it myself but it seems like it would be worth checking out.

It would require you to add something to the database, but I don't see how you can implement the functionality you want with built in SQL Server functionality...

0

精彩评论

暂无评论...
验证码 换一张
取 消