How to determine which string in an array is most similar to a given string?

Developer https://www.devze.com 2023-02-06 17:44 Source: web
Given a string,

string name = "Michael";

I want to determine which string in an array is most similar to it:

string[] names = new[] { "John", "Adam", "Paul", "Mike", "John-Michael" };

I want to create a message for the user: "We couldn't find 'Michael', but 'John-Michael' is close. Is that what you meant?" How would I make this determination?


This is usually done with the edit distance (Levenshtein distance): the minimum number of single-character insertions, deletions, or substitutions required to transform one word into the other. The candidate with the smallest distance is the closest match.

There's an article providing you with a generic implementation for C# here.
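In case the linked article is unavailable, here is a minimal sketch of the idea in C#. It is not the article's implementation; the class name `Levenshtein` and the helpers `Closest` and `Suggest` are made up for illustration, and the comparison is case-sensitive.

```csharp
using System;
using System.Linq;

public static class Levenshtein
{
    // Minimum number of single-character insertions, deletions,
    // or substitutions needed to turn `source` into `target`.
    public static int Distance(string source, string target)
    {
        // Only two rows of the dynamic-programming table are kept.
        var previous = new int[target.Length + 1];
        var current = new int[target.Length + 1];

        // Turning an empty prefix into j characters takes j insertions.
        for (int j = 0; j <= target.Length; j++)
            previous[j] = j;

        for (int i = 1; i <= source.Length; i++)
        {
            // Turning i characters into an empty prefix takes i deletions.
            current[0] = i;
            for (int j = 1; j <= target.Length; j++)
            {
                int cost = source[i - 1] == target[j - 1] ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(previous[j] + 1,      // deletion
                             current[j - 1] + 1),  // insertion
                    previous[j - 1] + cost);       // substitution or match
            }
            (previous, current) = (current, previous); // reuse the buffers
        }
        return previous[target.Length];
    }

    // Hypothetical helper: the candidate with the smallest distance.
    // OrderBy is stable, so the earliest candidate wins on ties.
    public static string Closest(string input, string[] candidates) =>
        candidates.OrderBy(c => Distance(input, c)).First();

    // Hypothetical helper: builds the message from the question.
    public static string Suggest(string input, string[] candidates)
    {
        string best = Closest(input, candidates);
        return $"We couldn't find '{input}', but '{best}' is close. Is that what you meant?";
    }
}
```

Calling `Levenshtein.Suggest("Michael", names)` with the array from the question suggests "Mike", since that is the candidate with the smallest distance. In practice you would probably also want a maximum distance above which no suggestion is shown at all.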


Here are the results for your example using the Levenshtein distance (the expressions below are Mathematica):

EditDistance["Michael",#]&/@{"John","Adam","Paul","Mike","John-Michael"}
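{6,6,5,4,5}

By this measure the closest match is "Mike" (distance 4). "John-Michael" ties with "Paul" at 5, because the five extra characters "John-" all count as edits.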

Here are the results using the Smith-Waterman similarity measure, where a higher score means a closer match:

SmithWatermanSimilarity["Michael",#]&/@{"John","Adam","Paul","Mike","John-Michael"}
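{0.,0.,0.,2.,7.}

Here "John-Michael" scores highest (7) because it contains all seven characters of "Michael" as a contiguous run, which is the result the question was hoping for.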
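If you want to try the local-alignment approach in C#, here is a rough sketch of a Smith-Waterman score. The class name `SmithWaterman` and the scoring values (+1 for a match, -1 for a mismatch, -1 per gap) are assumptions for illustration. Mathematica's defaults may differ, so scores for weak candidates need not equal the figures quoted above.

```csharp
using System;

public static class SmithWaterman
{
    // Score of the best local alignment between a and b.
    // The default scoring values are illustrative assumptions.
    public static int Score(string a, string b,
                            int match = 1, int mismatch = -1, int gap = -1)
    {
        // h[i, j] is the best score of an alignment ending at a[i-1], b[j-1].
        // The array is zero-initialized, which covers the first row and column.
        var h = new int[a.Length + 1, b.Length + 1];
        int best = 0;

        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                int diagonal = h[i - 1, j - 1]
                             + (a[i - 1] == b[j - 1] ? match : mismatch);
                int up = h[i - 1, j] + gap;   // gap in b
                int left = h[i, j - 1] + gap; // gap in a

                // Clamping at 0 lets an alignment restart anywhere,
                // which is what makes the alignment local.
                h[i, j] = Math.Max(0, Math.Max(diagonal, Math.Max(up, left)));
                best = Math.Max(best, h[i, j]);
            }
        }
        return best;
    }
}
```

With these settings `SmithWaterman.Score("Michael", "John-Michael")` is 7 and `SmithWaterman.Score("Michael", "Mike")` is 2, so picking the candidate with the highest score would suggest "John-Michael".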

HTH!
