开发者

Is there a faster (less precise) algorithm than Levenshtein for string distance?

开发者 https://www.devze.com 2023-03-09 02:12 出处:网络
I want to run the Levenshtein, but WAY faster because it\'s real 开发者_如何转开发time application that I\'m building. It can terminate once the distance is greater than 10.Judging from comments, peop

I want to run the Levenshtein, but WAY faster because it's real 开发者_如何转开发time application that I'm building. It can terminate once the distance is greater than 10.


Judging from comments, people seem to be pretty happy with Sift3.

http://sift.codeplex.com


The Levenshtein distance metric allows addition, deletion or substitution operations. If you're looking for a faster but less precise metric you can use the longest common subsequence (allows only addition and deletion) or even the Hamming distance (allows only substitution).

However, I recommend that you try to optimize your Levenshtein distance algorithm instead as it gives the best results.


If you want to compare UTF-8 contents use sift4:

https://siderite.dev/blog/super-fast-and-accurate-string-distance.html

Also I prepared a jsPerf which shows the performance difference between those libraries: http://jsperf.com/levenshtein-perf

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号