Fuzzy matching API in a long list of queries

Developer https://www.devze.com 2023-01-18 06:08 Source: Internet
I have an application which lets people ask predefined queries. However, the list of such queries is too long to browse. Hence, the current approach is to let users enter a word in the search box and then show them the likely matches from the list of queries (very much like Google's "Did you mean" feature).

Is there an API in Java available for this? I should be able to supply the list of queries. The API should provide a fuzzy match capability, so that incorrect spellings do not matter. (That is why an exact String matching algorithm is not sufficient.)


The magic word here may be "regular expression" -- anything you can model as a finite state machine can be done with regular expressions.

Failing that, you might look into "digital search trees" or "tries".
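To make the trie suggestion concrete, here is a minimal sketch in Java that stores the predefined queries and returns the ones starting with whatever the user has typed so far. The names `QueryTrie`, `insert` and `startsWith` are made up for this example and do not come from any library. Note that a plain trie only gives prefix matching; on its own it does not tolerate misspellings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not a library class): a trie keyed on lower-cased queries.
final class QueryTrie {
    private static final class Node {
        // TreeMap keeps children sorted, so results come back in character order.
        final Map<Character, Node> children = new TreeMap<>();
        String query; // non-null if a stored query ends at this node
    }

    private final Node root = new Node();

    // Adds one predefined query. Queries that differ only in case overwrite each other.
    void insert(String query) {
        Node node = root;
        for (char c : query.toLowerCase(Locale.ROOT).toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.query = query;
    }

    // Returns every stored query that starts with the given prefix, ignoring case.
    List<String> startsWith(String prefix) {
        Node node = root;
        for (char c : prefix.toLowerCase(Locale.ROOT).toCharArray()) {
            node = node.children.get(c);
            if (node == null) {
                return new ArrayList<>(); // no stored query has this prefix
            }
        }
        List<String> results = new ArrayList<>();
        collect(node, results);
        return results;
    }

    // Depth-first walk that gathers all queries stored below the given node.
    private void collect(Node node, List<String> out) {
        if (node.query != null) {
            out.add(node.query);
        }
        for (Node child : node.children.values()) {
            collect(child, out);
        }
    }
}
```

After inserting "Show users" and "Show orders", calling `startsWith("show")` would return both queries, while a misspelled prefix such as "shwo" returns nothing, which is why a trie is usually combined with some form of approximate matching.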


Some of the data structures I can suggest are:

  • Patricia Trie
  • Trie

Similar SO Questions:

  • How does Google's "Did you mean" algorithm work?
  • Where can I learn about Google's "Did you mean"?


Perhaps a phonetic algorithm such as Soundex or a derivative would work? http://en.wikipedia.org/wiki/Soundex
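For reference, a rough sketch of the classic American Soundex rules in plain Java follows. The class name `SimpleSoundex` is made up for this example, and it assumes input limited to the letters A to Z (anything else is skipped). Two words that sound alike should get the same four-character code, so codes could be precomputed for each word in the query list and compared with the code of the typed word.

```java
import java.util.Locale;

// Hypothetical helper (not a library class): American Soundex for A-Z input.
final class SimpleSoundex {
    // Codes for A..Z. '0' marks vowels and Y (they break a run of equal codes);
    // '-' marks H and W (they are skipped and do not break a run).
    private static final String CODES = "0123012-02245501262301-202";

    static String encode(String word) {
        StringBuilder out = new StringBuilder();
        char last = '0'; // code of the previous letter that was not H or W
        for (char raw : word.toUpperCase(Locale.ROOT).toCharArray()) {
            if (raw < 'A' || raw > 'Z') {
                continue; // ignore digits, spaces, punctuation, non-ASCII letters
            }
            char code = CODES.charAt(raw - 'A');
            if (out.length() == 0) {
                out.append(raw); // the first letter is kept as-is
                last = code;
                continue;
            }
            if (code == '-') {
                continue; // H and W never separate two equal codes
            }
            if (code != '0' && code != last) {
                out.append(code); // new consonant group
            }
            last = code;
            if (out.length() == 4) {
                break; // a Soundex code is one letter plus three digits
            }
        }
        if (out.length() == 0) {
            return ""; // no usable letters in the input
        }
        while (out.length() < 4) {
            out.append('0'); // pad short codes with zeros
        }
        return out.toString();
    }
}
```

With this sketch, `encode("Robert")` and `encode("Rupert")` both give `R163`. Soundex was designed for English surnames, so it catches sound-alike spellings but not arbitrary typos; Apache Commons Codec also ships a `Soundex` class if a maintained implementation is preferred.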


I found these Java implementations of Peter Norvig's spelling correction algorithm. They are a bit dated, but good for getting started.

  1. Spelling Corrector
  2. jSpellCorrect
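If the list of queries is small enough to scan on each keystroke, a simpler alternative to a full spelling corrector is to rank the queries by edit distance. The sketch below is not Norvig's algorithm (which generates candidate edits and ranks them by word frequency); it is a plain Levenshtein scan, and the names `FuzzyQueryMatcher`, `score` and `suggest` as well as the `maxDistance` threshold are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical helper (not a library class): ranks queries by Levenshtein distance.
final class FuzzyQueryMatcher {
    // Levenshtein distance using two rolling rows of the dynamic-programming table.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Smallest distance between the typed word and any single word of the query.
    static int score(String typed, String query) {
        String t = typed.toLowerCase(Locale.ROOT);
        int best = Integer.MAX_VALUE;
        for (String word : query.toLowerCase(Locale.ROOT).split("\\s+")) {
            best = Math.min(best, distance(t, word));
        }
        return best;
    }

    // Returns the queries within maxDistance of the typed word, closest first.
    static List<String> suggest(String typed, List<String> queries, int maxDistance) {
        List<String> hits = new ArrayList<>();
        for (String q : queries) {
            if (score(typed, q) <= maxDistance) {
                hits.add(q);
            }
        }
        hits.sort(Comparator.comparingInt(q -> score(typed, q)));
        return hits;
    }
}
```

For example, with the queries "List invoices", "Show orders" and "Open invoice", calling `suggest("invoce", queries, 2)` returns "Open invoice" first (distance 1) and "List invoices" second (distance 2). Each lookup costs one distance computation per word in the list, so for very long lists an index such as a trie or a precomputed phonetic code may be needed to narrow the candidates first.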