Name comparison algorithm

Developer (https://www.devze.com), 2023-01-21 12:32. Source: web

I need to check whether a name appears on an anti-terrorism list.

In addition to the given name, I also want to search for similar names (possible aliases).

Example:

given name => Bin Laden alert!

given name => Ben Larden mhm.. suspicious name, matches at xx% with Bin Laden

How can I do this?

  • using PHP
  • names are 100% correct, since they are from official sources
  • I'm Italian, but I think this won't be a problem, since names are international
  • names can be composed of several words: Najmiddin Kamolitdinovich JALOLOV
  • the search covers both companies and people

I looked at different algorithms: do you think that Levenshtein can do the job?

Thank you in advance!

P.S. I had some trouble formatting this text, sorry :-)


I'd say your best bets for getting this working with PHP's native functions are:

  • soundex() - Calculate the soundex key of a string
  • levenshtein() - Calculate Levenshtein distance between two strings
  • metaphone() - Calculate the metaphone key of a string
  • similar_text() - Calculate the similarity between two strings
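
To get a feel for how these four functions behave, here is a minimal sketch that runs each of them on the two names from the question. It only uses the built-ins listed above; the variable names are illustrative, and the script just prints the raw results so you can judge which measure suits your data.

```php
<?php
// Illustrative only: compare the two example names with each built-in.
$a = 'Bin Laden';
$b = 'Ben Larden';

// Phonetic keys: equal keys suggest the two strings sound alike.
printf("soundex:   %s vs %s\n", soundex($a), soundex($b));
printf("metaphone: %s vs %s\n", metaphone($a), metaphone($b));

// Edit distance: number of single-character insertions,
// deletions and substitutions needed to turn $a into $b.
printf("levenshtein: %d\n", levenshtein($a, $b));

// similar_text() returns the number of matching characters and
// writes a percentage into its optional third argument.
$matching = similar_text($a, $b, $percent);
printf("similar_text: %d matching chars, %.1f%%\n", $matching, $percent);
```

Note that soundex() and metaphone() are built around English pronunciation, so they may be less helpful for transliterated names than the two character-based measures.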

Since you are likely matching the names against a database (?), you might also want to check whether your database provides any Name Matching Functions.

A Google search also turned up a PDF with a nice overview of name matching algorithms:

  • http://homepages.cs.ncl.ac.uk/brian.randell/Genealogy/NameMatching.pdf


The Levenshtein function (http://php.net/manual/en/function.levenshtein.php) can do this:

<?php
$string1 = 'Bin Laden';
$string2 = 'Ben Larden';
// Two edits: substitute i -> e, insert r
echo levenshtein($string1, $string2); // prints 2

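Set a threshold on this distance to decide whether a name counts as similar. Because the raw distance grows with the length of the strings, it helps to scale it by the length of the longer name before comparing it against the threshold.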
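
One way to apply such a threshold is sketched below. It turns the distance into a percentage, so the "matches at xx%" wording from the question falls out directly. The helper names nameSimilarity() and findSuspects(), the sample watch list and the 75% cut-off are all assumptions made for illustration, not part of PHP or of any official list format.

```php
<?php
// Hypothetical helper: similarity in percent (100 = identical after normalisation).
function nameSimilarity(string $a, string $b): float
{
    // Normalise case and whitespace so "JALOLOV" and "Jalolov" compare equal.
    $a = strtolower(trim(preg_replace('/\s+/', ' ', $a)));
    $b = strtolower(trim(preg_replace('/\s+/', ' ', $b)));

    $maxLen = max(strlen($a), strlen($b));
    if ($maxLen === 0) {
        return 100.0; // two empty strings are trivially identical
    }

    // levenshtein() works on bytes, so this is only reliable for single-byte text.
    $distance = levenshtein($a, $b);
    return 100.0 * ($maxLen - $distance) / $maxLen;
}

// Hypothetical helper: watch-list entries scoring at or above the threshold.
function findSuspects(string $name, array $watchList, float $threshold): array
{
    $hits = [];
    foreach ($watchList as $entry) {
        $score = nameSimilarity($name, $entry);
        if ($score >= $threshold) {
            $hits[$entry] = $score;
        }
    }
    arsort($hits); // best matches first
    return $hits;
}

// Sample data taken from the question; a real list would come from the database.
$watchList = ['Bin Laden', 'Najmiddin Kamolitdinovich JALOLOV'];
print_r(findSuspects('Ben Larden', $watchList, 75.0));
```

With this scaling, "Ben Larden" scores 80% against "Bin Laden" (2 edits over 10 characters). The right threshold depends on how many false alarms you can tolerate, so it is worth tuning against known aliases.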
