Determine the probability of numeric typing error

开发者 https://www.devze.com 2023-02-16 12:00 Source: Internet

I have:

A correct numerical ID, such as a phone number or Social Security number.
Another number, taken from some data-entry form.

The 2nd number is similar, but not equal to the 1st number. Both numbers are valid.

I want to calculate how probable it is that the 2nd number is actually a typing error of the 1st number.

Such errors may include:

Off by a few digits
Transposed digits
Misinterpreted digits (1-7, 4-9, 3-8, 2-5)

Does anyone know of an existing algorithm or code for this?

Edit:

I'm not looking for a general string-similarity algorithm. I'm looking for an algorithm optimized for human number-entry typing errors, or for some research about this topic.

There are several algorithms for measuring string similarity.

You could implement some variant of the Levenshtein distance or Damerau-Levenshtein distance that rates the types of errors differently.
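As a rough illustration of that idea, here is a minimal sketch of a weighted edit distance in Python. It uses the optimal string alignment variant of Damerau-Levenshtein, which allows each adjacent swap only once. The function names and all cost values (0.5 for a misread pair or a transposition, 1.0 for everything else) are assumptions chosen for the example, not measured error rates. The result is a score, not a probability; turning it into one would require real data on how often each kind of error occurs.

```python
# Digit pairs the question lists as easily misread. The cost values used
# below are illustrative assumptions, not measured error rates.
CONFUSABLE = {frozenset(pair) for pair in ("17", "49", "38", "25")}


def substitution_cost(a: str, b: str) -> float:
    """Cost of typing digit b where digit a was intended."""
    if a == b:
        return 0.0
    return 0.5 if frozenset((a, b)) in CONFUSABLE else 1.0


def weighted_typo_distance(correct: str, typed: str,
                           indel_cost: float = 1.0,
                           transpose_cost: float = 0.5) -> float:
    """Optimal string alignment distance with per-error-type costs."""
    n, m = len(correct), len(typed)
    # d[i][j] = cheapest way to turn correct[:i] into typed[:j]
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * indel_cost
    for j in range(1, m + 1):
        d[0][j] = j * indel_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            a, b = correct[i - 1], typed[j - 1]
            best = min(
                d[i - 1][j] + indel_cost,                    # digit dropped
                d[i][j - 1] + indel_cost,                    # extra digit typed
                d[i - 1][j - 1] + substitution_cost(a, b),   # digit replaced
            )
            # Two adjacent digits typed in the wrong order
            if (i > 1 and j > 1 and a != b
                    and a == typed[j - 2] and correct[i - 2] == b):
                best = min(best, d[i - 2][j - 2] + transpose_cost)
            d[i][j] = best
    return d[n][m]


print(weighted_typo_distance("5551234", "5557234"))  # 1 misread as 7
print(weighted_typo_distance("5551234", "5552134"))  # 1 and 2 transposed
print(weighted_typo_distance("5551234", "5551034"))  # unrelated substitution
```

With these assumed costs, the misread digit and the transposition both score lower than an unrelated substitution, so a lower score means "more likely a typo of the correct number".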

Treat each number as a sequence of digits and calculate the similarity ratio between the two: 2.0*M / T, where T is the total number of digits in both numbers and M is the number of matching digits.

As a rule of thumb, a similarity ratio of 0.6 or above means the two numbers are similar.

Note that the ratio is 1 if the numbers are identical, and 0 if they have no digit in common.
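The 2.0*M / T formula is the same one Python's standard difflib.SequenceMatcher.ratio() uses, so one way to try it out is the short sketch below. The helper name digit_similarity is made up for this example.

```python
from difflib import SequenceMatcher


def digit_similarity(a: str, b: str) -> float:
    """Return 2.0*M / T for two digit strings, using difflib's matching blocks."""
    # autojunk=False turns off a heuristic meant for long sequences;
    # it should not matter for short IDs but keeps the behaviour explicit.
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


print(digit_similarity("5551234", "5551234"))  # identical numbers
print(digit_similarity("5551234", "5557234"))  # one digit differs
print(digit_similarity("12", "21"))            # transposed digits
```

Note that M counts digits in matching runs that appear in the same order, so a simple transposition such as "12" versus "21" scores only 0.5. This measure treats all mismatches alike and does not distinguish misread digits from unrelated ones.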