I want to find a word even if it is written with a skipped letter.
For example I want to find
references
I also want to find refrences or refernces, but not refer.
I wrote this regexp:
(\brefe?r?e?n?c?e?s?\b)
I also want to check the length of the matched group: it should be greater than 8. Can I do this with regexp methods only?
I don't think regex is a good tool for finding similar words the way you are trying to. What will you do if two letters are swapped, as in "refernece"? Your regex will not find it.
But to show the regex way of checking the length, you could use a lookahead like this:
(\b(?=.{8,}\b)refe?r?e?n?c?e?s?\b)
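The (?=.{8,}\b) lookahead checks that at least 8 characters ({8,}) lie between the first \b and a later \b. Note that . matches any character, spaces included, so that later \b may belong to a following word; using \w{8,} instead keeps the check inside a single word.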
See it here on Regexr
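For anyone who wants to try the lookahead outside a regex tester, here is a minimal Python sketch. ORIGINAL is the pattern from this answer, unchanged; STRICT is an assumed variant that swaps . for \w so the length check cannot run past the end of the word. The names ORIGINAL, STRICT and find_matches are made up for this example.

```python
import re

# The pattern from the answer, unchanged.
ORIGINAL = re.compile(r"(\b(?=.{8,}\b)refe?r?e?n?c?e?s?\b)")

# Assumed stricter variant: \w keeps the length check inside one word.
STRICT = re.compile(r"(\b(?=\w{8,}\b)refe?r?e?n?c?e?s?\b)")


def find_matches(pattern, text):
    """Return every substring captured by group 1 of the pattern."""
    return [m.group(1) for m in pattern.finditer(text)]


if __name__ == "__main__":
    sample = "references refrences refernces refer"
    print(find_matches(STRICT, sample))
    # With '.', the lookahead can be satisfied by the words that follow.
    print(find_matches(ORIGINAL, "refer to me"))
    print(find_matches(STRICT, "refer to me"))
```

On the sample sentence both patterns should behave the same; they only differ when a short word such as refer is followed by more text on the same line.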
I think that using regex is not a good idea. You need more powerful functions. For example, if you are programming in PHP, you need a function like similar_text. More details here: http://www.php.net/manual/en/function.similar-text.php
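If you are not using PHP, most languages have something comparable. As a rough sketch in Python, difflib.SequenceMatcher from the standard library gives a similarity ratio between 0 and 1. It is not the same algorithm as similar_text, and the similarity helper and any cut-off you pick are assumptions you would have to tune on your own data.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1]; 1.0 means the strings are identical."""
    return SequenceMatcher(None, a, b).ratio()


if __name__ == "__main__":
    target = "references"
    for word in ["references", "refrences", "refernces", "refernece", "refer"]:
        # Print the score so a sensible threshold can be chosen by eye.
        print(word, round(similarity(target, word), 3))
```

Unlike the regex, a score like this also reacts to swapped letters, so "refernece" still rates well above "refer".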
Basically you are asking that (in pseudo code):
input == "references" or (levenshtein("references", input) == 1 and length(input) == (length("references") - 1))
Levenshtein distance is defined as the minimum number of edits needed to transform one string into the other, with the allowable edit operations being insertion, deletion, or substitution of a single character.
Since you only want to detect strings where a single character was skipped, you must add the constraint on the string length.
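The pseudo code above could be turned into something runnable along these lines. This is only a sketch in Python: levenshtein is a plain dynamic-programming implementation written for the example, and is_word_or_one_skip is a hypothetical helper name, not something from a library.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions or substitutions turning a into b."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def is_word_or_one_skip(candidate: str, target: str = "references") -> bool:
    """True for the exact word or for the word with exactly one letter missing."""
    if candidate == target:
        return True
    # One character shorter plus distance 1 means the single edit was a deletion.
    return (len(candidate) == len(target) - 1
            and levenshtein(target, candidate) == 1)


if __name__ == "__main__":
    for word in ["references", "refrences", "refernces", "refer", "refernece"]:
        print(word, is_word_or_one_skip(word))
```

The length check is what rules out substitutions and insertions, which also have distance 1 but do not make the string shorter.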