I have columns 'names' and 'ids' in table "OG" and want to find pairs of names where the last letter is different and the total edit distance is two. So far I have:
SELECT
z1.names AS names1, z2.names AS names2, z1.ids, z2.ids
FROM (SELECT t.ids, t.names, SUBSTRING(t.names FOR LENGTH(t.names) - 1) AS newnames
FROM "OG" t) z1, (SELECT r.ids, r.names, SUBSTRING(r.names FOR LENGTH(r.names) - 1) AS
newnames1 FROM "OG" r) z2
WHERE levenshtein(z1.newnames, z2.newnames1) = 2 AND z1.ids != z2.ids
Unfortunately, this doesn't ensure the last letters are different. Any ideas for a fix?
Check the last characters as well:
WHERE levenshtein(z1.newnames, z2.newnames1) = 2 AND z1.ids != z2.ids
AND substring(z1.names, length(z1.names)) <> substring(z2.names, length(z2.names))
Note that using SUBSTRING(t.names FOR LENGTH(t.names) - 1)
in your query will fail with a negative substring length error when the string is empty (not null).
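One possible way around the empty-string problem, as a rough sketch rather than a tested solution: PostgreSQL 9.1 and later provide left() and right(), and left(s, -1) returns everything except the last character (an empty string stays empty instead of raising an error). This assumes levenshtein() comes from the fuzzystrmatch extension; the output aliases ids1 and ids2 are made up here for readability.

```sql
-- Assumption: PostgreSQL 9.1+ (left/right) with the fuzzystrmatch extension installed:
-- CREATE EXTENSION IF NOT EXISTS fuzzystrmatch;
SELECT z1.names AS names1,
       z2.names AS names2,
       z1.ids   AS ids1,   -- hypothetical alias
       z2.ids   AS ids2    -- hypothetical alias
FROM "OG" z1
JOIN "OG" z2
  ON z1.ids <> z2.ids
-- Compare everything except the last character; left(s, -1) is safe on ''.
WHERE levenshtein(left(z1.names, -1), left(z2.names, -1)) = 2
  -- Require the last characters to differ.
  AND right(z1.names, 1) <> right(z2.names, 1);
```

Each matching pair shows up twice here (once in each order). Changing the join condition to z1.ids < z2.ids should return each pair only once, if that is what you want.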