Grouping fields that partially match in MySQL

Developer https://www.devze.com 2022-12-17 07:51 Source: Internet

I'm trying to return duplicate records in a user table where the fields only partially match, and the matching field contents are arbitrary. I'm not sure if I'm explaining it well, so here is the query I might run to get the duplicate members by some unique field:

SELECT MAX(id)
FROM members
WHERE 1
GROUP BY some_unique_field
HAVING COUNT(some_unique_field) > 1

I want to apply this same idea to an email field, but unfortunately our email field can contain multiple e-mails separated by a comma. For example, I want a member with his email set to "user@someaddress.com" to be returned as a duplicate of another member that has "user@someaddress.com","someotheruser@someaddress.com" in their field. GROUP BY obviously will not accomplish this as-is.

Something like this might work for you:

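SELECT *
FROM members m1
inner join members m2 on m1.id <> m2.id
    and (
        m1.email = m2.email
        -- CONCAT() is required: in MySQL, + is numeric addition, not string concatenation
        or m1.email like CONCAT('%,', m2.email)
        or m1.email like CONCAT(m2.email, ',%')
        or m1.email like CONCAT('%,', m2.email, ',%')
    )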

It depends on how consistently your email addresses are formatted when a field holds more than one. You might need to modify the query slightly if, for example, there is always a space after the comma, or if the quotes are actually part of your data.
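
If the formatting is not consistent, one possible approach is to normalize both values before comparing them. The following is only a sketch, assuming the table is members(id, email) as in the question, that the only noise in the field is spaces and double quotes, and that stripping them is acceptable. It uses MySQL's FIND_IN_SET(), which looks up a string in a comma-separated list. The column aliases member_id and duplicate_of_id are made up for illustration.

```sql
-- Sketch under stated assumptions: members(id, email), where email is either
-- a single address or a comma-separated list, possibly with spaces and quotes.
SELECT m1.id AS member_id,
       m2.id AS duplicate_of_id
FROM members m1
INNER JOIN members m2
    ON m1.id <> m2.id
   -- FIND_IN_SET() cannot search for a value that itself contains a comma,
   -- so only rows holding a single address are used as the search value.
   AND m1.email NOT LIKE '%,%'
   AND FIND_IN_SET(
           -- strip double quotes and spaces from both sides before matching
           REPLACE(REPLACE(m1.email, '"', ''), ' ', ''),
           REPLACE(REPLACE(m2.email, '"', ''), ' ', '')
       ) > 0;
```

With the example from the question, the member whose field is "user@someaddress.com" would be paired with the member whose field is "user@someaddress.com","someotheruser@someaddress.com". Two members that both hold lists with one address in common are not matched by this sketch, and because the comparison is computed per pair of rows, it is unlikely to use an index on a large table.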

This works for me, though it may not do what you want:

SELECT MAX(ID) FROM members WHERE Email like "%someuser%" GROUP BY Email HAVING COUNT(Email) > 1