I have a query as follows:
SELECT * FROM
(SELECT * FROM persons ORDER BY date DESC) AS p
GROUP BY first_name,last_name,work_phone
If you hadn't figured it out already, this removes entries with duplicate names and work phone numbers, leaving only the most recent. There is another field in the persons table you should know about, a binary field called DELETED.
The problem is, if there is a duplicate of this nature, I don't want a row to be considered if its DELETED value is TRUE regardless of how recent its date value is. However, if a row has no duplicates it should be included in the results no matter what DELETED value it has.
If duplicates exist, there is never a case where all of them have DELETED = TRUE; at least one will not be deleted.
SELECT * FROM
(SELECT * FROM persons ORDER BY deleted ASC, date DESC) AS p
GROUP BY first_name,last_name,work_phone
Here's how I understand the problem:
- You have a persons table with fields first_name, last_name, work_phone, deleted, and a bunch of others.
- Any records that have the same first and last name, as well as work phone, should be considered duplicates.
- The most recent undeleted duplicate should be used, or just the most recent if they've all been deleted.
Here's a rough sketch of how I would approach the problem:
- Select the distinct first_name, last_name, and work_phone values in a subquery.
- Left join that to the most recent undeleted record for each combination in another subquery.
- Left join that to the most recent record for each combination in a third subquery.
- Use coalesce to pull out the values from the second or third subquery, whichever is not null.
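The steps above could look roughly like the sketch below. Treat it as an illustration rather than a drop-in answer: it runs the SQL against an in-memory SQLite database through Python's sqlite3 module so it can be tried anywhere, the sample rows and the helper names make_demo_db and pick_rows are made up, and it assumes that date values are unique within each group and that the three key columns are never NULL. The final join back to persons is my own addition, so that whole rows come out instead of just the chosen date.

```python
import sqlite3

# The SQL follows the four steps of the sketch; only plain joins, GROUP BY
# and COALESCE are used, so it should port to other databases with little change.
QUERY = """
SELECT p.first_name, p.last_name, p.work_phone, p.date, p.deleted
FROM (
    -- step 1: every distinct name/phone combination
    SELECT DISTINCT first_name, last_name, work_phone FROM persons
) AS k
LEFT JOIN (
    -- step 2: latest date among undeleted rows, per combination
    SELECT first_name, last_name, work_phone, MAX(date) AS latest
    FROM persons
    WHERE deleted = 0
    GROUP BY first_name, last_name, work_phone
) AS u
  ON u.first_name = k.first_name
 AND u.last_name = k.last_name
 AND u.work_phone = k.work_phone
LEFT JOIN (
    -- step 3: latest date among all rows, per combination
    SELECT first_name, last_name, work_phone, MAX(date) AS latest
    FROM persons
    GROUP BY first_name, last_name, work_phone
) AS a
  ON a.first_name = k.first_name
 AND a.last_name = k.last_name
 AND a.work_phone = k.work_phone
JOIN persons AS p
  ON p.first_name = k.first_name
 AND p.last_name = k.last_name
 AND p.work_phone = k.work_phone
 -- step 4: prefer the undeleted date, fall back to the overall latest
 AND p.date = COALESCE(u.latest, a.latest)
 -- when an undeleted row exists, never pick a deleted row with the same date
 AND (u.latest IS NULL OR p.deleted = 0)
ORDER BY p.last_name, p.first_name
"""


def make_demo_db():
    """Build an in-memory persons table with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE persons ("
        "first_name TEXT, last_name TEXT, work_phone TEXT, "
        "date TEXT, deleted INTEGER)"
    )
    conn.executemany(
        "INSERT INTO persons VALUES (?, ?, ?, ?, ?)",
        [
            ("Ann", "Lee", "555-0001", "2020-01-01", 0),  # older, undeleted
            ("Ann", "Lee", "555-0001", "2021-01-01", 1),  # newer, deleted
            ("Bob", "Ray", "555-0002", "2019-05-05", 1),  # no duplicate, deleted
            ("Cy", "Poe", "555-0003", "2018-01-01", 0),   # older, undeleted
            ("Cy", "Poe", "555-0003", "2019-01-01", 0),   # newer, undeleted
        ],
    )
    return conn


def pick_rows(conn):
    """Return one row per name/phone combination as a list of tuples."""
    return conn.execute(QUERY).fetchall()


for row in pick_rows(make_demo_db()):
    print(row)
```

With the sample data, Ann Lee's older undeleted row is chosen over her newer deleted one, Bob Ray's only row is kept even though it is deleted, and Cy Poe's newer row is chosen. If two rows in a group can share the same date, the last join would return both, so a unique id column would be a safer tie-breaker.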