SQL query for non-duplicate records

Developer (https://www.devze.com), 2023-01-15 09:54. Source: Internet
I'm attempting to build a query that will return all non-duplicate (unique) records in a table. The query will need to use multiple fields to determine whether records are duplicates.

For example, if a table has the fields PKID, ClientID, Name, AcctNo, OrderDate and Charge, I'd like to use the AcctNo, OrderDate and Charge fields to find unique records.

Table

PKID     ClientID     Name     AcctNo     OrderDate     Charge
1        JX100        John     12345      9/9/2010      $100.00
2        JX220        Mark     55567      9/9/2010       $23.00
3        JX690        Matt     89899      9/9/2010      $218.00
4        JX100        John     12345      9/9/2010      $100.00

The result of the query would need to be:

PKID     ClientID     Name     AcctNo     OrderDate     Charge
2        JX220        Mark     55567      9/9/2010       $23.00
3        JX690        Matt     89899      9/9/2010      $218.00

I've tried using SELECT DISTINCT, but that doesn't work because it keeps one of the duplicate records in the result. I've also tried using HAVING COUNT = 1, but that returns all records.

Thanks for the help.


HAVING COUNT(*) = 1 will work if the GROUP BY includes only the fields you're using to find the unique records. That means leaving PKID out of the GROUP BY; you can still return it with MAX or MIN, since each group in the result set contains exactly one record.


SELECT   MAX(PKID)     AS PKID    ,
         MAX(ClientID) AS ClientID,
         MAX(Name)     AS Name    ,
         AcctNo                   ,
         OrderDate                ,
         Charge
FROM     T
GROUP BY AcctNo   ,
         OrderDate,
         Charge
HAVING   COUNT(*) = 1

or

SELECT PKID      ,
       ClientID  ,
       Name      ,
       AcctNo    ,
       OrderDate ,
       Charge
FROM   YourTable t1
WHERE  NOT EXISTS
       (SELECT *
       FROM    YourTable t2
       WHERE   t1.PKID     <> t2.PKID
       AND     t1.AcctNo    = t2.AcctNo
       AND     t1.OrderDate = t2.OrderDate
       AND     t1.Charge    = t2.Charge
       )
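
To check both queries against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module. The table name orders, the ISO date strings and the helper names make_db and non_duplicates are assumptions made for this example, and an ORDER BY is added only to make the output deterministic. Other database engines may differ in details.

```python
import sqlite3

# Sample rows from the question. The table name "orders" is an assumption.
ROWS = [
    (1, "JX100", "John", "12345", "2010-09-09", 100.00),
    (2, "JX220", "Mark", "55567", "2010-09-09", 23.00),
    (3, "JX690", "Matt", "89899", "2010-09-09", 218.00),
    (4, "JX100", "John", "12345", "2010-09-09", 100.00),
]

GROUP_BY_SQL = """
SELECT   MAX(PKID) AS PKID, MAX(ClientID) AS ClientID, MAX(Name) AS Name,
         AcctNo, OrderDate, Charge
FROM     orders
GROUP BY AcctNo, OrderDate, Charge
HAVING   COUNT(*) = 1
ORDER BY MAX(PKID)
"""

NOT_EXISTS_SQL = """
SELECT PKID, ClientID, Name, AcctNo, OrderDate, Charge
FROM   orders t1
WHERE  NOT EXISTS
       (SELECT 1
        FROM   orders t2
        WHERE  t1.PKID     <> t2.PKID
        AND    t1.AcctNo    = t2.AcctNo
        AND    t1.OrderDate = t2.OrderDate
        AND    t1.Charge    = t2.Charge)
ORDER BY PKID
"""


def make_db():
    """Create an in-memory database holding the sample table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (PKID INTEGER PRIMARY KEY, ClientID TEXT, "
        "Name TEXT, AcctNo TEXT, OrderDate TEXT, Charge REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?, ?)", ROWS)
    return conn


def non_duplicates(sql):
    """Run one of the queries above and return its rows as a list of tuples."""
    conn = make_db()
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in non_duplicates(GROUP_BY_SQL):
        print(row)
```

On this data both queries should return only the rows with PKID 2 and 3. The GROUP BY form touches the table once, while the NOT EXISTS form keeps every original column without wrapping it in an aggregate.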


Simply add:

GROUP BY AcctNo, OrderDate, Charge
HAVING COUNT(1) = 1

The GROUP BY puts all rows with the same AcctNo, OrderDate and Charge into one group; HAVING COUNT(1) = 1 then keeps only the groups that were built from exactly one source row.
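
To see what the grouping does before the HAVING filter is applied, here is a small sketch in Python with the built-in sqlite3 module. The table name orders and the reduced column list are assumptions for illustration; the rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding only the columns that matter for grouping.
conn.execute(
    "CREATE TABLE orders (PKID INTEGER PRIMARY KEY, AcctNo TEXT, "
    "OrderDate TEXT, Charge REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "12345", "2010-09-09", 100.00),
        (2, "55567", "2010-09-09", 23.00),
        (3, "89899", "2010-09-09", 218.00),
        (4, "12345", "2010-09-09", 100.00),
    ],
)

# Step 1: one output row per (AcctNo, OrderDate, Charge) group, with its size.
group_sizes = conn.execute(
    "SELECT AcctNo, COUNT(*) FROM orders "
    "GROUP BY AcctNo, OrderDate, Charge ORDER BY AcctNo"
).fetchall()

# Step 2: the same grouping, keeping only groups of size one.
kept = conn.execute(
    "SELECT AcctNo, OrderDate, Charge FROM orders "
    "GROUP BY AcctNo, OrderDate, Charge HAVING COUNT(*) = 1 ORDER BY AcctNo"
).fetchall()

conn.close()
print(group_sizes)
print(kept)
```

The group for account 12345 has a count of 2, so HAVING drops it; the other two groups have a count of 1 and survive.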


Thanks kekekela for the nudge in the right direction.

Here's the query that produced the result I wanted:

SELECT AcctNo, OrderDate, Charge FROM Table1 GROUP BY AcctNo, OrderDate, Charge
HAVING (COUNT(AcctNo) = 1) AND (COUNT(OrderDate) = 1) AND (COUNT(Charge) = 1);

Or more simplified based on Gus's example:

SELECT AcctNo, OrderDate, Charge FROM Table1 GROUP BY AcctNo, OrderDate, Charge
HAVING COUNT(1) = 1;


You could just drop the PKID to return all records:

SELECT DISTINCT 
           ClientID
         , Name
         , AcctNo
         , OrderDate
         , Charge
FROM       table;

Note: This is slightly different from what you're asking.
It returns a unique set by removing the one non-unique field.
By your example, you're asking to return non-duplicates.

I could only see your example being useful if you're trying
to clean up a table by extracting the "good" records.
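
To make the difference concrete, here is a sketch that runs both approaches on the question's sample rows, using Python's built-in sqlite3 module. The table name orders is an assumption for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (PKID INTEGER PRIMARY KEY, ClientID TEXT, "
    "Name TEXT, AcctNo TEXT, OrderDate TEXT, Charge REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "JX100", "John", "12345", "2010-09-09", 100.00),
        (2, "JX220", "Mark", "55567", "2010-09-09", 23.00),
        (3, "JX690", "Matt", "89899", "2010-09-09", 218.00),
        (4, "JX100", "John", "12345", "2010-09-09", 100.00),
    ],
)

# DISTINCT without PKID collapses the two John rows into one row.
distinct_rows = conn.execute(
    "SELECT DISTINCT ClientID, Name, AcctNo, OrderDate, Charge "
    "FROM orders ORDER BY Name"
).fetchall()

# GROUP BY ... HAVING COUNT(*) = 1 drops the duplicated rows entirely.
non_duplicate_rows = conn.execute(
    "SELECT MAX(ClientID), MAX(Name), AcctNo, OrderDate, Charge "
    "FROM orders GROUP BY AcctNo, OrderDate, Charge "
    "HAVING COUNT(*) = 1 ORDER BY MAX(Name)"
).fetchall()

conn.close()
print([r[1] for r in distinct_rows])
print([r[1] for r in non_duplicate_rows])
```

DISTINCT keeps one copy of John's record, whereas the HAVING form returns only Mark and Matt.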


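Use a window function for the count; then you don't have to aggregate the other fields. Partition by the fields that define a duplicate, and give the derived table an alias:

select * from
(SELECT *,
count(*) over (partition by AcctNo, OrderDate, Charge) as [Count]
from YourTable) t
where [Count]=1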
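
Here is a runnable sketch of the window-function approach on the question's sample rows, partitioned by the three fields the question uses to define a duplicate. It uses Python's built-in sqlite3 module and assumes the bundled SQLite is version 3.25 or later, which is when window functions were added. The table name orders and the column alias cnt are assumptions for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (PKID INTEGER PRIMARY KEY, ClientID TEXT, "
    "Name TEXT, AcctNo TEXT, OrderDate TEXT, Charge REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "JX100", "John", "12345", "2010-09-09", 100.00),
        (2, "JX220", "Mark", "55567", "2010-09-09", 23.00),
        (3, "JX690", "Matt", "89899", "2010-09-09", 218.00),
        (4, "JX100", "John", "12345", "2010-09-09", 100.00),
    ],
)

# Every row keeps all of its columns and gains cnt, the size of its partition.
# The outer query then keeps the rows whose partition holds a single row.
WINDOW_SQL = """
SELECT PKID, ClientID, Name, AcctNo, OrderDate, Charge
FROM  (SELECT *,
              COUNT(*) OVER (PARTITION BY AcctNo, OrderDate, Charge) AS cnt
       FROM   orders) AS t
WHERE  cnt = 1
ORDER BY PKID
"""

window_rows = conn.execute(WINDOW_SQL).fetchall()
conn.close()
print(window_rows)
```

Because nothing is collapsed by a GROUP BY, PKID, ClientID and Name come back unchanged, with no need for MAX or MIN.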


You could determine the non-unique records first, and then select the records that are not in that set, like this:

select * from mytable where pkid not in
(select t1.pkid 
from mytable t1 inner join mytable t2
on t1.pkid <> t2.pkid
and t1.acctno = t2.acctno
and t1.orderdate = t2.orderdate
and t1.charge = t2.charge)

The last part of the inner query lets you adjust the criteria for "equality": add as many columns as you need to compare. Of course, this gets a lot more interesting without that primary key :) In such cases I usually end up creating one.

Ketil
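
For the case without a primary key, one possible alternative to creating a key is to join the table to the set of groups that occur exactly once. This is only a sketch of that idea, not the approach described above; it uses Python's built-in sqlite3 module, and the table name orders_nokey and the alias names are assumptions for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical variant of the question's table with no PKID column.
conn.execute(
    "CREATE TABLE orders_nokey (ClientID TEXT, Name TEXT, AcctNo TEXT, "
    "OrderDate TEXT, Charge REAL)"
)
conn.executemany(
    "INSERT INTO orders_nokey VALUES (?, ?, ?, ?, ?)",
    [
        ("JX100", "John", "12345", "2010-09-09", 100.00),
        ("JX220", "Mark", "55567", "2010-09-09", 23.00),
        ("JX690", "Matt", "89899", "2010-09-09", 218.00),
        ("JX100", "John", "12345", "2010-09-09", 100.00),
    ],
)

# The inner query finds the key combinations that appear exactly once;
# the join then pulls the full rows for those combinations.
NO_KEY_SQL = """
SELECT o.ClientID, o.Name, o.AcctNo, o.OrderDate, o.Charge
FROM   orders_nokey o
JOIN  (SELECT AcctNo, OrderDate, Charge
       FROM   orders_nokey
       GROUP BY AcctNo, OrderDate, Charge
       HAVING COUNT(*) = 1) u
  ON   o.AcctNo    = u.AcctNo
 AND   o.OrderDate = u.OrderDate
 AND   o.Charge    = u.Charge
ORDER BY o.AcctNo
"""

no_key_rows = conn.execute(NO_KEY_SQL).fetchall()
conn.close()
print(no_key_rows)
```

Note that the equality join skips rows where any of the three compared columns is NULL, so this sketch assumes those columns are always populated.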


 SELECT GMPS.gen.ProductDetail.PaperType, GMPS.gen.ProductDetail.Size FROM
 GMPS.gen.ProductDetail GROUP BY GMPS.gen.ProductDetail.PaperType,
 GMPS.gen.ProductDetail.Size
 HAVING COUNT(1) = 1;