I have to find, if there is any row in the table for which value of col2 is duplicate. Like in the specified scenario row 1 and 2 are duplicate as value of col2 is same.
Table
-------------------
Col1 Col2 Col3
1 1 4
2 1 3
3 2 2
4 3 1
At present what i am doing i开发者_如何学Pythons
select count(*) from table group by col2 order by 1 desc
if the value of first row is > 1 then there are duplicates.
Please specify any optimized query for this problem.
Note The table contains trillions of data, and there is no index on col2 [ if that matters to you ]
select * from MyTable where col2 in
(
select col2
from MyTable
group by col2
having count(*) > 1
) x
I think the following would give the right results in ms sql server... maybe in DB2 also
select * from t where col2 in (select col2 from t group by col2 having count(*) > 1)
@Dave K is absolutely right.
You could also do it this way.
I suspect most optimisers will do exactly the same for both queries.
SELECT *
FROM MyTable t1
( SELECT col2
FROM MyTable
GROUP BY col2
HAVING count(*) > 1
) t2
WHERE t1.col2 = t2.col2
If you need to output the results of the duplicate records, you can try the following.
select * from
(
select *
,row_number() over (partition by col2, order by col1, col2) r
from MyTable
)dt
where r = 2
精彩评论