
Duplicate Entries MySQL with large data?

https://www.devze.com 2023-03-11 08:22 (source: web)

I'm using the following query to try to identify duplicate data in a table with around 10 million rows:

SELECT *
FROM db.tablename x
JOIN db.tablename z
  ON x.columnA = z.columnA
WHERE x.columnB > z.columnB

The problem is that the table doesn't have a primary key, because the column that would be the primary key contains duplicate values. The above query is hugely slow, and I can't figure out any way to make it more efficient.

Adding LIMIT 100 doesn't seem to help either.

Any ideas?


To find the duplicate values you could:

select   columnA
from     tablename
group by columnA
having   count(*) > 1

Depending on what you then want to do you could put this in a temporary table.
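A minimal, runnable sketch of that duplicate-finding pattern, using Python's sqlite3 in place of MySQL for illustration (the table name and sample rows are placeholders, not from the original data):

```python
import sqlite3

# In-memory database standing in for the 10M-row MySQL table in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (columnA INTEGER, columnB INTEGER)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [(1, 10), (1, 20), (2, 30), (3, 40), (3, 50), (3, 60)],
)

# Same GROUP BY / HAVING pattern as above: any columnA value that
# appears more than once is a duplicated "key".
dupes = conn.execute(
    """
    SELECT columnA, COUNT(*) AS n
    FROM tablename
    GROUP BY columnA
    HAVING COUNT(*) > 1
    """
).fetchall()

print(sorted(dupes))  # [(1, 2), (3, 3)]
```

This only scans the table once, unlike the self-join, which compares every row against every other row sharing the same columnA.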

But it seems strange that you have a concept of identity (even if it's not 100% correct all the time) and yet no index on the data. Would you not want to do lookups on this field fairly often? Perhaps you could create a non-unique index on columnA, at least while you run your query.
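To sketch the index suggestion (again with sqlite3 as a stand-in; in MySQL the equivalent would be `CREATE INDEX idx_columnA ON db.tablename (columnA)`, and `idx_columnA` is a hypothetical name):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (columnA INTEGER, columnB INTEGER)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [(1, 10), (1, 20), (2, 30)],
)

# Non-unique index on the join column: with it, the self-join below
# becomes an index lookup per row rather than a scan per row.
conn.execute("CREATE INDEX idx_columnA ON tablename (columnA)")

# The original self-join: pairs sharing columnA where x.columnB > z.columnB.
rows = conn.execute(
    """
    SELECT x.columnA, x.columnB, z.columnB
    FROM tablename x
    JOIN tablename z ON x.columnA = z.columnA
    WHERE x.columnB > z.columnB
    """
).fetchall()

print(rows)  # [(1, 20, 10)]
```

Without an index on columnA, MySQL has to do a full scan of the inner table for every outer row, which is why the join over 10 million rows is so slow; the index turns each probe into a cheap lookup.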
