Similar questions were indeed asked, but I didn't find an answer.
I have a MySQL table with 3 non-unique fields, and I don't want duplicate rows. That is, ("a", "b", "c") and ("a", "dasd", "dfsd") are okay (I don't mind having "a" twice in the first field), but having ("a", "b", "c") twice is wrong.
I need a query which will remove duplicates, leaving only one row for each row group.
Edit: This has already been covered on SO before.
One approach would be to create a new table based on the existing table. You could do this through something like:
CREATE TABLE myNewTable SELECT DISTINCT * FROM myOldTable;
Then you could clear the old table's data and create a unique constraint on the fields you don't want duplicated (here, all three):
TRUNCATE TABLE myOldTable;
ALTER TABLE myOldTable
ADD UNIQUE (field1, field2, field3);
Then insert your data back into the original table. Because you created myNewTable using DISTINCT, you should not have any duplicates.
INSERT INTO myOldTable SELECT * FROM myNewTable;
Note: This assumes the table has a primary key in addition to Column1, Column2 and Column3, and that the last row of each group should be kept. It is useful when the table holds other information besides those three columns.
It keeps the row with the highest primary key for each distinct combination of Column1, Column2, Column3 and deletes the rest.
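Insert the result of the query below into a temp table:
CREATE TEMPORARY TABLE TEMPTABLE AS
SELECT MAX(PrimaryKey) AS PrimaryKey
FROM TABLENAME
GROUP BY Column1, Column2, Column3;
Then delete every row whose key is not in that temp table:
DELETE FROM TABLENAME WHERE PrimaryKey NOT IN (SELECT PrimaryKey FROM TEMPTABLE);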
If we have only these 3 columns, then
- Save distinct in temp table
- truncate original table
- insert back into original from temp table.
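The three steps above could be sketched roughly as follows. This is only an illustration: `distinct_rows` is a hypothetical name for the temp table, and it assumes the table really contains nothing but Column1, Column2 and Column3.

```sql
-- Sketch only; distinct_rows is a hypothetical temp table name.
-- 1. Save one copy of each distinct row.
CREATE TEMPORARY TABLE distinct_rows AS
SELECT DISTINCT Column1, Column2, Column3
FROM TABLENAME;

-- 2. Empty the original table.
TRUNCATE TABLE TABLENAME;

-- 3. Copy the de-duplicated rows back.
INSERT INTO TABLENAME (Column1, Column2, Column3)
SELECT Column1, Column2, Column3
FROM distinct_rows;

-- Clean up the temp table.
DROP TEMPORARY TABLE distinct_rows;
```

Note that TRUNCATE cannot be rolled back in MySQL, so it may be worth checking the row count of the temp table before emptying the original.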
You can retrieve a list of the duplicates like this:
SELECT field1, field2, field3, count(*) AS cnt
FROM yourtable
GROUP by field1, field2, field3
HAVING (cnt > 1)
You'll then have to delete the duplicate rows in subsequent separate queries.
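As one possible shape for those follow-up queries, here is a sketch that assumes the query above reported the group ('a', 'b', 'c') with cnt = 3. The values and the count are made up for illustration. It relies on MySQL's LIMIT clause for DELETE to remove cnt - 1 rows and leave one behind.

```sql
-- Sketch only: the values and the LIMIT come from one hypothetical result row
-- (field1 = 'a', field2 = 'b', field3 = 'c', cnt = 3).
DELETE FROM yourtable
WHERE field1 = 'a'
  AND field2 = 'b'
  AND field3 = 'c'
LIMIT 2;  -- cnt - 1, so exactly one row of the group survives
```

Without an ORDER BY, which of the identical rows get removed is arbitrary. That does not matter if the three fields are the whole row.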
I would solve this with a temporary table and a subquery that finds the rows to erase. This only works if your table 'yourTable', with the fields f1, f2, f3, also has a unique ID field.
Create the temporary table to store the IDs of the rows to erase:
CREATE TEMPORARY TABLE ids (ID int);
Find the IDs of the elements to erase:
INSERT INTO ids(ID) SELECT ID FROM yourTable AS t
WHERE 1 != (SELECT COUNT(*) FROM yourTable
WHERE yourTable.ID <= t.ID
AND yourTable.f1 = t.f1
AND yourTable.f2 = t.f2
AND yourTable.f3 = t.f3);
Delete the rows whose IDs were selected in the previous step:
DELETE yourTable FROM yourTable,ids WHERE yourTable.ID = ids.ID;
Remove the temporary table:
DROP TABLE ids;
If MySQL allowed a DELETE to use a subquery that selects from the same table it is deleting from, we could do all of this in a single query, but it does not, so we need to go through a temporary table.
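As a possible alternative that sidesteps the subquery restriction, MySQL's multi-table DELETE syntax accepts a self-join. The sketch below assumes the same yourTable with a unique ID and fields f1, f2, f3. Like the temporary-table version, it keeps the row with the lowest ID in each group.

```sql
-- Sketch only, assuming yourTable(ID, f1, f2, f3) with a unique ID.
-- A row t1 is deleted when another row t2 has the same three fields
-- and a smaller ID, so only the lowest-ID row of each group remains.
DELETE t1
FROM yourTable AS t1
JOIN yourTable AS t2
  ON  t1.f1 = t2.f1
  AND t1.f2 = t2.f2
  AND t1.f3 = t2.f3
  AND t1.ID > t2.ID;
```

Rows where any of the three fields is NULL are never matched by `=`, so they are left untouched, which is the same behaviour as the subquery version.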
To prevent duplicates from reappearing, I would make the three fields the composite primary key of the table, like this:
ALTER TABLE yourTable ADD PRIMARY KEY (f1, f2, f3);
You can only alter the table this way after all the duplicates have been removed. Once the table is altered, subsequent inserts with duplicate values will fail.
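A table can have only one primary key, so if ID is already the primary key, a unique index gives the same protection. The following is a sketch under that assumption. The index name `uq_f1_f2_f3` is hypothetical, and the inserts assume ID is filled in automatically (for example by AUTO_INCREMENT).

```sql
-- Sketch only; uq_f1_f2_f3 is a hypothetical index name.
ALTER TABLE yourTable ADD UNIQUE KEY uq_f1_f2_f3 (f1, f2, f3);

-- The first insert succeeds (assuming no such row exists yet).
INSERT INTO yourTable (f1, f2, f3) VALUES ('a', 'b', 'c');

-- A plain INSERT of the same values would now fail with a duplicate-key error.
-- INSERT IGNORE skips the duplicate row instead of raising the error.
INSERT IGNORE INTO yourTable (f1, f2, f3) VALUES ('a', 'b', 'c');
```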