I have two MySQL tables, t1 and t2, with 1M and 15M rows respectively. Table t1 has only one field, 'tel'; t2 has many fields, including a 'tel' field. What I want to do is quite simple: delete all the rows in t1 that exist in t2:
DELETE FROM t1 WHERE t1.tel IN (SELECT tel FROM t2)
The problem is that this query never seems to finish. I left it running on an 8-core Xeon workstation, and after 2 days I decided to stop it and look for alternatives. I also tried creating a new table (tt1) and using a LEFT OUTER JOIN to insert only the rows from t1 that are not in t2, but it seems to take the same amount of time. The 'tel' field is the primary key in t1 and a unique key in t2 (I also tried CREATE INDEX t2tel ON t2(tel), but it didn't help).
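A minimal sketch of that new-table approach, written so that tt1 receives the rows of t1 with no match in t2. The VARCHAR(20) column type is an assumption, since the actual type of 'tel' is not stated:

```sql
-- Assumed column type; adjust to match the real definition of t1.tel.
CREATE TABLE tt1 (
    tel VARCHAR(20) NOT NULL PRIMARY KEY
);

-- Anti-join: keep only the t1 rows that have no matching tel in t2.
-- Unmatched rows come back from the LEFT OUTER JOIN with t2.tel NULL.
INSERT INTO tt1 (tel)
SELECT t1.tel
FROM t1
LEFT OUTER JOIN t2
    ON t1.tel = t2.tel
WHERE t2.tel IS NULL;
```

If this finishes, tt1 holds what t1 would contain after the delete.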
Any suggestions? I'm considering writing a C# program to load both tables into ordered arrays or hashes and do it in code... Thanks in advance.
DELETE t1
FROM t1
INNER JOIN t2
    ON t1.tel = t2.tel;
That should be significantly faster than using a subquery. If your MySQL instance is not already tuned for large tables, there are quite a few steps you could take. Ample key buffers are a good start (key_buffer_size for MyISAM tables; for InnoDB the analogous setting is innodb_buffer_pool_size). Beyond that, you'd be best off searching for MySQL performance tuning guides.
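Before running the delete, it may be worth checking that the join can actually use the indexes on 'tel'. The following is only a sketch of how one might look: it runs EXPLAIN on the equivalent SELECT and then reads the current buffer settings, without changing anything.

```sql
-- Show the plan for the same join the DELETE would perform.
-- The lookup side of the join should list a key and a type such as
-- eq_ref or ref; type ALL on both tables would suggest that no index
-- is being used for the match.
EXPLAIN
SELECT t1.tel
FROM t1
INNER JOIN t2
    ON t1.tel = t2.tel;

-- Inspect the current buffer sizes (read-only).
SHOW VARIABLES LIKE 'key_buffer_size';
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
```

If the plan shows no key despite the indexes existing, one thing to check is whether t1.tel and t2.tel have the same column type and collation, since a mismatch can prevent index lookups.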
I think the performance problem comes from using a query inside a query; you are better off using a join. I ran a test with two simple, small tables and used this:
DELETE t1 FROM t1 INNER JOIN t2 ON t1.id = t2.t1_id;
It worked for me; I hope this helps.