开发者

MySQL difference in two columns

开发者 https://www.devze.com 2023-03-03 09:05 出处:网络
I h开发者_JS百科ave a table with two columns A and B, they contain values of same domain, let\'s say I want to find the values that only appears in A but not in B, i.e

I h开发者_JS百科ave a table with two columns A and B, they contain values of same domain, let's say I want to find the values that only appears in A but not in B, i.e

A has 1,2,3 B has 2,3

so my answer will be A - B = {1}

so the SQL query I wrote is

SELECT DISTINCT(A) 
  FROM DB
 WHERE A NOT IN (SELECT DISTINCT(B) 
                   FROM DB);

It doesn't work and taking forever to response (8000 records approx), does anyone have a solution similar to the above? Thanks.

I found some thread e.g MySQL: difference of two result sets but doesn't really do what I want to do thou.


How does the below perform? If you're getting poor performance, ensure you've indexes on both columns.

SELECT DISTINCT DB1.A
FROM            DB DB1
LEFT JOIN       DB DB2
ON              DB2.B = DB1.A
WHERE           DB2.B IS NULL


SELECT DISTINCT d1.A 
FROM DB d1 LEFT JOIN DB d2 ON d1.A=d2.B 
WHERE de.B IS NULL

ALTER TABLE DB ADD INDEX a_idx(A);

ALTER TABLE DB ADD INDEX b_idx(B);


When working with MySQL you have to remember that it has no ways of doing this query fast without indexes, because it does not support :

  • hash aggregation
  • hash joins
  • merge joins
  • hash IN() checks
  • etc
  • do not think about doing this :
    EXPLAIN ANALYZE SELECT DISTINCT a FROM t WHERE a NOT IN (SELECT b FROM t);
    HashAggregate  (actual time=7.283..7.428 rows=827 loops=1)
    ->  Seq Scan on t (actual time=4.159..6.778 rows=1697 loops=1)
        Filter: (NOT (hashed SubPlan 1))
          SubPlan 1
          ->  Seq Scan on t  (actual time=0.006..1.497 rows=8000 loops=1)
     Total runtime: 6.840 ms

Also MySQL (depending on the version) is more or less braindead regarding subqueries, although it tend to get better.

  • create an index on column A
  • create an index on column B

Try :

SELECT DISTINCT t1.a 
FROM table t1 
LEFT JOIN table t2 ON (t1.a=t2.b)
WHERE t2.b IS NULL

Or :

SELECT DISTINCT a 
FROM table t1 
WHERE NOT EXISTS( 
 SELECT * FROM t2 WHERE t2.b=t1.a
)
0

精彩评论

暂无评论...
验证码 换一张
取 消