开发者

Mysql NOT IN operator performance problems for large result sets?

开发者 https://www.devze.com 2022-12-11 02:49 出处:网络
I have the following two queries: selectcount(*) fromsegmentation_cycle_recipients scr , segmentation_instance si

I have the following two queries:

select   count(*) 
from     segmentation_cycle_recipients scr
         , segmentation_instance si 
where    si.access_code=scr.access_code 
         and si.segment_id is NOT NULL; 

Returns 13429 rows in 0.2 secs

2)

select   count(*) 
from     segmentation_cycle_recipients scr
        , segmentation_instance si, web_pat_info wpi 
where    si.access_code=scr.access_code and scr.siebel_row_id=wpi.siebel_id 
         and si.segment_id is NOT NULL; 

Returns 4003 rows in 0.48 secs

Now, I want 1)-2) So I wrote the following query:

select 开发者_StackOverflow中文版  count(*) 
from     segmentation_cycle_recipients scr
         , segmentation_instance si 
where    si.access_code=scr.access_code 
         and si.segment_id is NOT NULL 
         and scr.siebel_row_id NOT IN (select scr.siebel_row_id 
from     segmentation_cycle_recipients scr
         , segmentation_instance si
         , web_pat_info wpi where si.access_code=scr.access_code 
        and scr.siebel_row_id=wpi.siebel_id and si.segment_id is NOT NULL); 

I was expecting 13429-4003=9426 rows but the query takes forever (have to kill the query command) to execute. It even adds a counter in the "slow queries" listing in mysql>status;)

It returns < 100ms in development environment where the result set is much smaller. So I believe the query itself is right.

I believe, using NOT IN is a known performance problem in Mysql (Oracle has the MINUS operator). Any suggestions on how to improve the performance on this query?


SELECT  COUNT(*)
FROM    segmentation_cycle_recipients scr
JOIN    segmentation_instancs si
ON      si.access_code = scr.access_code
LEFT JOIN
        web_pat_info wpi 
ON      wpi.siebel_id = scr.siebel_row_id
WHERE   wpi.siebel_id IS NULL
        AND si.segment_id is NOT NULL

Make sure that si.access_code and wpi.siebel_id are indexed, and wpi.siebel_id is defined as NOT NULL.

If the latter condition does not hold, replace wpi.siebel_id IS NULL in the WHERE clause with any other column defined as NOT NULL.


You may be better served using the NOT EXISTS clause.

select   count(*) 
from     segmentation_cycle_recipients scr
         , segmentation_instance si 
where    si.access_code=scr.access_code 
         and si.segment_id is NOT NULL 
         and NOT EXISTS (select scr2.siebel_row_id 
from     segmentation_cycle_recipients scr2
         , segmentation_instance si2
         , web_pat_info wpi2 where si2.access_code=scr2.access_code 
        and scr2.siebel_row_id=wpi2.siebel_id and si2.segment_id is NOT NULL
        and scr.siebel_row_id=scr2.siebel_row_id);
0

精彩评论

暂无评论...
验证码 换一张
取 消