开发者

Mysql Query Optimization, Remove a 'NOT IN (SELECT CLAUSE)' using joins

开发者 https://www.devze.com 2023-03-19 19:51 出处:网络
I have 2 tables, called login_log, logs the timestamp of each email that 开发者_开发知识库logs into a website.The other table is called admin and contains administration rights.Both have the email as

I have 2 tables, called login_log, logs the timestamp of each email that 开发者_开发知识库logs into a website. The other table is called admin and contains administration rights. Both have the email as a unique identifier.

I want to get a list of all emails that have not logged-in in the past 90 days. The problem is the login_log table simply logs every email that logs in with the timestamp, it doesn't store a most recent log in just a list of times where the user logged in. So I can easily get a list of the users to keep and using the 'NOT' keyword those I don't want to keep. But it runs really slow using the 'NOT IN' syntax. So the below statement has a sub-query that grabs all emails in the last 90 days that I want to keep, and the outer grabs all the email I don't want.

SELECT distinct a.email FROM admin a WHERE a.email NOT IN (
    SELECT distinct a.email FROM admin a 
    INNER JOIN login_log ll ON a.email = ll.email AND 
    (ll.timestamp > UNIX_TIMESTAMP() - 7776000)  /* 90 days in seconds */
);

So my question is what would be a good method of changing this into a JOIN or some other optimized query?


This will return all emails without login in the last 90 days:

select distinct a.email, last_login
from admin a 
inner join (
    select email, max(timestamp) as last_login
    from login_log
    group by email
    ) ll
on a.email = ll.email
where last_login < unix_timestamp() - 7776000

An index on login_log.email would speed it up.

EDIT:

This could be faster:

select distinct a.email
from admin a 
left outer join (
    select email
    from login_log
    where timestamp >= unix_timestamp() - 7776000
    ) ll
on a.email = ll.email
where ll.timestamp is null


Try using HAVING:

SELECT distinct a.email FROM admin a
LEFT JOIN 
    (SELECT distinct a.email FROM admin a
     INNER JOIN login_log ll ON a.email = ll.email 
       AND (ll.timestamp > UNIX_TIMESTAMP() - 7776000)
    ) as tmp ON tmp.email = admin.email
HAVING tmp.email IS NULL;

Although this still has a sub-select, it is only calculated once, instead of once per record in admin. It should improve performance significantly.

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号