I have 2 tables, called login_log, logs the timestamp of each email that 开发者_开发知识库logs into a website. The other table is called admin and contains administration rights. Both have the email as a unique identifier.
I want to get a list of all emails that have not logged-in in the past 90 days. The problem is the login_log table simply logs every email that logs in with the timestamp, it doesn't store a most recent log in just a list of times where the user logged in. So I can easily get a list of the users to keep and using the 'NOT' keyword those I don't want to keep. But it runs really slow using the 'NOT IN' syntax. So the below statement has a sub-query that grabs all emails in the last 90 days that I want to keep, and the outer grabs all the email I don't want.
SELECT distinct a.email FROM admin a WHERE a.email NOT IN (
SELECT distinct a.email FROM admin a
INNER JOIN login_log ll ON a.email = ll.email AND
(ll.timestamp > UNIX_TIMESTAMP() - 7776000) /* 90 days in seconds */
);
So my question is what would be a good method of changing this into a JOIN or some other optimized query?
This will return all emails without login in the last 90 days:
select distinct a.email, last_login
from admin a
inner join (
select email, max(timestamp) as last_login
from login_log
group by email
) ll
on a.email = ll.email
where last_login < unix_timestamp() - 7776000
An index on login_log.email would speed it up.
EDIT:
This could be faster:
select distinct a.email
from admin a
left outer join (
select email
from login_log
where timestamp >= unix_timestamp() - 7776000
) ll
on a.email = ll.email
where ll.timestamp is null
Try using HAVING:
SELECT distinct a.email FROM admin a
LEFT JOIN
(SELECT distinct a.email FROM admin a
INNER JOIN login_log ll ON a.email = ll.email
AND (ll.timestamp > UNIX_TIMESTAMP() - 7776000)
) as tmp ON tmp.email = admin.email
HAVING tmp.email IS NULL;
Although this still has a sub-select, it is only calculated once, instead of once per record in admin. It should improve performance significantly.
精彩评论