I have a table like
date user_id page_id
2010-06-19 16:00:00 1 4
2010-06-19 16:00:00 3 4
2010-06-20 07:10:00 1 1
2010-06-20 12:00:10 1 2
2010-06-20 12:00:10 1 3
2010-06-20 13:05:00 2 1
2010-06-20 14:10:00 3 1
2010-06-21 17:00:00 开发者_JS百科 2 1
I want to write a query that will return the last page_id for those users who haven't visited in the last day.
So, I can find who hasn't visited in the last day with:
SELECT user_id, MAX(page_id)
FROM page_views GROUP BY user_id
HAVING MAX(date) < DATE_SUB(NOW(), INTERVAL 1 DAY);
However, how can I find the last viewed page_id for these users? i.e. I want to know which page_id corresponds to the value in the same row as MAX(date). In the case where there are multiple page views per date, I can just select the MAX(page_id).
The expected output from above should be (if NOW() returns 2010-06-21 18:00:00):
user_id page_id
1 3
3 1
- user_id 1 last visited over a day ago at 2010-06-20 12:00:10, and the MAX(page_id) was 3.
- user_id 2 last visited less than a day ago, so they are ignored.
- user_id 3 last visited over a day ago, and their most recent page_id was 1.
How can I achieve this? I need to use only SQL. I'm using a MySQL derivative that requires all columns in the SELECT clause to be declared in the GROUP BY clause (it's a little more standards compliant).
Thanks.
I could see different approaches. For example:
select a.user_id, a.page_id
from page_views a
inner join (SELECT user_id, MAX(date) as date
FROM page_views GROUP BY user_id
HAVING MAX(date) < DATE_SUB(NOW(), INTERVAL 1 DAY) ) b on a.user_id = b.user_id
and a.date = b.date
It could be implemented more effective in MS SQL or Oracle with windowed functions.
Another idea:
select a.user_id, a.page_id
from page_views a
where date < DATE_SUB(NOW(), INTERVAL 1 DAY)
and not exist(select 1 from page_views b
where a.user_id = b.user_id and b.date > a.date)
精彩评论