At my Drupal website users can rate each other and those timestamped ratings are stored in the pref_rep table:
# select id, nice, last_rated from pref开发者_C百科_rep where nice=true
order by last_rated desc limit 7;
id | nice | last_rated
------------------------+------+----------------------------
OK152565298368 | t | 2011-07-07 14:26:38.325716
OK452217781481 | t | 2011-07-07 14:26:10.831353
OK524802920494 | t | 2011-07-07 14:25:28.961652
OK348972427664 | t | 2011-07-07 14:25:17.214928
DE11873 | t | 2011-07-07 14:25:05.303104
OK335285460379 | t | 2011-07-07 14:24:39.062652
OK353639875983 | t | 2011-07-07 14:23:33.811986
Also I keep the gender of each user in the pref_users table:
# select id, female from pref_users limit 7;
id | female
----------------+--------
OK351636836012 | f
OK366097485338 | f
OK251293359874 | t
OK7848446207 | f
OK335478250992 | t
OK355400714550 | f
OK146955222542 | t
I'm trying to create 2 Drupal blocks displaying "Miss last month" and "Mister last month", but my question is not about Drupal, so please don't move it to drupal.stackexchange.com ;-)
My question is about SQL: how could I find the user with the highest count of nice - and that for the last month? I would have 2 queries - one for female and one for non-female.
Using PostgreSQL 8.4.8 / CentOS 5.6 and SQL is sometimes so hard :-)
Thank you! Alex
UPDATE:
I've got a nice suggestion to cast timestamps to strings in order to find records for the last month (not for the last 30 days)
UPDATE2:
I've ended up doing string comparison:
select r.id,
count(r.id),
u.first_name,
u.avatar,
u.city
from pref_rep r, pref_users u where
r.nice=true and
to_char(current_timestamp - interval '1 month', 'IYYY-MM') =
to_char(r.last_rated, 'IYYY-MM') and
u.female=true and
r.id=u.id
group by r.id , u.first_name, u.avatar, u.city
order by count(r.id) desc
limit 1
Say you run it once on the first day of the month, and cache the results, since counting votes on every page is kinda useless.
First some date arithmetic :
SELECT now(),
date_trunc( 'month', now() ) - '1 MONTH'::INTERVAL,
date_trunc( 'month', now() );
now | ?column? | date_trunc
-------------------------------+------------------------+------------------------
2011-07-07 16:24:38.765559+02 | 2011-06-01 00:00:00+02 | 2011-07-01 00:00:00+02
OK, we got the bounds for the "last month" datetime range. Now we need some window function to get the first rows per gender :
SELECT * FROM (
SELECT *, rank( ) over (partition by gender order by score desc )
FROM (
SELECT user_id, count(*) AS score FROM pref_rep
WHERE nice=true
AND last_rated >= date_trunc( 'month', now() ) - '1 MONTH'::INTERVAL
AND last_rated < date_trunc( 'month', now() )
GROUP BY user_id) s1
JOIN users USING (user_id)) s2
WHERE rank=1;
Note this can give you several rows in case of ex-aequo.
EDIT :
I've got a nice suggestion to cast timestamps to strings in order to find records for the last month (not for the last 30 days)
date_trunc() works much better.
If you make 2 queries, you'll have to make the count() twice. Since users can potentially vote many times for other users, that table will probably be the larger one, so scanning it once is a good thing.
You can't "leave joining back onto the users table to the outer part of the query too" because you need genders...
Query above takes about 30 ms with 1k users and 100k votes so you'd definitely want to cache it.
精彩评论