I'm working on my first assignment using SQL on our class' PostgreSQL server. A sample database has the (partial here) schema:
CREATE TABLE users (
id int PRIMARY KEY,
userStatus varchar(100),
userType varchar(100),
userName varchar(100),
email 开发者_如何学Cvarchar(100),
age int,
street varchar(100),
city varchar(100),
state varchar(100),
zip varchar(100),
CONSTRAINT users_status_fk FOREIGN KEY (userStatus) REFERENCES userStatus(name),
CONSTRAINT users_types_fk FOREIGN KEY (userType) REFERENCES userTypes(name)
);
CREATE TABLE events (
id int primary key,
title varchar(100),
edate date,
etime time,
location varchar(100),
user_id int, -- creator of the event
CONSTRAINT events_user_fk FOREIGN KEY (user_id) REFERENCES users(id)
);
CREATE TABLE polls (
id int PRIMARY KEY,
question varchar(100),
creationDate date,
user_id int, --creator of the poll
CONSTRAINT polls_user_fk FOREIGN KEY (user_id) REFERENCES users(id)
);
and a bunch of sample data (in particular, 127 sample users).
I have to write a query to find the number of polls created by a user within the past year, as well as the number of events created by a user that occurred in the past year. The trick is, I should have rows with 0s for both columns if the user had no such polls/events.
I have a query which seems to return the correct data, but only for 116 of the 127 users, and I cannot understand why the query is trimming these 11 users, when the WHERE clause only checks attributes of the poll/event. Following is my query:
SELECT u.id, u.userStatus, u.userType, u.email, -- Return user details
COUNT(DISTINCT e.id) AS NumEvents, -- Count number of events
COUNT(DISTINCT p.id) AS NumPolls -- Count number of polls
FROM (users AS u LEFT JOIN events AS e ON u.id = e.user_id) LEFT JOIN polls AS p ON u.id = p.user_id
WHERE (p.creationDate IS NULL OR ((now() - p.creationDate) < INTERVAL '1' YEAR) OR -- Only get polls created within last year
e.edate IS NULL OR ((now() - e.edate) < INTERVAL '1' YEAR)) -- Only get events that happened during last year
GROUP BY u.id, u.userStatus, u.userType, u.email;
Any help would be much appreciated.
Using a different query seemed to work. Here's what I ended up with:
SELECT u.id, u.userStatus, u.userType, u.email, COUNT(DISTINCT e.id) AS numevents, COUNT(DISTINCT p.id) AS numpolls
FROM users AS u LEFT OUTER JOIN (SELECT * FROM events WHERE ((now() - edate) < INTERVAL '1' YEAR)) AS e ON u.id = e.user_id
LEFT OUTER JOIN (SELECT * FROM polls WHERE ((now() - creationDate) < INTERVAL '1' YEAR)) AS p ON u.id = p.user_id
GROUP BY u.id, u.userStatus, u.userType, u.email
;
Try to avoid using DISTINCT with sub-queries for example.
精彩评论