I'm trying to run the following MySQL query:
SELECT *
FROM user u
JOIN user_categories uc ON u.user_id = uc.user_id
WHERE (uc.category_id = 3 OR uc.category_id = 1)
It currently returns:
Joe,Smith,606开发者_如何转开发57,male
Joe,Smith,60657,male
Mickey,Mouse,60613,female
Petter,Pan,60625,male
Petter,Pan,60625,male
Donald,Duck,60615,male
If the user belongs to both categories it currently returns them twice. How can I return the user only once without using SELECT DISTINCT, regardless of how many categories they belong to?
You need a semi join. This can be achieved with a sub query.
SELECT *
FROM user u
WHERE EXISTS(SELECT *
FROM user_categories uc
WHERE u.user_id = uc.user_id AND
uc.category_id IN(1,3))
In MySQL the performance of sub queries is quite problematic however so a JOIN
and duplicate elimination via DISTINCT
or GROUP BY
may perform better.
I don't know about MySQL, but in Postgres you may get better performance in the semi-join version from
SELECT * FROM user u
WHERE u.user_id
IN (SELECT user_id FROM user_categories uc WHERE uc.category_id IN (1,3));
I would expect SELECT DISTINCT
to run fastest but I have learned my expectations and DB performance are often much different!
Try using a GROUP BY
SELECT * FROM user u
JOIN user_categories uc ON u.user_id = uc.user_id
WHERE uc.category_id = 3 OR uc.category_id = 1
GROUP BY u.user_id
精彩评论