I am learning SQL for a personal projects and seems that I don't quite get the COUNT function.
I 开发者_开发问答have a "sample" table with this sample data:
- NAME COLOR
- Tom red
- Tom blue
- Jerry yellow
- Keri yellow
- Paul red
- Bob yellow
- Bob red
- Mary green
What I am attempted to do is print out only those NAME values that have only one COLOR value which is yellow.
Here is the query I wrote but Bob is coming out, which is a mistake.
SELECT COUNT(NAME),NAME
FROM SAMPLE
WHERE (COLOR = 'yellow')
HAVING COUNT(*) = 1
GROUP BY NAME;
May someone tell me what I am doing incorrectly?
Thanks.
Try this:
SELECT COUNT(NAME),NAME
FROM SAMPLE
GROUP BY NAME
HAVING COUNT(*) = 1 AND MAX(COLOR) = 'yellow';
As @paxdiablo said, you need to leave the rows in the group until after you do the group by, so the count will be accurate. Then you can test for 'yellow'
in the HAVING
clause.
Even though it may seem redundant to use MAX()
like I did in the above example, it's good form because any expression in the HAVING
clause should use group-oriented functions. HAVING
restricts groups whereas WHERE
restricts rows.
It's because your where
clause is limiting the result set before the having
clause is checking.
Hence you are stripping out bob red
so that the only bob
left is the yellow
one. And it will have a count of 1.
This one works for me (albeit in DB2 but since I tend to prefer standard SQL, it should work on any DBMS):
select count(a.name), a.name
from sample a,
(select name from sample where color = 'yellow') b
where a.name = b.name
group by a.name
having count(a.name) = 1;
Yellow returns (no Bob):
--------
NAME
--------
1 Jerry
1 Keri
while red returns (no Tom or Bob):
-------
NAME
-------
1 Paul
The way this works is as follows:
- A subquery is run to get a list of all names that have the color yellow. They can also have other colors at this point. This restricts the names to Jerry, Keri and Bob.
- Then the "real" query is run getting the list of all names but only when they match one of the names in the subquery (so limiting it to names that have yellow).
- This is grouped by the name and we use the
count
aggregate function to combine rows with the same name and give us a count of the colors for each name. - Lastly we throw away those that have more than one color.
I'm assuming here that you won't have a row in the table with a duplicate name and color - in other words, you should have a primary key or other constraint on (name,color). If you did have duplicates, then the cross-join would result in more rows and you would have to use distinct
or group by
in the subquery).
Another method using an analytic function:
SELECT NAME
FROM (
SELECT NAME,
COLOR,
COUNT(*) OVER () ROWS_PER_NAME
FROM SAMPLE )
WHERE COLOR = 'yellow' AND
ROWS_PER_NAME = 1
Also, if very few NAME's had the color yellow I would try:
SELECT NAME,
COLOR
FROM SAMPLE P
WHERE COLOR = 'yellow' AND
NOT EXISTS (
SELECT null
FROM SAMPLE C
WHERE C.NAME = P.NAME
AND COLOR != 'yellow')
精彩评论