I am still surprised that such a simple query does not work:
SELECT COUNT(DISTINCT *) FROM dbo.t_test
whereas
SELECT COUNT(DISTINCT col1) FROM dbo.t_test
and
SELECT DISTINCT * FROM dbo.t_test
both work.
What is the alternative?
EDIT: DISTINCT * checks for uniqueness on the combined key of (col1,col2,...) and returns those rows. I expected COUNT(DISTINCT *) to just return the number of such rows. Am I missing anything here?
It doesn't work because you are only allowed to specify a single expression in COUNT(DISTINCT ...), as per the documentation:
COUNT ( { [ [ ALL | DISTINCT ] expression ] | * } )
If you look carefully, you can see that the allowed grammar doesn't include COUNT(DISTINCT *).
The alternative is this:
SELECT COUNT(*) FROM
(
SELECT DISTINCT * FROM dbo.t_test
) T1
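To see the derived-table alternative in action, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption, so the table is called t_test without the dbo schema), and the two-column sample data is made up for illustration.

```python
import sqlite3

# Hypothetical stand-in for dbo.t_test: two columns, with one duplicated row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_test (col1 INTEGER, col2 TEXT)")
conn.executemany(
    "INSERT INTO t_test VALUES (?, ?)",
    [(1, "a"), (1, "a"), (1, "b"), (2, "a")],
)

# Count the rows that survive SELECT DISTINCT * by wrapping it in a derived table.
distinct_rows = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT * FROM t_test) AS T1"
).fetchone()[0]

# For comparison: total rows and distinct values of a single column.
total_rows = conn.execute("SELECT COUNT(*) FROM t_test").fetchone()[0]
distinct_col1 = conn.execute(
    "SELECT COUNT(DISTINCT col1) FROM t_test"
).fetchone()[0]

print(distinct_rows, total_rows, distinct_col1)  # prints: 3 4 2
```

The duplicated (1, 'a') row is counted once, so the derived table gives 3 while COUNT(*) gives 4.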
The truth of the matter is that SQL (Server) or any other SQL implementation is not supposed to do everything under the sun.
There are reasons to limit the SQL syntax to certain elements, from the parsing layer to query optimization to predictability of results to just common sense.
The COUNT aggregate function is normally implemented as a streaming aggregate with a gate for a single item, be it * (record count, just use a static token), colname (increment the token only when not null), or distinct colname (a hash/bucket with one key).
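The three forms described above give different answers as soon as NULLs and duplicates are involved. A small sketch, again assuming sqlite3 as a stand-in for SQL Server and a made-up one-column table:

```python
import sqlite3

# Hypothetical one-column table containing a duplicate value and a NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_test (col1 INTEGER)")
conn.executemany("INSERT INTO t_test VALUES (?)", [(1,), (1,), (2,), (None,)])

# COUNT(*) counts every row, COUNT(col1) skips NULLs,
# and COUNT(DISTINCT col1) skips NULLs and counts each value once.
count_star, count_col, count_distinct = conn.execute(
    "SELECT COUNT(*), COUNT(col1), COUNT(DISTINCT col1) FROM t_test"
).fetchone()

print(count_star, count_col, count_distinct)  # prints: 4 3 2
```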
When you ask for COUNT(DISTINCT *) or, for that matter, COUNT(DISTINCT a,b,c), yes, it could surely be done for you if some RDBMS saw fit to implement it one day; but (1) it is uncommon, (2) it adds work to the parser, and (3) it adds complexity to the COUNT implementation.
Mark has the correct alternative.
In addition to what the others have said:
One thing to be aware of is that doing a count(distinct *) (if it were allowed) on a table that has a primary key would be identical to a select count(*).
This is because distinct * includes the PK column, and therefore every row is distinct from every other row.
And as every non-trivial table should have a primary key (there are only very few exceptions to that rule), count(distinct *) can be "replaced" with count(*) anyway.
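A quick way to check the primary-key argument is to compare the counts directly. This is only a sketch under assumptions: sqlite3 stands in for SQL Server, and the id and col1 columns are invented for the example.

```python
import sqlite3

# Hypothetical table with a primary key; col1 holds the same value in every row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_test (id INTEGER PRIMARY KEY, col1 TEXT)")
conn.executemany(
    "INSERT INTO t_test (id, col1) VALUES (?, ?)",
    [(1, "x"), (2, "x"), (3, "x")],
)

all_rows = conn.execute("SELECT COUNT(*) FROM t_test").fetchone()[0]

# DISTINCT * includes the primary key, so no row can be collapsed.
distinct_rows = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT * FROM t_test) AS T1"
).fetchone()[0]

# Leaving the key out of the DISTINCT list is what actually removes duplicates.
distinct_without_pk = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT col1 FROM t_test) AS T2"
).fetchone()[0]

print(all_rows, distinct_rows, distinct_without_pk)  # prints: 3 3 1
```

So when a key is present, the interesting question is usually which non-key columns to list in the inner SELECT DISTINCT.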
As a simple example, let's say you have two columns, A and B.
A B
1 100
2 100
3 100
There are three distinct A values, but only one distinct B value. It would be impossible for COUNT(DISTINCT *) to return a single, meaningful value. That is why that syntax cannot work.
I had the same problem and finally came up with this solution.
Suppose you have something like this:
PID | Name |
---|---|
1 | milk |
1 | cheese |
1 | tea |
2 | butter |
2 | cream |
3 | honey |
and your table is named "food". You would write something like this:
select distinct count(dbo.food.PID) as count,a.PID from dbo.food
inner join (select distinct dbo.food.PID as PID from dbo.food) a
on dbo.food.PID=a.PID where a.PID=dbo.food.PID
group by a.PID,dbo.food.PID
This will show something like this:
count | PID |
---|---|
3 | 1 |
2 | 2 |
1 | 3 |
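For what it's worth, a plain GROUP BY seems to give the same per-PID counts without the self-join. A sketch using the food data from this answer, with sqlite3 standing in for SQL Server (an assumption) and cnt as a made-up column alias:

```python
import sqlite3

# The "food" sample table from the answer above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE food (PID INTEGER, Name TEXT)")
conn.executemany(
    "INSERT INTO food VALUES (?, ?)",
    [(1, "milk"), (1, "cheese"), (1, "tea"),
     (2, "butter"), (2, "cream"), (3, "honey")],
)

# One output row per PID, with the number of rows that share that PID.
rows = conn.execute(
    "SELECT COUNT(PID) AS cnt, PID FROM food GROUP BY PID ORDER BY PID"
).fetchall()

for cnt, pid in rows:
    print(cnt, pid)
```

Note that this counts rows per PID; it is a different question from counting distinct whole rows.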