开发者

How and why combination of JOIN and subquery affects GROUP BY behavior in MySQL query?

开发者 https://www.devze.com 2023-01-04 17:21 出处:网络
I have 3 sql tables: Data36 (Data_ID:int <PK>, type:int), Data38(Data_ID:int <PK>, clientId:int),

I have 3 sql tables:

Data36 (Data_ID:int <PK>, type:int), 
Data38(Data_ID:int <PK>, clientId:int), 
Data47(Data_ID:int <PK>, payerID:int).

I thought the following queries are identical, because I don't use aggregate functions here and GROUP BY should behave the same way as DISTINCT. But they return very different result sets and I don't understand why. Please help me to understand defference between these queries.

Query 1 (returns 153 rows):

SELECT payer.Data_ID, payer.type
 FROM Data36 AS payer
 JOIN Data38 AS serv ON payer.Data_ID = serv.clientId
 WHERE ((SELECT count(*) FROM Data47 AS regsites WHERE regsites.payerID = payer.Data_ID) = 0)
 GROUP BY payer.Data_ID, payer.type

Query 2 (returns 4744 rows):

SELECT DISTINCT payer.Data_ID, payer.type
 FROM Data36 AS payer
 JOIN Data38 AS serv ON payer.Data_ID = serv.clientId
 WHERE ((SELECT count(*) FROM Data47 AS regsites WHERE regsites.payerID = payer.Data_ID) = 0)

SQL Server version is 5.0.40.

Let me know if you need more specific information.

Update: Sorry for not mentioning this: Data_ID is a Primary Key in these tables, so Data_ID is unique for each record in these tables.

SELECT count( *  ) FROM Data36 
--returns 5998
SELECT count(DISTINCT Data_ID) FROM Data36 
--returns 5998

Update 2: In Query 1 I changed '开发者_StackOverflowGROUP BY payer.Data_ID' to 'GROUP BY payer.Data_ID, payer.type'. The result is still the same - 153 rows.


Whan happens when you try this?

select payer.Data_ID, payer.type from
(
SELECT DISTINCT Data_ID, type 
 FROM Data36
) AS payer 
 JOIN Data38 AS serv ON payer.Data_ID = serv.clientId 
 WHERE ((SELECT count(*) FROM Data47 AS regsites WHERE regsites.payerID = payer.Data_ID) = 0) 


Correct and most efficient way to found records that hasn't coresponding records in the third table is:

SELECT payer.Data_ID, payer.type
FROM Data36 AS payer
JOIN Data38 AS serv ON payer.Data_ID = serv.clientId
LEFT JOIN Data47 AS regsites  ON(regsites.payerID = payer.Data_ID)
WHERE regsites.payerID IS NULL
GROUP BY payer.Data_ID


Looking at your query there appear to be only one functional difference...

Query1 (153 records):

SELECT payer.Data_ID, payer.type FROM <blah> GROUP BY payer.Data_ID

Query2 (4744 records):

SELECT DISTINCT payer.Data_ID, payer.type FROM <blah>


As Query1 only groups by Data_ID I would say that you have 153 Data_IDs but that each Data_ID can have many different type values.

Query2 is returning every different Data_ID/type combination, but Query1 is only returning 1 record for each Data_ID.


I'm also surprised that Query1 runs at all, as I would expect you would be Required to have an aggregate function such as MIN/MAX around the type field.

0

精彩评论

暂无评论...
验证码 换一张
取 消