Recently, I've been asked to write a query to select properties of entities from a group that contains the maximum number of such entities. So, I did it on Northwind (MSFT distributed sample) database in a couple of ways.
ONE:
SELECT cat.CategoryName, prod.ProductName
FROM Categories cat
JOIN Products prod ON cat.CategoryID = prod.CategoryID
JOIN (SELECT TOP 1 p.CategoryID, COUNT(p.ProductId) as products
FROM Categories c
JOIN Products p on c.CategoryID = p.CategoryID
GROUP BY p.CategoryID
ORDER BY products desc) c ON c.CategoryID = cat.CategoryID
TWO:
SELECT cat.CategoryName, prod.ProductName
FROM Categories cat
JOIN Products prod ON cat.CategoryID = prod.CategoryID
JOIN (SELECT CategoryID, COUNT(ProductID) m_count
FROM Products
GROUP BY CategoryID
HAVING COUNT(ProductID) = (SELECT MAX(sub.cnt)
开发者_如何学Python FROM (SELECT CategoryId, COUNT(ProductID) cnt
FROM Products
GROUP BY CategoryId) sub)) m ON m.CategoryID = cat.CategoryID
The question is: which is faster any why? In execution plans nothing stands out, in particular. The elapsed time varies slightly, but is the same, approximately. The database is, of course, tiny.
A tiny database makes it difficult to determine which is better, but SQL Server Management Studio has functionality to compare the efficiency of statements to one another.
- Open Management Studio
- Click the "New Query" button
- Click to enable the "Include Actual Query Plan"
- Post all the queries into the active query window
- Click the "Execute" button
- Click the "Execution plan" tab (to the left of the results) when it appears
The query cost is averaged by the number of queries run. So if comparing the two queries provided as examples, if both have a cost of 50% then they are equivalent (because 100 / 2 = 50, etc). When there's a difference, you can mouseover the SELECT to review the subtree cost, besides looking at the graphical layout of the Execution path.
Why is one faster than the other?
That depends on the database -- the data types being joined (are they as narrow as they could be? "narrow" means taking less bytes to store), indexes, and what is being performed in the query. Using different syntax can make all the difference.
Execute both SQL statements with SET SHOWPLAN_ALL ON
activated. The SQL server will give you query information, which should help your understanding what's going on and which one is/will be faster.
精彩评论