开发者

T-SQL: Performance when using an inner query on a join

开发者 https://www.devze.com 2023-02-11 12:52 出处:网络
Here\'s the situation I\'m dealing with. The original query I had was: SELECT c.CustID, a.City, a.Country FROM Customers AS c

Here's the situation I'm dealing with. The original query I had was:

SELECT c.CustID, a.City, a.Country FROM Customers AS c
    LEFT OUTER JOIN Addresses AS a ON c.CustID = a.CustID
    WHERE c.LastName = 'Jones'

So I'd like to show all customers with last name Jones, even ones without any address entries, and show associated addresses for them. But what if I want a WHERE clause on the addresses but still show all customers? For example if I do this:

SELECT c.CustID, a.City, a.Country FROM Customers AS c
    LEFT OUTER JOIN Addresses AS a ON c.CustID = a.CustID
    WHERE c.LastName = 'Jones' AND a.Country = 'United States'

I lose any customers that are not in the United States. But that's not what I want. I want all customers with last name 'Jones', and only omit addresses that are not in the United States. This is the solution I came 开发者_StackOverflow社区up with:

SELECT  c.CustID, a.City, a.Country FROM Customers AS c
    LEFT OUTER JOIN
        (SELECT City, Country FROM Addresses
        WHERE Country = 'United States') AS a
    ON c.CustID = a.CustID
    WHERE c.LastName = 'Jones'

In this case, I still get all customers with the last name Jones, but don't see addresses that are outside the US, which is what I wanted.

Here's my issue: In the third query, I'm assuming that SQL Server fetches all US addresses and then does the join with the Customers table, meaning many non-Jones addresses were fetched unnecessarily. In the second query, I'm wondering if SQL Server only fetches US addresses where LastName = 'Jones' in the first place, which I would think would make the query far faster. So is there a performance increase to the 2nd query over the 3rd? Also, whatever answer you give, if could you also comment on any differences when dealing inner joins instead (if there are any) that'd be great.

Thanks!


You can simply add your additional constraints to your left join (the word outer here is redundant, as all LEFT and RIGHT joins are automatically OUTER, and all unqualified joins are automatically INNER):

SELECT 
    c.CustID, 
    a.City, 
    a.Country 

FROM Customers AS c

LEFT JOIN Addresses AS a ON c.CustID = a.CustID AND a.Country = 'United States'

WHERE c.LastName = 'Jones'

As to your question regarding performance, the only real answer is to run both queries and compare the execution plans. That being said, the purely join-based syntax is substantially simpler, and will likely result in a simpler (and faster) execution plan.


Make the test for country part of the join condition.

SELECT c.CustID, a.City, a.Country 
    FROM Customers AS c
        LEFT OUTER JOIN Addresses AS a 
            ON c.CustID = a.CustID
                AND a.Country = 'United States'
    WHERE c.LastName = 'Jones'


A simpler form of your query would be:

SELECT c.CustID, a.City, a.Country
FROM Customers AS c     
LEFT OUTER JOIN Addresses AS a 
    ON c.CustID = a.CustID  AND a.Country = 'United States'     
WHERE c.LastName = 'Jones' 

To see if it would be a better query than yours, I would suggest looking at the Execution plans.

The reason your query 2 does not work is by putting the condition in the where clause you have effectively made it an inner join.


SELECT c.CustID, a.City, a.Country
  FROM Customers AS c LEFT OUTER JOIN Addresses AS a ON c.CustID = a.CustID
 WHERE c.LastName = 'Jones'
   AND (a.Country = 'United States' OR a.Country IS NULL)


While you're looking into performance, you may want to try out the APPLY operator:

SELECT c.CustID, a.City, a.Country  
FROM Customers AS c 
    OUTER APPLY 
    (
        SELECT City, Country
        FROM Addresses  
        WHERE c.CustID = a.CustID 
            AND a.Country = 'United States' 
    ) AS a
WHERE c.LastName = 'Jones'

Sometimes, when the table on the right is large, and the optimizer thinks more than 5% or so of the records will be returned, you get a table scan and hash match with an outer join. However, an index seek and nested loop would be faster. I've used the APPLY operator in a number of these situations to dramatically improve performance.

I don't know how this will turn out in your case, but it might be worth comparing execution times and plans to see which is best.

0

精彩评论

暂无评论...
验证码 换一张
取 消