I'm exploring using an ORM tool in our development shop, and in particular Entity Framework 4.0. Since we work with VERY large databases, I'm a bit concerned about the query's it generates.
Doing something simple like getting clients with an address in a state looks like below.
As a database developer or admin would you allow this? Is it as bad as it looks? Assume every join is on a clustered index.
SELECT
[Project2].[ClientKey] AS [ClientKey],
[Project2].[FirstName] AS [FirstName],
[Project2].[LastName] AS [LastName],
[Project2].[IsEnabled] AS [IsEnabled],
[Project2].[ChangeUser] AS [ChangeUser],
[Project2].[ChangeDate] AS [ChangeDate],
[Project2].[C1] AS [C1],
[Project2].[AddressKey] AS [AddressKey],
[Project2].[ClientKey1] AS [ClientKey1],
[Project2].[AddressTypeCode] AS [AddressTypeCode],
[Project2].[PrimaryAddress] AS [PrimaryAddress],
[Project2].[AddressLine1] AS [AddressLine1],
[Project2].[AddressLine2] AS [AddressLine2],
[Project2].[City] AS [City],
[Project2].[State] AS [State],
[Project2].[ZIP] AS [ZIP]
FROM ( SELECT
[Distinct1].[ClientKey] AS [ClientKey],
[Distinct1].[FirstName] AS [FirstName],
[Distinct1].[LastName] AS [LastName],
[Distinct1].[IsEnabled] AS [IsEnabled],
[Distinct1].[ChangeUser] AS [ChangeUser],
[Distinct1].[ChangeDate] AS [ChangeDate],
开发者_C百科 [Extent3].[AddressKey] AS [AddressKey],
[Extent3].[ClientKey] AS [ClientKey1],
[Extent3].[AddressTypeCode] AS [AddressTypeCode],
[Extent3].[PrimaryAddress] AS [PrimaryAddress],
[Extent3].[AddressLine1] AS [AddressLine1],
[Extent3].[AddressLine2] AS [AddressLine2],
[Extent3].[City] AS [City],
[Extent3].[State] AS [State],
[Extent3].[ZIP] AS [ZIP],
CASE WHEN ([Extent3].[AddressKey] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C1]
FROM (SELECT DISTINCT
[Extent1].[ClientKey] AS [ClientKey],
[Extent1].[FirstName] AS [FirstName],
[Extent1].[LastName] AS [LastName],
[Extent1].[IsEnabled] AS [IsEnabled],
[Extent1].[ChangeUser] AS [ChangeUser],
[Extent1].[ChangeDate] AS [ChangeDate]
FROM [Common].[Clients] AS [Extent1]
INNER JOIN [Common].[ClientAddresses] AS [Extent2] ON [Extent1].[ClientKey] = [Extent2].[ClientKey]
WHERE (( CAST(CHARINDEX(UPPER('D'), UPPER([Extent1].[LastName])) AS int)) > 0) AND ([Extent1].[IsEnabled] = 1) AND ([Extent2].[City] IS NOT NULL) AND ((UPPER([Extent2].[City])) = (UPPER('Colorado Springs'))) ) AS [Distinct1]
LEFT OUTER JOIN [Common].[ClientAddresses] AS [Extent3] ON [Distinct1].[ClientKey] = [Extent3].[ClientKey]
) AS [Project2]
ORDER BY [Project2].[ClientKey] ASC, [Project2].[FirstName] ASC, [Project2].[LastName] ASC, [Project2].[IsEnabled] ASC, [Project2].[ChangeUser] ASC, [Project2].[ChangeDate] ASC, [Project2].[C1] ASC
inner join addresses to clients ...
INNER JOIN [Common].[ClientAddresses] AS [Extent2]
ON [Extent1].[ClientKey] = [Extent2].[ClientKey]
with condition on city ...
AND ((UPPER([Extent2].[City])) = (UPPER('Colorado Springs')))
now left join addresses again?
LEFT OUTER JOIN [Common].[ClientAddresses] AS [Extent3]
ON [Distinct1].[ClientKey] = [Extent3].[ClientKey]
The left join is redundant since a row must already exist to match the condition.
Also it is selecting distinct rows in the inner query which is probably unnecessary if the tables are normalized
I would not trust the ORM to generate optimal queries.
There appears to be an unnecessary derived table, which almost doubles the length of the query. It is unlikely to have a significant effect on the run time performance, so it's not worth worrying about.
More troubling from a performance perspective is the use of functions in the WHERE clause that will prevent the use of indexes.
WHERE ...
AND ((UPPER([Extent2].[City])) = (UPPER('Colorado Springs')))
It should just be this with an appropriate case-insensitive collation:
WHERE ...
AND [Extent2].[City] = 'Colorado Springs'
But it may not be a problem in practice.
精彩评论