I've followed this question with a full example in case it isn't clear what I mean from the question.
I've made a view which joins data from about five tables. The tables have vast amounts of data and the queries are slow to run. My question is that if I do:
SELECT * FROM myView WHERE PersonID = 1000
does SQL Server 'know what I mean' and automatically propagate that condition to the underlying joins in the view? So that it doesn't run for everybody, but minimizes the result set at the right stages. Or will it run for everything then do the WHERE ID = 1000
on the full result set?
AN EXAMPLE
To simplify (...hopefully) what I mean, here's an example pseudo-TSQL scenario:
TABLE People (
ID,
Surname,
DOB
)
TABLE Activities (
ID,
TypeID,
LocationID,
Date
)
TABLE PersonActivityInvolvements (
ID,
PersonID,
ActivityID
)
TABLE ActivityTypes (
ID,
Name
)
TABLE Locations (
ID,
Street,
City
)
So I want a view which shows me all People
, any Activities
they were involved in, the ActivityType
, and the Location
it took place. Although this setup is not drastically complicated, you can see that it might take a very long time to execute if there are say tens of thousands of each entity.
The view might be something like this:
SELECT
*
FROM
People LEFT OUTER JOIN PersonActivityInvolvement PA
ON People.ID = PA.ID
INNER JOIN Activity
ON PA.ID = Activity.ID
INNER JOIN ActivityTypes AT
ON A.TypeID = AT.ID
INNER JOIN Locations
ON A.LocationID = Locations.ID
So if was to do
SELECT * FROM myView WHERE DOB >= dateAdd(YEAR, -18, getDate())
would the query inside 开发者_JAVA技巧the view run for everyone or would SQL Server know that it should apply it to the People.DOB
field?
This is called predicate pushing.
SQL Server is generally good at this though there are some constructs where there have been problems (e.g. see the final part of this article).
Check the execution plan to see where the predicate is applied.
Generally, the optimizer will handle this with aplomb, but as the query and implied subqueries become increasingly complex, the chances that the optimizer will choose the proper execution path diminishes accordingly. This can be aggravated by having more indexes on the tables, or very similar indexes on the tables it will be examining.
As a general rule, I try to discourage joining to views, and I strongly discourage creating views that are comprised of other views.
The engine will do whatever it thinks is fastest. If you have that field indexed, and your JOIN
keys are all indexed, it may or may not run that filter first.
It may actually run the filter LAST if the WHERE
clause is more expensive (i.e. unindexed) - that way the expensive operation is running on the smallest result set.
Ther only way to know for sure is to run the query and check the execution plan (ACTUAL not estimated).
精彩评论