开发者

Different execution plan for similar queries

开发者 https://www.devze.com 2023-01-02 12:59 出处:网络
I am running two very similar update queries but for a reason unknown to me they are using completely different execution plans.Normally this wouldn\'t be a problem but they are both updating exactly

I am running two very similar update queries but for a reason unknown to me they are using completely different execution plans. Normally this wouldn't be a problem but they are both updating exactly the same amount of rows but one is using an execution plan that is far infe开发者_C百科rior to the other, 4 secs vs 2 mins, when scaled up this is causing me a massive problem.

The only difference between the two queries is one is using the column CLI and the other DLI. These columns are exactly the same datatype, and are both indexed exactly the same, but for the DLI query execution plan, the index is not used.

Any help as to why this is happening is much appreciated.

-- Query 1
UPDATE a
 SET DestKey = ( 
 SELECT TOP 1 b.PrefixKey 
 FROM refPrefixDetail AS b 
 WHERE a.DLI LIKE b.Prefix + '%' 
 ORDER BY len(b.Prefix) DESC )
FROM CallData AS a

-- Query 2
UPDATE a
 SET DestKey = ( 
 SELECT TOP 1 b.PrefixKey 
 FROM refPrefixDetail b 
 WHERE a.CLI LIKE b.Prefix + '%' 
 ORDER BY len(b.Prefix) DESC )
FROM CallData AS a


Examine the statistics on these two columns on the table (How the data values for the columns are distributed among all the rows). This will propbably explain the difference... One of these columns may have a distribution of values that could cause the query, in processsing, to need to examine a substantially higher number of rows than would be required by the other query, (The number or rows updated is controlled by the Top 1 part remember) then it is possible that the query optimizer will choose not to use the index... Updating statistics will make them more accurate, but if the distribution of values is such that the optimizer chooses not to use the index, then you may be out of luck...

Understanding how indices work is useful here. An index is a tree-structure of nodes, where each node (starting with a root node) contains information that allows the query processor to determine which branch of the tree to go to next, based on the value it is "searching" for. It is analogous to a binary-Tree except that in databases the trees are not binary, at each level there may be more than 2 branches below each node.

So, for an index, to traverse the index, from the root to the leaf level, requires that the processor read the index once for each level in the index hiearchy. (if the index is 5 levels deep for example, it needs to do 5 I/O operations for each record it searches for.

So in this example, say, if the query need to examine more than approximately 20% of the records in the table, (based on the value distribution of the column you are searching against), then the query optimizer will say to itself, "self, to find 20% of the records, with five I/O s per each record search, is equal to the same number of I/Os as reading the entire table.", so it just ignores the index and does a Table scan.

There's really no way to avoid this except by adding additonal criteria to your query to furthur restrict the number of records the query must examine to generate it's results....


Try updating your statistics. If that does not help try rebuilding your indexes. It is possible that the cardinality of the data in each column is quite different, causing different execution plans to be selected.

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号