I have a question about how MS SQL evaluates functions inside CTEs. A couple of searches didn't turn up any results related to this issue, but I apologize if this is common knowledge and I'm just behind the curve. It wouldn't be the first time :-)
This query is a simplified (and obviously less dynamic) version of what I'm actually doing, but it does exhibit the problem I'm experiencing. It looks like this:
CREATE TABLE #EmployeePool(EmployeeID int, EmployeeRank int);
INSERT INTO #EmployeePool(EmployeeID, EmployeeRank)
SELECT 42, 1
UNION ALL
SELECT 43, 2;
DECLARE @NumEmployees int;
SELECT @NumEmployees = COUNT(*) FROM #EmployeePool;
WITH RandomizedCustomers AS (
SELECT CAST(c.Criteria AS int) AS CustomerID,开发者_开发技巧
dbo.fnUtil_Random(@NumEmployees) AS RandomRank
FROM dbo.fnUtil_ParseCriteria(@CustomerIDs, 'int') c)
SELECT rc.CustomerID,
ep.EmployeeID
FROM RandomizedCustomers rc
JOIN #EmployeePool ep ON ep.EmployeeRank = rc.RandomRank;
DROP TABLE #EmployeePool;
The following can be assumed about all executions of the above:
The result of
dbo.fnUtil_Random()
is always an int value greater than zero and less than or equal to the argument passed in. Since it's being called above with@NumEmployees
which has the value 2, this function always evaluates to 1 or 2.The result of
dbo.fnUtil_ParseCriteria(@CustomerIDs, 'int')
produces a one-column, one-row table that contains a sql_variant with a base type of 'int' that has the value 219935.
Given the above assumptions, it makes sense (to me, anyway) that the result of the expression above should always produce a two-column table containing one record - CustomerID and an EmployeeID. The CustomerID should always be the int value 219935, and the EmployeeID should be either 42 or 43.
However, this is not always the case. Sometimes I get the expected single record. Other times I get two records (one for each EmployeeID), and still others I get no records. However, if I replace the RandomizedCustomers CTE with a true temp table, the problem vanishes completely.
Every time I think I have an explanation for this behavior, it turns out to not make sense or be impossible, so I literally cannot explain why this would happen. Since the problem does not happen when I replace the CTE with a temp table, I can only assume it has something to do with the functions inside CTEs are evaluated during joins to that CTE. Do any of you have any theories?
SQL Server
's optimizer is free to decide whether to reevaluate a CTE
or not.
For instance, this query:
WITH q AS
(
SELECT NEWID() AS n
)
SELECT *
FROM q
UNION ALL
SELECT *
FROM q
will produce two different NEWID()
's, however, if you use cached XML
plan to wrap the CTE
into an Eager Spool
operation, the records will be same.
精彩评论