Inline table valued UDF performance_问答_开发者

I’m using SQL Server 2008R2. I’ve written the following table-valued UDF that accepts either scalar values this or that or both as parameters and returns a table with columns ID, this, and that. I’m seeing terrible performance when I call this function from inside a complex query, but not when I call it in simple queries. I’d like to know if anyone has any ideas about what I’m doing that’s slowing things down. Function definition follows:

CREATE function dbo.fn_getThisThat (@this nvarchar(255), @that nvarchar(255))
RETURNS TABLE
RETURN

SELECT These.this, Those.that, COALESCE(These.ID, Those.ID) as ID
FROM 
    (
    SELECT col1 as ‘开发者_JAVA百科this’, value1, value2, ID
    FROM (
        SELECT t1.col1, t1.col2, t1.col3, t2.col1
        FROM t1
        JOIN t2
            ON t1.col1 = t2.col1
        WHERE t2.col2 = ‘this’
        AND t1.col1 in (‘value1’, ‘value2’)
        ) SOURCE
    PIVOT (
        MAX(t1.col3) FOR t1.col1 in (value1, value2)
        ) AS pvt
    ) These
JOIN
    (
    SELECT t1.col1, t1.col2, t2.col1, t3.ID
    FROM t3
    JOIN t1
        ON t3.col1 = t1.col1
    JOIN t2
        ON t2.col1 = t1.col1
    WHERE t3.col3 = ‘value3’
    AND t1.col3 = ‘value1’
    AND t2.col3 = ‘value2’
    ) Those
WHERE that = @that
OR this = @this

The following statement process very quickly (< 1 sec) when passing scalar parameters:

SELECT * FROM dbo.fn_getThisThat(scalarValue, null)

Or in a relatively simple query like:

SELECT t1.col1, t1.col2, fn.This
FROM t1
CROSS APPLY dbo.fn_getThisThat(t1.col3, null)

…but it lags HORRIBLY (from processing time of ~1second to ~2:30 seconds) when called in a more complex query like this (in pseudo code: let me know if it’s not enough info):

DECLARE @table (a, b, c)
INSERT @table (a, b, c)
SELECT (values)

SELECT t1.c1, t1.c2, t1.c3
FROM
    (
    SELECT a.c1, COUNT(fn.That) as c2, COUNT(a.c2) as c3
    FROM a
    JOIN b ON (join terms)
    CROSS APPLY dbo.fn_getThisThat(a.c2, null) fn
    WHERE a.c1 IN (SELECT a FROM @table)
    GROUP BY a.c1
    ) t1

Does anyone have any suggestions as to what I’m doing to kill the speed in this second query? I changed the function to accept an array rather than scalar parameters, but that eliminated my ability to cross apply (in the last code snip). The performance hit, as far as I can tell from the query analyzer is from the cross apply of my function. I thought I wouldn’t run into RBAR since my UDF wasn’t multi-statement, but maybe I’m horribly wrong…?

EDIT: One more thing: The query execution plan shows that the function itself contributes only 2% to the batch; the larger query contributes 98%, but most of its cost is from Index Seeks and Table Scans, not from Parallelism. This made me think that perhaps the function call is not so much to blame for the sluggishness of the query, but rather the lack of indices on some of the tables involved (unfortunately I don't have a whole lot of control over the addition of indices.). I ran the query without the call to the function, and the table scans and index seeks still show high, but the query completes in around 8 seconds. So, we're back to the function...?

You may want to change your UDF to properly use two-part table names everywhere so you can add a SCHEMABINDING clause to it. See Improving query plans with the SCHEMABINDING option on T-SQL UDFs.

From the MSDN article for Apply (MSDN - Apply):

"The APPLY operator allows you to invoke a table-valued function for each row returned by an outer table expression of a query."

Your example shows a group by. Is it possible to call your function after the rows are grouped rather than in that particular query? That would cut down on the amount of rows the function has to be invoked for.

Failing that, my other suggestion would be to squeeze out as much performance gain as possible in the function itself by optimizing the query there. Every millisecond faster you can make it will add up.

As already pointed out, CROSS APPLY is invoked for every row in the outer query. So, key issue here is how many rows are returned from:

DECLARE @table (a, b, c)
INSERT @table (a, b, c)
SELECT (values)

SELECT t1.c1, t1.c2, t1.c3
FROM
    (
    SELECT a.c1
    FROM a
    JOIN b ON (join terms)
    WHERE a.c1 IN (SELECT a FROM @table)
    ) t1

This is the number of calls that will be made to your TVF. If (and it's a big if) the TVF has similar execution time for any value of a.c2, then a relevant performance comparison is the singular execution time of your function * the rows returned from the query above.

Difficult to be certain, due to the obfuscation/generalisation of your original query, but I suspect your TVF could be eliminated and the logic in-lined to the parent query. Thats likely to get you the best performance if it's feasible.

I think your best bet is to run this in SSMS and check out your execution plan. Since this is an inline table-valued UDF, the optimizer will incorporate it into the execution plan and you should be able to see where things are going off the rails.

I don't have much experience using PIVOT subqueries in CROSS APPLY situations -- that strikes me as something that might be an issue. But the execution plan will tell you for sure.

So far I've been able to increase performance from ~2:30 to ~0:17. It's better, but still not ideal. I did the following:

Added schemabinding to my tv udf (thank you, Remus!). This helped, but seemed to have less of an effect on performance than did the following.
Restructured main query to join on @table rather than referencing it in a subquery: this seemed to help the most and was where most of the performance gains seemed to come from.

I think my remaining lag is due to some missing indices on large tables I'm hitting heavily, but without the ability to add them I'm not sure what I can do atm. I have reduced the cost from parallelism to 0% as reported by the query analyzer, so I think I've done all I can with regard to the function call.

Thanks, everyone!