开发者

Improve SQL query using functions

开发者 https://www.devze.com 2023-04-08 11:21 出处:网络
I am trying to improve query readability by using a function in SQL Server Express 2008. Here is a sample of what I\'m trying to do.

I am trying to improve query readability by using a function in SQL Server Express 2008. Here is a sample of what I'm trying to do.

I have a table where we store a max temperatures reading per hour of the day, then I want to select all days where the max temperature between 8-10AM was greater than the max temp between 12-2PM

So here is how it's like:

DECLARE @TableA TABLE ([Date] DATE, [Time] TIME(0), HighTemp DECIMAL(6,2)); 

INSERT @TableA VALUES 
('2011-09-10','08:00:00',38.15), 
('2011-09-10','09:00:00',38.32), 
('2011-09-10','10:00:00',38.17), 
('2011-09-10','11:00:00',38.10), 
('2011-09-10','12:00:00',38.05), 
('2011-09-10','13:00:00',38.15), 
('2011-09-10','14:00:00',38.30), 

('2011-09-11','08:00:00',38.12), 
('2011-09-11','09:00:00',38.09), 
('2011-09-11','10:00:00',38.07), 
('2011-09-11','11:00:00',38.15), 
('2011-09-11','12:00:00',38.13), 
('2011-09-11','13:开发者_如何学Go00:00',38.11), 
('2011-09-11','14:00:00',38.10), 

('2011-09-12','08:00:00',38.30), 
('2011-09-12','09:00:00',38.33), 
('2011-09-12','10:00:00',38.35), 
('2011-09-12','11:00:00',38.30), 
('2011-09-12','12:00:00',38.25), 
('2011-09-12','13:00:00',38.23), 
('2011-09-12','14:00:00',38.20)

select distinct [DATE] from @TableA maintbl
where 
-- Select the high temp between 08:00:00-10:00:00
(select MAX(HighTemp) from @TableA tmptbl where tmptbl.Time >= '08:00:00' and tmptbl.Time <= '10:00:00' and maintbl.Date = tmptbl.Date)
>
-- Select the high between 12:00:00-14:00:00
(select MAX(HighTemp) from @TableA tmptbl where tmptbl.Time >= '12:00:00' and tmptbl.Time <= '14:00:00' and maintbl.Date = tmptbl.Date)

The query runs well (fast) and the result for the above query should be: 2011-09-10 2011-09-12

Now, I have tried to simplify the query by using a function that retrieves the max tempreture in a certain day and time period, so the query is easier to read, like so:

select distinct [DATE] from @TableA maintbl
where GetPeriodHigh(maintbl.Date, '08:00:00', '10:00:00') > GetPeriodHigh(maintbl.Date, '12:00:00', '14:00:00')

And the function is like so:

CREATE FUNCTION [dbo].[GetPeriodHigh] 
(
    @Date date,
    @From time,
    @To time
)
RETURNS decimal(6,2)
AS
BEGIN

    declare @res decimal(6,2)

    select @res = MAX(high) from MyTable
    where Time >= @from and Time <= @to and Date = @Date

    return @res
END

The problem is that running the query using the function takes LOOONG time, actually I never saw it finishes, looks like it is in some sort of infinite loop...

Any ideas why is that, and there is anything I can do to simplify my query?

Thx.


Scalar functions that do data access generally suck and are best avoided. They don't get expanded out by the optimiser and this basically enforces the function query as being the inner side of a nested loops join regardless of suitability.

Making things worse you may well not have correct indexing to evaluate the Time >= @from and Time <= @to and Date = @Date predicate inside the function which means that for each row in the outer query you are enforcing 2 table scans via the function calls.

This lack of indexes is also the case in your source example and it can be seen with the inline version that the query optimiser is able to effectively rewrite this as two MAX / GROUP BY queries with different WHERE clauses then merge join the results together. When the logic is in scalar UDFs this kind of transformation is not currently considered.

Improve SQL query using functions

Another approach you could try is

SELECT [Date]
FROM @TableA
WHERE Time BETWEEN '08:00:00' AND '10:00:00' 
      OR Time BETWEEN '12:00:00' AND '14:00:00'
GROUP BY [Date]
HAVING MAX(CASE 
               WHEN Time BETWEEN '08:00:00' AND '10:00:00' THEN HighTemp END) > 
       MAX(CASE 
               WHEN Time BETWEEN '12:00:00' AND '14:00:00' THEN HighTemp END)


To improve performance, try rewriting the scalar valued UDF as an inline table valued UDF.

Some links :

http://sqlblog.com/blogs/alexander_kuznetsov/archive/2008/05/23/reuse-your-code-with-cross-apply.aspx

http://sqlblog.com/blogs/alexander_kuznetsov/archive/2008/04/21/not-all-udfs-are-bad-for-performance.aspx

0

精彩评论

暂无评论...
验证码 换一张
取 消