I have created a stored procedure that is used for monitoring purposes on a website.
On its first run the procedure takes over a minute to execute, but if it is run again shortly afterwards it takes only a few seconds. The problem is that the script is scheduled to run at ten-minute intervals, and every one of those runs takes over a minute, which is too long.
Is anyone aware of how we can improve the performance of this query? I know there's a reason it runs slowly the first time and then quickly on subsequent runs, but I have been unable to find an answer.
Here's the code, thanks in advance :)
SET NOCOUNT ON
SET DATEFORMAT ymd

DECLARE @start datetime
DECLARE @end datetime
SET @start = DATEADD(dd, -1, GETDATE())   -- 24 hours ago
SET @end = GETDATE()

DECLARE @errorToday int
DECLARE @unconfirmedToday int

-- Leads created in the last 24 hours with no destination confirmation
SET @unconfirmedToday =
(
    SELECT COUNT([DateCreated])
    FROM GenericLeadLogs WITH(NOLOCK)
    WHERE DestinationConfirmation IS NULL
    AND [DateCreated] BETWEEN @start AND @end
)

-- Leads created in the last 24 hours logged with severity 'Error'
SET @errorToday =
(
    SELECT COUNT([DateCreated])
    FROM GenericLeadLogs WITH(NOLOCK)
    WHERE Severity = 'Error'
    AND [DateCreated] BETWEEN @start AND @end
)

CREATE TABLE #GenericLeadStats
(
    UnconfirmedToday int NULL,
    ErrorToday int NULL
)

INSERT INTO #GenericLeadStats (UnconfirmedToday, ErrorToday)
VALUES (@unconfirmedToday, @errorToday)

SELECT * FROM #GenericLeadStats

DROP TABLE #GenericLeadStats
I re-wrote the stored procedure as:
SET NOCOUNT ON

-- Single pass over the last 24 hours of data, computing both counts at once
SELECT SUM(CASE WHEN DestinationConfirmation IS NULL THEN 1 ELSE 0 END) AS unconfirmedToday,
       SUM(CASE WHEN Severity = 'Error' THEN 1 ELSE 0 END) AS errorToday
INTO #GenericLeadStats
FROM GenericLeadLogs WITH(NOLOCK)
WHERE [DateCreated] BETWEEN DATEADD(dd, -1, GETDATE()) AND GETDATE()

SELECT * FROM #GenericLeadStats

DROP TABLE #GenericLeadStats
In SQL Server, SELECT ... INTO creates a table that must not already exist. I've left the temp table in, but based on what's provided it serves no purpose.
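If nothing else reads the temp table, a minimal sketch of the same logic without it would be:

-- Sketch only: returns the two counts directly as a single result set,
-- assuming the caller just reads that result set and nothing else uses #GenericLeadStats.
SET NOCOUNT ON

SELECT SUM(CASE WHEN DestinationConfirmation IS NULL THEN 1 ELSE 0 END) AS unconfirmedToday,
       SUM(CASE WHEN Severity = 'Error' THEN 1 ELSE 0 END) AS errorToday
FROM GenericLeadLogs WITH(NOLOCK)
WHERE [DateCreated] BETWEEN DATEADD(dd, -1, GETDATE()) AND GETDATE()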
In general, databases are stored on disk, but on all modern operating systems files that have been read are cached in memory. The reason the query runs slowly the first time is that the data has to be read from disk. After that first run, much (if not all) of the data needed to re-execute the query is cached and does not need to be read from disk again (which is the slow part). After some period of time, if the machine is used for other purposes or other queries are run, that data is evicted from the cache to make room for new data, so the query becomes slow again as soon as the data has to be reread from disk.
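If you want a rough idea of how much of GenericLeadLogs is currently sitting in the buffer cache, something along these lines should work (this is only an illustration of the caching effect, and it requires VIEW SERVER STATE permission):

-- Approximate count of this table's 8 KB pages currently held in the buffer pool
SELECT COUNT(*) AS cached_pages
FROM sys.dm_os_buffer_descriptors AS bd
JOIN sys.allocation_units AS au ON bd.allocation_unit_id = au.allocation_unit_id
JOIN sys.partitions AS p ON au.container_id = p.hobt_id AND au.type IN (1, 3)
WHERE bd.database_id = DB_ID()
  AND p.object_id = OBJECT_ID('GenericLeadLogs')

Run it cold, then immediately after the monitoring query, and you should see the number jump, which is why the second execution is fast.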
You should make sure there is an index on all of the columns in the WHERE clause above. I'm assuming that is done? If so, then check the query plan to see whether the indexes are actually being used. Finally, if the table is quite large, you could consider partitioning it, if that makes sense in your data model.
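As a sketch of the kind of index that could help here (the column names come from the query above; the index name is just a suggestion), an index on DateCreated that covers the other two columns lets both counts be answered from the index alone rather than from the base table:

-- Range seek on DateCreated, with Severity and DestinationConfirmation
-- included so the query never has to touch the clustered index
CREATE NONCLUSTERED INDEX IX_GenericLeadLogs_DateCreated
ON GenericLeadLogs ([DateCreated])
INCLUDE (Severity, DestinationConfirmation)

A smaller index also means fewer pages to pull from disk on a cold run, which is the slow part in the first place.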
Looking at your query I can't see anything glaringly wrong that would cause massive performance issues - it seems likely that the query is quicker the second time around simply because the data it needs is still cached, in which case there isn't much you can do.
You should obtain an execution plan, and possibly run your query through the query optimiser, to see whether you are missing any indexes - adding any that turn up may improve the performance of your query.
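For example, you can measure the I/O the query actually does, and ask SQL Server's missing-index DMVs what the optimiser would have liked to use (treat these as hints to investigate, not as indexes to create blindly):

-- Show logical/physical reads per table for the next statement you run
SET STATISTICS IO ON
-- ... run the monitoring SELECT here, then read the Messages tab in SSMS ...

-- Missing-index suggestions accumulated by the optimiser for this database
SELECT d.statement, d.equality_columns, d.inequality_columns, d.included_columns,
       s.user_seeks, s.avg_user_impact
FROM sys.dm_db_missing_index_details AS d
JOIN sys.dm_db_missing_index_groups AS g ON d.index_handle = g.index_handle
JOIN sys.dm_db_missing_index_group_stats AS s ON g.index_group_handle = s.group_handle
WHERE d.database_id = DB_ID()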