How can I compare two queries X and Y and say that X is better than Y, when they both take almost the same time in small test scenarios?
The problem is that I have two queries that are supposed to run on a very big database, so running them there to evaluate them is not really an option. Therefore, we created a small database to perform some tests. Deciding which query is better is a problem, since on our test database they run in almost the same time (about 5 minutes). Besides the time taken to return, what is another way to measure how good a query is?
SET STATISTICS IO ON
SET STATISTICS TIME ON
Run the queries and compare logical reads for the various tables and execution times.
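A minimal sketch of that comparison follows. The table dbo.Orders, its columns, and both queries are hypothetical placeholders standing in for X and Y; substitute the real queries.

```sql
-- Report per-table reads and CPU/elapsed time in the Messages output.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
GO

-- Query X (hypothetical placeholder)
SELECT o.CustomerId, COUNT(*) AS OrderCount
FROM dbo.Orders AS o
WHERE o.OrderDate >= '20200101'
GROUP BY o.CustomerId;
GO

-- Query Y (hypothetical placeholder that returns the same rows)
SELECT DISTINCT
       o.CustomerId,
       COUNT(*) OVER (PARTITION BY o.CustomerId) AS OrderCount
FROM dbo.Orders AS o
WHERE o.OrderDate >= '20200101';
GO

-- Switch the reporting off again for this session.
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
GO
```

For each query, the Messages output lists logical reads per table plus CPU time and elapsed time. Logical reads are counted whether or not the pages were already in memory, so they tend to be a steadier basis for comparison than elapsed time alone.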
As already mentioned, check the execution plans.
Importantly, compare the two queries fairly by clearing the caches between runs, so that the results are not skewed by data or plans that are already cached (do not run these commands on a production server):
CHECKPOINT; -- write dirty pages to disk so they can be dropped from the cache too
DBCC DROPCLEANBUFFERS; -- clear data cache
DBCC FREEPROCCACHE; -- clear proc plan cache
Then what I usually do is check the Reads, Writes, CPU and Duration for a comparison.
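The answer does not say which tool is used to read these numbers. One option, sketched here as an assumption, is the sys.dm_exec_query_stats view, which accumulates reads, writes, CPU and duration per cached statement. The dbo.Orders filter is a hypothetical way to pick out the two test queries, and reading the view requires VIEW SERVER STATE permission.

```sql
-- Reads, writes, CPU and duration for recently executed statements.
-- Times in sys.dm_exec_query_stats are in microseconds, hence / 1000.
SELECT TOP (20)
       st.text                       AS query_text,
       qs.execution_count,
       qs.total_logical_reads,
       qs.total_logical_writes,
       qs.total_worker_time  / 1000  AS total_cpu_ms,
       qs.total_elapsed_time / 1000  AS total_duration_ms
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE st.text LIKE N'%dbo.Orders%'                 -- hypothetical filter
  AND st.text NOT LIKE N'%dm_exec_query_stats%'    -- skip this query itself
ORDER BY qs.last_execution_time DESC;
```

These figures are kept only while the plan stays in the plan cache, so read them after running the queries and before clearing the cache again.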
It's very important that you test with production-level data volumes (and ideally greater to see how it will scale). It's at those volumes that you'll really see any performance difference. Testing with small data volumes could leave you open to problems later on.
Have you examined the query plans? If the queries return the same data and take the same amount of time to execute, my guess is that the query plans will be nearly identical, meaning that there is no meaningful difference between the two queries.
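One way to get the plans side by side without executing anything is SET SHOWPLAN_XML, which returns the estimated plan for each statement. This is a sketch only; dbo.Orders and both queries are hypothetical placeholders.

```sql
-- SET SHOWPLAN_XML must be the only statement in its batch.
SET SHOWPLAN_XML ON;
GO

-- Query X (hypothetical placeholder): compiled, not executed
SELECT o.CustomerId, COUNT(*) AS OrderCount
FROM dbo.Orders AS o
GROUP BY o.CustomerId;
GO

-- Query Y (hypothetical placeholder): compiled, not executed
SELECT DISTINCT
       o.CustomerId,
       COUNT(*) OVER (PARTITION BY o.CustomerId) AS OrderCount
FROM dbo.Orders AS o;
GO

SET SHOWPLAN_XML OFF;
GO
```

Each batch returns an XML plan. If the operators and estimated costs of the two plans match, the queries are probably equivalent for that data set. If they differ, the differing operators (scans versus seeks, sorts, join types) show where to look.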
Also, have you taken into account that queries perform differently as the database size changes?
I'm wondering if you are prematurely optimizing the code. In my mind, if I have a query that works and is understandable, I can address performance issues through indexes. And that is usually easier than changing the queries to improve performance.
Evaluating query performance on a significantly different data set generally makes very little sense. Query plans and their efficiency can vary greatly depending on the data stats.
So to get any realistic estimates, you need a database as close to the "real" one as possible. Best of all, take a copy of your "big database" and tune your queries to it.
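Taking such a copy could look like the sketch below, assuming there is disk space for a second copy. The database names, logical file names and paths are all hypothetical; RESTORE FILELISTONLY shows the real logical names to use in the MOVE clauses.

```sql
-- COPY_ONLY leaves the regular backup chain of the source untouched.
BACKUP DATABASE BigDb
TO DISK = N'D:\Backups\BigDb_copy.bak'
WITH COPY_ONLY, INIT;
GO

-- List the logical file names stored in the backup.
RESTORE FILELISTONLY
FROM DISK = N'D:\Backups\BigDb_copy.bak';
GO

-- Restore under a new name on the test server (hypothetical names and paths).
RESTORE DATABASE BigDb_Test
FROM DISK = N'D:\Backups\BigDb_copy.bak'
WITH MOVE N'BigDb'     TO N'D:\Data\BigDb_Test.mdf',
     MOVE N'BigDb_log' TO N'D:\Data\BigDb_Test_log.ldf',
     STATS = 10;
GO
```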