开发者

Any idea why contains(...) querys so slow in SQL Server 2005

开发者 https://www.devze.com 2022-12-13 12:02 出处:网络
I\'ve got a simple select query which executes in under 1 second normally, but when I add in a contains(column, \'text\') into the where clause开发者_开发问答, suddenly it\'s running for 20 seconds up

I've got a simple select query which executes in under 1 second normally, but when I add in a contains(column, 'text') into the where clause开发者_开发问答, suddenly it's running for 20 seconds up to a minute. The table it's selecting from has around 208k rows.

Any ideas what would cause this query to run so slow with just the addition of the contains clause?


Substring matching is a computationally expensive operation. Is the field indexed? If this is a major feature implementation, consider a search-caching table so you can simply lookup where the words exist.


Depending on the search keyword and the median length of characters in the column it is logical that it would take a long time.

Consider searching for 'cookie' in a column with median length 100 characters in a dataset of 200k rows.

Best case scenario with early outs, you would do 100 * 200k = 20m comparisons Worst case scenario near missing on every compare, you would do (5 * 100) * 200k = 100m comparisons

Generally I would:

  • reorder your query to filter out as much as possible in advance prior to string matching
  • limit number of the results if you don't need all of them at once (TOP x)
  • reduce the number characters in your search term
  • reduce the number of search terms by filtering out terms that are likely to match a lot, or not at all (if applicable)
  • cache query results if possible (however cache invalidation can get pretty tricky if you want to do it right)


Try this:

SELECT *
FROM table
WHERE CONTAINS((column1, column2, column3), '"*keyword*"')  

Instead of this:

SELECT *
FROM table
WHERE CONTAINS(column1, '"*keyword*"') 
OR CONTAINS(column2, '"*keyword*"') 
OR CONTAINS(column3y, '"*keyword*"') 

The first one is a lot faster.


CONTAINS does a lot of extra work. There's a few things to note here:

  • NVarChar is always faster, so do CONTAINS(column, N'text')
  • If all you want to do is see if the word is in there, compare the performance to column LIKE '%' + text + '%'.
  • Compare query plans before and after, did it go to a table scan? If so, post more so we can figure out why.
  • In ultimo, you can break up the text's individual words into a separate table so they can be indexed.
0

精彩评论

暂无评论...
验证码 换一张
取 消