I am using SQL 2000, and I am running a simple select statement on a table containing about 30 million rows. The select query looks like:
select col1, col2, col3 from Table1 where col4=@col4 and col5=@col5 and col6=@col6
The table has a clustered index in it (i.e. a primary key), but that is not being used as a where criteria. All the where criterias mentioned above have no indexed in them.
How can I optimize this query?
If I add indexes for each column in the where clause, would that make any difference?
If I have 10 columns in where clause, should all of those 10 columns have index in them?
Edit: This is probably one of the most common interview question :)
Yes, it will make a huge difference.
Instead of adding one index for each field, you should add one index that has the three fields. (How this is used in practice of course depends on how unique the fields are and what other queries you are going to use on the table.)
Note that adding an index also has a small negative impact when you insert or delete records into the table or update the indexed fields of a record.
How can I optimize this query?
You can make a covering index:
CREATE INDEX ix_table1_456__123 ON table1 (col4, col5, col6) INCLUDE (col1, col2, col3)
and the query will not even have to do a table lookup.
If I add indexes for each column in the where clause, would that make any difference?
This will most probably improve the query compared to not having indexes at all, but creating the composite index with covering will most probably be better.
However, if each of your columns has high cardinality (that is it is UNIQUE
or close to it), then creating individual indexes may even improve the query compared to the composite index.
This is especially true if some of the columns is large in size (like a VARCHAR(400)
) and another, small column has high cardinality.
If I have 10 columns in where clause, should all of those 10 columns have index in them?
If you have 10
columns, there's, as I said above, a tradeoff between the increased key size (which degrades performance) and increased selectivity.
If, say, the first 3
columns are unique or almost unique, then adding the additional columns won't increase selectivity but will increase the key size.
The index will get larger in size which will require extra time to search in it.
You should not create the index on all 10
columns if 3
columns offer selectivity which is high enough, since traversing a larger index will be more expensive than reading some extra keys.
You may want to read this article in my blog:
- Creating indexes
What column from criteria is most selective? Creating index on that column would affect performance most. Should you add another columns to same index or not, depends on selectivity. You need to examine query plans to find that out :)
If I add indexes for each column in the where clause, would that make any difference?
Yes, adding an index will make a huge performance difference. This comes at the cost of significant disk space usage by the indexes, and a very small impact on INSERT and UPDATE commands.
If I have 10 columns in where clause, should all of those 10 columns have index in them?
That is not always the case. If we take the SQL query you provided as an example, and you create an index on col4 only, it might be that select * from Table1 where col4=@col4
returns only a few records. In that case, you will not gain much from having an index on col5 and col6, because the database engine would have to scan only those few records returned where col4=@col4
.
Therefore as you can see, this depends a lot on the type of data that you are storing. In addition, an index may also not be much useful on any column that has low cardinality: i.e. columns with few unique values.
精彩评论