Indexing foreign key columns_问答_开发者_运维开发者技术经验分享

Is it a rule of thumb to put an index on all foreig开发者_运维问答n key columns? It seems they would often be used in table joins and would beneifit from the index.

What if the FK column only has 2 or 3 possible values? For example if it was referencing a status table. Would it still be advisable to put an index on the FK field?

I would strongly disagree with Eelke - most SQL Server gurus will recommend to put a non-clustered index on foreign key columns, and I agree.

For the best summary of reasons and the best argumentation for those indices, see Kimberly Tripp's excellent blog post When did SQL Server stop putting indexes on foreign keys? - the answer is: never - it never did (that is one of many persistent urban myths - but it's still a myth).

Her core message is:

Are there any benefits to indexing foreign key columns? YES

Better performance on maintaining the relationship on a delete of a primary/unique key. When you delete a key row, SQL Server must check to see if there are any rows which reference the row being deleted.
o If the foreign key relationship is defined with NO ACTION (on update/delete) then a referenced row CANNOT be deleted as it would leave the referencing rows “orphaned.” To find the rows efficiently an index on the foreign key column helps!
o If the foreign key relationship is defined with CASCADE (on update/delete) then when a referenced row is modified all of the referencing rows must be modified as well (either updated to reflect the new value or on cascade delete). To find the rows to modify efficiently, an index on the foreign key column helps!

Better join performance - for many of the reasons above, SQL Server can more effectively find the rows to join to when tables are joined on primary/foreign key relationships. However, this is NOT always the “best” indexing choice for joins but it is a good start.

(Two issues come together here: machine efficiency and DBA efficiency.)

As a rule of thumb, query performance benefits from indexes on columns used in JOIN clauses and in WHERE clauses. And, of course, insert, update, and delete performance suffers from those same indexes. (Because the index has to be updated along with the table.)

If a column has few values, and there's a compelling reason to avoid storing the actual value in the table, I prefer a human-readable code to be the target of a foreign key reference. For example, ISO 5218 specifies these codes for representing human sexes.

Sex_id  Sex
--
0       Not known
1       Male
2       Female
9       Not applicable

I don't expect people to remember that 9 means not applicable. If this were all I had, I'd have to join this table every time I needed to produce output for people to read.

But I can add a column to that table, declare it unique, and use it as the target for foreign key references. I won't have to join this table most of the time. Heck, I might never have to join it. Avoiding a join has to do with machine efficiency.

sex_id  sex_code   sex
--
0       Un         Not known
1       M          Male
2       F          Female
9       NA         Not applicable

All three of those columns should be declared unique. And on most platforms, that means each of those three columns will get an index. I'd index the columns that reference sex_code, too, even though selectivity will be low.

Why? I have better things to do than to re-evaluate whether to add an index now, because the optimizer in this release is smart enough to take advantage of it. This has to do with DBA efficiency.

No it shouldn't be. Indexes should only be used when they are of use. Indexes only are of use when the table is of sufficient size say at least three blocks. That can easily be hundreds of rows for short rows.