OK, this may be a silly question but...
I have inherited a project and am tasked with going over the primary key relationships.
The project largely uses Guids. I say "largely" because there are examples where tables use integral types to reflect enumerations. For example, dbo.MessageFolder has MessageFolderId of type int to reflect
public emum MessageFolderTypes
{
inbox = 1,
sent = 2,
trash = 3,
etc...
}
This happens a lot. There are tables with primary keys of type int which is unavoidable because of their reliance on enumerations and tables with primary keys of type Guid which reflect the pr开发者_C百科imary key choice on the part of the previous programmer.
Should I care that the PK schema is spotty like this? It doesn't feel right but does it really matter? If this could create a problem, how do I get around it (I really can't move all PKs to type int without serious legwork and I have never heard of enumerations that have guid values)?
Thanks.
Ideally you'd move away from GUIDs as PKs, due to the performance - But I understand that may be impossible.
The GUID performance is well laid out here: What are the performance improvement of Sequential Guid over standard Guid?
I wouldn't worry about the spottiness, if you use the smallest key you can then you're bound to have several different data types as PKs in a DB (TinyInt/SmallInt/Int/BigInt).
What you could do for a "quick win" is this:
- leave the primary key as it is (benefit: you don't need to change the referential integrity constraints)
- but make your GUID primary key a non-clustered primary key
- put the clustering key on a separate field - use something that's already available, if there is a useful field - or if not, use an INT IDENTITY field
The performance improvement you'd get from having a good clustering key can be very significant! Not just a few percent - several orders of magnitude.
Read here why GUIDs as Primary Key (actually: as clustering key) are a horribly bad choice, and read here The Clustered Index Debate - Again! what a good clustering key should be like - ideally:
- narrow (as few bytes as possible)
- static (never changes)
- unique (otherwise SQL Server will need to "uniquify" your entries by adding 4 bytes to them)
- ever-increasing (to avoid expensive page splits and reduce index/page fragmentation)
INT IDENTITY is an ideal candidate.
What matters most is that all the keys necessary to ensure integrity are being enforced. There is no real reason why the same type of column has to be used as a key in every table. Presumably you have other (natural) keys enforced by unique constraints/indexes as well and if so I would expect some of those include columns that aren't either integers or guids.
精彩评论