I realise there might be similar questions but I couldn't find one that was close enough for guidance.
Given this spec,
Site
--------开发者_运维知识库-------------------
SiteID int identity
Name varchar(50)
Series
---------------------
SiteID int
SeriesCode varchar(6)
...
--SeriesCode will be unique for every unique SiteID
Episode
----------------------
SiteID int
SeriesCode varchar(6)
EpisodeCode varchar(10)
...
my proposed design/implementation is
Site
----------------------------
SiteID int identity
Name varchar(50)
Series
-------------------------------------------
SeriesID int identity, surrogate key
SiteID int natural key
SeriesCode varchar(6) natural key
UNIQUE(SiteID, SeriesCode)
...
Episode
-------------------------------------------
EpisodeID int identity, surrogate key
SeriesID int foreign key
EpisodeCode varchar(6) natural key
...
Anything wrong with this? Is it okay to have the SeriesID surrogate as a foreign* key here? I'm not sure if I'm missing any obvious problems that can arise. Or would it be better to use composite natural keys (SiteID+SeriesCode / SiteID+EpisodeCode)? In essence that'd decouple the Episode table from the Series table and that doesn't sit right for me.
Worth adding is that SeriesCode looks like 'ABCD-1' and EpisodeCode like 'ABCD-1NMO9' in the raw input data that will populate these tables, so that's another thing that could be changed I suppose.
*: "virtual" foreign key, since it's been previously decided by the higher-ups we should not use actual foreign keys
Yes, it all looks fine. The only (minor) point I might make is that unless you have another 4th child table hanging off of Episode, you probably don't need EpisodeId, as Episode.EpisodeCode is a single attribute natural key sufficient to identify and locate rows in Episode. It's no harm to leave it there, of course, but as a general rule I add surrogate keys to act as targets for FKs in child tables, and try to add a narural key to every table to indentify and control redundant data rows... So if a table has no other table with a FK referencing it, (and never will) I sometimes don't bother including a surrogate key in it.
What's a "virtual" foreign key? Did the higher-ups decide not to use foreign key constraints? In that case, you're not using foreign keys at all. You're just pretending to.
And is Episode the best choice for an entity? Doesn't it really mean Show or Podcast or so, and just happens to always be part of a series right now? If so, will that change in the future? Will Episode eventually be abused to encompass Show outside of a Series? In that case, tying Episode to Site via Series might come back to haunt you.
Given all that, and assuming that you as a grunt probably can't change any of it: if i was you i'd feel safer using natural keys wherever possible. In absence of foreign key constraints, it makes recognizing bad data easier, and if you have to resort to some SeriesCode='EMPTY' trickery later on that's easier with natural keys, too.
My suggestion:
Use natural/business as primary key whenever possible except in the following 3 situations:
- The natural/business key is unknown at the moment of inserting
- The natural/business key is not good ( it's not unique, it's liable to change frequently )
- The natural/business key is a composite of more than 3 columns and the table will have child tables
In situations 1 and 2 a surrogate key is requiered.
In situation 3 a surrogate key is strongly recommended.
精彩评论