开发者

How do you normalize one-to-one-or-the-other relationships?

开发者 https://www.devze.com 2022-12-24 09:05 出处:网络
I\'m storing data on baseball statistics and would like to do so with 开发者_如何学运维three tables: players, battingStats, and pitchingStats. For the purpose of the question, each player will have ba

I'm storing data on baseball statistics and would like to do so with 开发者_如何学运维three tables: players, battingStats, and pitchingStats. For the purpose of the question, each player will have batting stats or pitching stats, but not both.

How would I normalize such a relationship in 3NF?


PlayerId would be a foreign key in both BattingStats and PitchingStats tables

[and remember to put some time dimension (season, year, et al) in the stats tables]

and by the way, this is a bad assumption: as far as I know, pitchers are allowed to bat, too!


Are you really required not to use more than 3 tables. Normalization normally implies breaking down one non-normalized model into many normalized relations.

If you can have more than 3 tables, you may want to consider the following (in 3NF):

Players:        ([player_id], name, date_of_birth, ...)
Batters:        ([batter_id], player_id)
Pitchers:       ([pitcher_id], player_id)
Batting_Stats:  ([batter_id, time_dimension], stat_1, stat_2, ...)
Pitching_Stats: ([pitcher_id, time_dimension], stat_1, stat_2, ...)

Attributes in [] define the primary key, but a surrogate key may be used if preferred. The player_id attribute in Batters and Pitches should have a unique constraint, and it should also be a foreign key to the Players relation. Batting_Stats and Pitching_Stats should also have a foreign key to Batters and Pitching respectively.

Note however that the above does not enforce that a player can be only a batter or only a pitcher.


UPDATE:

One method I am aware of to enforce that a player is only a batter or only a pitcher, is through this model:

Players:        ([player_id], name, date_of_birth, ...)
Roles:          ([role_id, role_type], player_id)
Batting_Stats:  ([role_id, role_type, time_dimension], stat_1, stat_2, ...)
Pitching_Stats: ([role_id, role_type, time_dimension], stat_1, stat_2, ...)

The role_type should define a pitcher or a batter. Batting_Stats and Pitching_Stats should have a composite foreign key to Roles using (role_id, role_type). A unique constraint on player_id in Roles would ensure that a player can only have one, and only one, role. Finally add check constraints so that Batting_Stats.role_type = 'Batter' and Pitching_Stats.role_type = 'Pitcher'. These check constraint guarantee that Batting_Stats is always describing a batter, and note a pitcher. The same applies for Pitching_Stats.


I know how I would implement this from a practical perspective (I'd create a UNIONed view over the disjoint tables and put a unique index on the player ID - therefore, they can only appear in one table).

Or in the players table, record what type of statistics they have, and then include that in the FK relationship from the stats tables.

But either of these is probably closer to the metal than you want.

0

精彩评论

暂无评论...
验证码 换一张
取 消