Here is a small design with the common NOT NULL UNIQUE constraints on the natural keys:
CREATE TABLE 'users' {
id int(10) NOT NULL AUTO_INCREMENT,
name NOT NULL UNIQUE,
email NOT NULL UNIQUE,
pass NOT NULL,
PRIMARY KEY ('id')
}
The NOT NULL UNIQUE constraint seems hackish to me. Having disjoint candidate keys seems denormalized to me, and the UNIQUE constraint seems like a bloated O(N) checking feature, so I'm inclined to use a design that has a relation for each natural key that maps the natural key to the surrogate key in t开发者_如何转开发he main relation.
CREATE TABLE users {
id int(10) NOT NULL AUTO_INCREMENT,
pass NOT NULL,
PRIMARY KEY ('id')
}
CREATE TABLE user_names {
name NOT NULL,
user_id NOT NULL,
PRIMARY KEY ('name')
}
CREATE TABLE user_emails {
email NOT NULL,
user_id NOT NULL,
PRIMARY KEY ('email')
}
This way, I implicitly enforce the unique constraint on user's emails and usernames while providing the luxury of being able to search for a user's info with their email or name in O(ln N + ln M) time (which I very much desire).
This only way I can ever see the first, more common design matching the performance of the second design is if the UNIQUE constraint implicitly indexed the table so that selects with, and therefore checks for uniqueness of, the natural keys can be done in O(ln N) time.
I suppose my question is, with regard to the performance insertions and selections with the natural keys, what is the best way to handle a table with 3 or more natural keys that is indexed by a surrogate key?
It seems that what you are describing is 6th Normal Form. Assuming your original table is in 5NF then your new schema consisting of 3 tables is in 6NF. Having three candidate keys does not violate 5NF but it would violate 6NF.
From the data integrity point of view however 6NF has significant disadvantages. It is normally the case that some dependencies are lost. For example your original table enforces the constraint that every user has a name and password. Your 6NF version can't do that - at least not in SQL if you want to permit inserts to all the tables. 6NF is useful for some specific situations (temporal data) but in general 5NF is more useful and desirable from a data integrity perspective.
This doesn't answer your performance question but I thought it was worth pointing out.
You are normalizing too much in my opinion. You will hurt performance not only on inserts/updates but also on selects since you are now joining 3 tables instead of doing a straight insert/select/update/delete in one table.
I disagree that the NOT NULL UNIQUE is hackish but I do find strange that there's such a constraint on a name column.
精彩评论