While I've been writing PHP for a long time, it's a skill I taught myself, and I've just hit a minor crisis of confidence over table joins!
Assuming a MySQL db auth containing MyISAM tables users and permissions:
users
- id (auto increment)
- email
permissions
- id (auto increment)
- name
To join these tables in a many-to-many (or one-to-many), I've always used bridge tables like so:
user_permissions
- id (auto increment)
- user_id
- permission_id
(I know InnoDB supports foreign key relationships, but it's also more complex and uses more memory, so for the purpose of this question I'd like to stay with MyISAM.)
My particular question is this: is it wise to join the tables using the auto-incremented key, or should I be generating my own additional key?
I'm aware that problems could occur if a table gets corrupted and I have to rebuild or if I begin mirroring to two dbs and the keys get out-of-sync.
I'm also aware that if I generate a unique hash for each row (to be used in joins) then there is the overhead of generating a hash and checking that it is new before every data insert.
How does everyone else do it? Are these issues things you have seen in practical situations?
Thanks for your time!
Adam
The join for permissions and users would look like this:
SELECT u.*, p.*
FROM user_permissions up
INNER JOIN users u
ON up.user_id = u.id
INNER JOIN permissions p
ON up.permission_id = p.id
The id column in user_permissions is not really necessary, unless you want to refer to it from another table. If you leave that id out, (user_id, permission_id) would be the primary key (no auto incrementing going on in this case). If you keep the separate id column, then you should definitely add a UNIQUE constraint over (user_id, permission_id).
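As a rough sketch of those two options in MySQL: the column types (INT UNSIGNED) and the index names idx_permission and uq_user_permission are assumptions for illustration, not something given in the question. The two statements are alternatives; you would run one or the other, not both.

```sql
-- Option 1: no separate id column; the pair itself is the primary key,
-- so the same permission cannot be granted to the same user twice.
CREATE TABLE user_permissions (
    user_id       INT UNSIGNED NOT NULL,
    permission_id INT UNSIGNED NOT NULL,
    PRIMARY KEY (user_id, permission_id),
    -- secondary index (hypothetical name) for "who has permission X?" lookups
    KEY idx_permission (permission_id)
) ENGINE=MyISAM;

-- Option 2: keep the existing table with its auto-increment id,
-- and add a UNIQUE constraint so duplicate pairs are still rejected.
ALTER TABLE user_permissions
    ADD UNIQUE KEY uq_user_permission (user_id, permission_id);
```

With either option, a second insert of the same (user_id, permission_id) pair fails with a duplicate-key error instead of silently creating a redundant row.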
This is generally a decision between using surrogate keys (made-up or automatically assigned values that have no meaning beyond being unique), and natural keys (keys that have some semantic meaning with respect to the entity being stored, like a name, or bank account #).
in your case, it's not clear there is any natural key that would work (every aspect of the user table could change - name change after marriage, etc.). you would want a natural key if there's no possibility of the key ever morphing while still keeping the relationships. an example would be something like element names (Au, He, Es, etc.) in the periodic table of elements (which are unlikely to change aside from adding new ones, but hey, anything could happen...).
as for data corruption, backups are really the best protection because anything can get corrupted. and in typical operations, you can always preserve the primary key whenever migrating or synchronizing. using an elaborately complex surrogate key instead of the auto increment provided by the db would be more risky because it complicates inserts (you have to ensure it's unique, and possibly handle collisions in a transaction-safe way)... and indeed an elaborately generated surrogate key wouldn't help in the case of data corruption (no way to correlate it with the record).
a natural PK on, say, the user's email would be a lot of overhead - you'd have to update every relationship whenever the email changes, swapping email addresses between two accounts is complicated, and indexes are much less efficient on wide values than on ints. the trade-offs tilt in favor of the auto increment key.
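to illustrate how little work the auto increment key needs on insert, here is a minimal sketch using the tables from the question. the email address and the permission name 'edit_posts' are made-up sample values, and @new_user_id is just a session variable chosen for this example.

```sql
-- MySQL assigns the next auto-increment id; no uniqueness check is needed.
INSERT INTO users (email) VALUES ('adam@example.com');

-- LAST_INSERT_ID() returns the id generated by this connection's last insert,
-- so other clients inserting at the same time do not affect it.
SET @new_user_id = LAST_INSERT_ID();

-- Link the new user to a permission looked up by name (sample value).
INSERT INTO user_permissions (user_id, permission_id)
SELECT @new_user_id, p.id
FROM permissions p
WHERE p.name = 'edit_posts';
```

a hand-generated hash key would need an extra step before the first insert (generate, check for a collision, retry if needed), which is the overhead described above.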
a good writeup is here:
http://decipherinfosys.wordpress.com/2007/02/01/surrogate-keys-vs-natural-keys-for-primary-key/