开发者

Associating tags with entities, store the tag once or not?

开发者 https://www.devze.com 2022-12-26 08:04 出处:网络
Consider I have a table with notes which could be associated with zero or more tags, how would I decide to do

Consider I have a table with notes which could be associated with zero or more tags, how would I decide to do

create table no开发者_JS百科tes (
  id int , -- primary key
  -- other fields
);
create table tagmapping (
 noteid int, -- refers notes.id
 tagid int, -- refers tags.id
 );
create table tags (
  int id, -- primary key
  tagname varchar(255)
);

vs storing the same tag potentially several times as in

create table notes (
  int id, -- primary key
  -- other fields

);

create table bar (
  id id, -- primary key
  tag varchar(255),
  -- other fields
  noteid int -- refers to notes.id, (not unique)
);

What mess/advantages would I get myself into going for the last approach ?


The first way is the standard way of implementing a many-to-many relationship. You should do this because this is what will surprise people the least.

A couple of issues with the second method:

  • You will have to duplicate the tag names, which introduces redundancy. It could be a waste of disk space if the tag names are long.
  • If you ever want to rename a tag, you have to update the entire table instead of just one field.


For the second approach:

  1. It's be harder to query against (for reports and such) since the tags can be potentially misspelled and other things like that
  2. Speed. It's faster to query against an int than it is a string
  3. What if you need to rename a tag? You'll have to rename it in each and every note either manually or via a series of queries

  4. Storing duplicate data is generally considered a bad practice among DBA's and developer's. It's best always follow the normalization rules when you can


The disadvantages of the second solution will in part rear themselves when you need to alter a tag or add attributes to tags. Suppose you have a tag named "Foo" and you now wish to add an attribute that would allow you to determine whether the tag is active or not. There is no means to do that in the second approach. You would have to enforce "magic tag names" in the presentation tier. In addition, you have no means of enforcing consistent naming on the tags. Someone could add a tag called "Foo" and another call "Fu" even though they are supposed to be the same tag.

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号