开发者

Should I make another table or just use arrays? (To normalize or not to normalize)

开发者 https://www.devze.com 2022-12-10 12:27 出处:网络
The current situation is that topics are sorted by 3 main categories.There is the potential to add more than just the 3 categories but the higher ups wish to implement the ability to add more than jus

The current situation is that topics are sorted by 3 main categories. There is the potential to add more than just the 3 categories but the higher ups wish to implement the ability to add more than just 1 category to a topic.

My original db design has the the categoryID as a Foreign Key in the topic info table. Which was probably a bad idea from the start but I figured they were set on only having 3 categories and doing it this way would allow for fewer queries.

So from what I can see I have two options now: 1) Input the categoryID as a comma separated string that I parse on the php end. 2) Restructure the DB and pull out the categoryID into its own table of categoryID and topicID.

I was wondering what everyone thought of this. My first instinct was to restructure the database. But the first option when I think about it is the 开发者_C百科easiest to implement and least likely to break something existing by changing the db around. This can also lead to de-normalization however and open up the possibility of inconsistent data.

I have read the de-normalizing is fine so long as you accept the risk of having inconsistent data in exchange for performance. In your opinion will I gain much in performance for this risk? Any input on what I should do in this situation would be appreciated.

Thanks for the help,

Levi


Do not confuse denormalization (a good example of which is keeping the number of votes on SO question together with question as opposed to calculating it each time from 'votes' table) with the abomination that is comma-separated list of ids.

Model a proper many-to-many relationship; there are just so many things that can (and will) go wrong with comma-separated approach. To name a few:

  1. No referential integrity.
  2. Next to impossible to use in joins.
  3. Impossible to adequately index; non-scalable.


Your best option would be to have a database, like you said, of categoryID-topicID pairs to find which categories the topics belong to.

You COULD do it the other way by exploding the strings in categoryID, but when you search for any topics that are in a certain category, you'll have to run through each field and run a LIKE on it... Much more resource intensive.

Take the time to restructure the DB and you'll end up with a much better result.


If you need to do something in the DBMS with the individual items, do not store them in list form. This will make your queries run like a dog as your tables get bigger. Of course, if you're only ever going to treat the list as a unit, it's okay to store them that way.

But you'd better be sure you're going to always treat the list as a unit, and no cheating, saying they're a unit and then spiltting them apart somewhere else - better to let the DBMS do that for you.

You should always do 3NF first then if, and only if, you have performance problems, denormalize for speed.

Those fields you're talking about in the question are not the sort that you will be treating as a unit. You will need to do things to the individual elements in the lists, so they should be broken out into another table.

0

精彩评论

暂无评论...
验证码 换一张
取 消