Advantages of modeling data in one table versus using two tables_问答_开发者

Advantages of modeling data in one table versus using two tables

开发者 https://www.devze.com 2023-01-27 03:32 出处：网络

Assuming you were modeling a Q&A database using MySQL, I am aware of two ways to approach the model architecture:

Create a single table for questions and answers with a "typeId"

Create two separate tables; one for questions and one for answers

Can anyone elaborate on the advantages and disadvantages of both approaches, and why you would use one approach over the other?

My own observations:

Approach 2 is more normalized
Approach 2 requires two "comments" tables for Q's and A's or a single table with composite PKs; (Q's & A's may identical IDs)
Appro开发者_StackOverflowach 1 can become very complicated with self joins and so on

The specific of the design would really depend of your requirements and what you want to achieve and how huge your database would be.

1-table approach: You may be able to use a single table in the case where you only provide/allow one answer per question (à la FAQ), where you would only have id,question,answer fields and questions are not added to DB until answer is given, or update the row when answer is available.

2-table approch: As soon as there may be more than one answer/comment per question. I could choose a model a little bit different than @Spredzy's as I would just include everything just like "emails": message_id, in_reply_to, timestamp, text for simplicity. This simplicity will not allow you to tag specific (answers VS comments unless only one answer and in_reply_to answer becomes comments like on SO). Questions are those with in_reply_to IS NULL.

3/more-table approach: If you really want performance by having FIXED-ROW length on the main table and don't need to display excerpt of questions and answers, but only want to know numbers. You would separate the text, any attachments, etc. Or just because you would want to avoid self joins as suggested by @orangepips: "Finally, self joins suck and present an excellent way to kill performance.") and have a separate tables for everything.

Model this as two tables. Questions can have more than one Answer. Create separate Comment tables for Questions and Answers; most likely use case I imagine does not see the comment data intermingling in a single DML statement.

A single table distinguished by a type column might make sense if you were representing an object model's inheritance, but that's not the case here. In addition, the intent of the table is muddied for anyone who reviews the schema because they'd need to know the enumerated possibilities for the type; could be a lookup table I supposed, but for two possibilities - and no more - seems a waste.

Finally, self joins suck and present an excellent way to kill performance.

I Would create 2 tables :

One that represents Question, Answer and Comment. IIf you look carefully they have the same core data : user_id, text, date, plus a type_id field and all the other field you might need.

The other table would be a pretty simple table : type

type_id   type_desc
xxx-x-xx  question
xxx-x-xx  answer
xxx-x-xx  comment

By doing that, your model will be highly scalable, faster with no duplication of data (normalization).

Finally, technically talking to get all the question or all the answer of one question it is just a simple join.

Hope it could help,

One table per type of data. If questions and answers are identical (as if objects in OOP), one table suffices. If not, not.

A single comment table with composite PK's is right because the comments are still of one type of object: Comment. The fact that they can reference both Q's and A's doesn't affect that.