
mysql: design practices

Developer https://www.devze.com 2023-02-05 15:00 Source: Internet

I have tables set up as such:

A message is sent out to a group of users.

This message is put in the parent_message table.

This table contains id | sender_id | date

Each message that is sent to a user in that group is put in the child_message table.

This table contains id | parent_id | message | date_sent

When a reply is received, it is put into the reply_message table.

This table contains id | child_id | message | date_received

Now I have a few questions about this setup.


1) Every time the page is loaded I need to show how many child messages each parent message has.

Would you add a column called child_count to the parent_message table, or work it out in your query?

Why, why not?

Example query

select p.*,
(select count(*) from child_message c where c.parent_id = p.id) child_count
from parent_message p;

2) If the user chooses, they can view all reply messages to a parent message.

Would you add the parent_id to the reply_message table, or work it out in your query?

Why, why not?

Example query

select * from reply_message 
where child_id in(select id from child_message where parent_id = '66')


I'd say it very much depends on the number of messages. If you have something like a million messages in the system, a join to child_message can become very expensive. In that case adding a child_count column to the parent table can help performance. The same goes for your second use case. Of course, that de-normalizes your data, so if your system allows topics and replies to be reshaped (like splitting a topic), you have to do additional bookkeeping to keep the stored values correct.
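To make the bookkeeping concrete, here is a minimal sketch of the de-normalized counter. It uses Python's built-in sqlite3 module as a stand-in for MySQL so it runs anywhere; the child_count column and the add_child helper are assumptions for illustration, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent_message (
        id INTEGER PRIMARY KEY,
        sender_id INTEGER NOT NULL,
        date TEXT NOT NULL,
        child_count INTEGER NOT NULL DEFAULT 0  -- assumed de-normalized counter
    );
    CREATE TABLE child_message (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER NOT NULL REFERENCES parent_message(id),
        message TEXT NOT NULL,
        date_sent TEXT NOT NULL
    );
""")

def add_child(conn, parent_id, message, date_sent):
    """Hypothetical helper: insert a child row and bump the counter atomically."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "INSERT INTO child_message (parent_id, message, date_sent) VALUES (?, ?, ?)",
            (parent_id, message, date_sent),
        )
        conn.execute(
            "UPDATE parent_message SET child_count = child_count + 1 WHERE id = ?",
            (parent_id,),
        )

with conn:
    conn.execute(
        "INSERT INTO parent_message (id, sender_id, date) VALUES (66, 1, '2023-02-05')"
    )
add_child(conn, 66, "hello", "2023-02-05")
add_child(conn, 66, "again", "2023-02-05")

# The stored counter should match what a live COUNT(*) would report.
stored = conn.execute(
    "SELECT child_count FROM parent_message WHERE id = 66"
).fetchone()[0]
actual = conn.execute(
    "SELECT COUNT(*) FROM child_message WHERE parent_id = 66"
).fetchone()[0]
print(stored, actual)
```

Every code path that deletes or moves a child message would need a matching counter update, which is the extra bookkeeping mentioned above.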

Another approach would be to create index tables that hold the information you need, and to update them offline, asynchronously, if you don't need the information to be 100% accurate all the time, e.g.

table message_counts (parent_id, child_count)

And then schedule an update to this table when a new message is added to the system, e.g. by using a trigger.
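As a rough sketch of the trigger variant, the following uses Python's sqlite3 module in place of MySQL so that it is runnable as-is. The trigger name and the column types are assumptions, and MySQL's trigger syntax differs (there the upsert would more likely be written with INSERT ... ON DUPLICATE KEY UPDATE). Note that a trigger fires as part of the insert itself, so this variant keeps the counts current rather than deferring the work; a truly offline update would need a scheduled job instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE child_message (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER NOT NULL,
        message TEXT NOT NULL,
        date_sent TEXT NOT NULL
    );
    CREATE TABLE message_counts (
        parent_id INTEGER PRIMARY KEY,
        child_count INTEGER NOT NULL
    );
    -- Assumed trigger name; runs once per inserted child row.
    CREATE TRIGGER child_message_after_insert
    AFTER INSERT ON child_message
    BEGIN
        -- Create the counter row on first use, then increment it.
        INSERT OR IGNORE INTO message_counts (parent_id, child_count)
        VALUES (NEW.parent_id, 0);
        UPDATE message_counts
        SET child_count = child_count + 1
        WHERE parent_id = NEW.parent_id;
    END;
""")

with conn:
    conn.executemany(
        "INSERT INTO child_message (parent_id, message, date_sent) VALUES (?, ?, ?)",
        [
            (66, "first", "2023-02-05"),
            (66, "second", "2023-02-05"),
            (67, "other thread", "2023-02-05"),
        ],
    )

# Reading the counts is now a primary-key lookup instead of an aggregate.
counts = dict(conn.execute("SELECT parent_id, child_count FROM message_counts"))
print(counts)
```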

So, bottom line: unless you encounter performance issues, keep your tables normalized, just as they are. If you expect millions of messages and replies, some de-normalization can help speed things up. Index tables can help with building aggregated statistics offline, as long as you don't need them to be accurate and up to date at all times.


You're probably better off working it out in the query in both cases, but I would rewrite the queries:

SELECT 
   p.*,
   COUNT(c.id) childCount
FROM 
   parent_message p
   LEFT JOIN  child_message  c
   ON c.parent_id = p.id
GROUP BY
   p.id

and

SELECT DISTINCT
       rm.*
    FROM 
        reply_message rm
        INNER JOIN child_message  cm
        ON rm.child_id = cm.id 
   WHERE
       cm.parent_id = '66'

I would also list the fields explicitly instead of using SELECT *.
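For what it's worth, here is a small runnable sketch of both queries with the fields listed explicitly. It uses Python's sqlite3 module instead of MySQL, and the sample rows are made up for illustration. The join conditions assume that child_message.parent_id refers to parent_message.id, as the table descriptions in the question suggest.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent_message (id INTEGER PRIMARY KEY, sender_id INTEGER, date TEXT);
    CREATE TABLE child_message (id INTEGER PRIMARY KEY, parent_id INTEGER,
                                message TEXT, date_sent TEXT);
    CREATE TABLE reply_message (id INTEGER PRIMARY KEY, child_id INTEGER,
                                message TEXT, date_received TEXT);

    -- Made-up sample data: parent 66 has two children, parent 67 has none.
    INSERT INTO parent_message VALUES (66, 1, '2023-02-01'), (67, 1, '2023-02-02');
    INSERT INTO child_message VALUES (1, 66, 'to alice', '2023-02-01'),
                                     (2, 66, 'to bob', '2023-02-01');
    INSERT INTO reply_message VALUES (10, 1, 'thanks', '2023-02-02'),
                                     (11, 2, 'ok', '2023-02-03');
""")

# LEFT JOIN keeps parents without children; COUNT(c.id) ignores the NULLs,
# so those parents report 0 instead of disappearing from the result.
counts = conn.execute("""
    SELECT p.id, p.sender_id, p.date, COUNT(c.id) AS child_count
    FROM parent_message p
    LEFT JOIN child_message c ON c.parent_id = p.id
    GROUP BY p.id, p.sender_id, p.date
    ORDER BY p.id
""").fetchall()

# All replies belonging to one parent, reached through its child messages.
replies = conn.execute("""
    SELECT rm.id, rm.child_id, rm.message, rm.date_received
    FROM reply_message rm
    INNER JOIN child_message cm ON rm.child_id = cm.id
    WHERE cm.parent_id = ?
    ORDER BY rm.id
""", (66,)).fetchall()

print(counts)
print(replies)
```

The second query passes the parent id as a bound parameter rather than a quoted literal, which also avoids comparing a string with an integer column.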
