I have tables set up as such:
A message is sent out to a group of users. This message is put in the parent_message table:
id | sender_id | date
Each message that is sent in that group is put in the child_message table:
id | parent_id | message | date_sent
When a reply is received it is put into the reply_message table:
id | child_id | message | date_received
Now I have a few questions about this setup.
1) Every time the page is loaded I need to show how many child messages each parent message has. Would you add a column called child_count to the parent_message table, or work it out in your query? Why or why not?
Example query
select p.*,
(select count(*) from child_message c where c.parent_id = p.id) as child_count
from parent_message p;
2) If the user chooses, they can view all reply messages to a parent message.
Would you add the parent_id to the reply_message table, or work it out in your query? Why or why not?
Example query
select * from reply_message
where child_id in(select id from child_message where parent_id = '66')
I'd say it very much depends on the number of messages. If you have something like a million messages in the system, a join to child_message can become very expensive. In that case, adding a child_count column to the parent table can benefit performance. The same goes for your second use case. Of course, that de-normalizes your data, so if your system allows reshaping topics and replies (like splitting a topic), you have to do additional bookkeeping to keep the stored values correct.
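To make the bookkeeping concrete, here is a minimal sketch using SQLite through Python's sqlite3 module, chosen only so it runs anywhere. The child_count column comes from the question; the helper names add_child and move_child are hypothetical, and a real system would also need to handle deletes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parent_message (
    id INTEGER PRIMARY KEY, sender_id INTEGER, date TEXT,
    child_count INTEGER NOT NULL DEFAULT 0  -- denormalized counter
);
CREATE TABLE child_message (
    id INTEGER PRIMARY KEY, parent_id INTEGER, message TEXT, date_sent TEXT
);
""")

def add_child(conn, parent_id, message, date_sent):
    """Hypothetical helper: insert a child and bump the counter in one transaction."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "INSERT INTO child_message (parent_id, message, date_sent) VALUES (?, ?, ?)",
            (parent_id, message, date_sent),
        )
        conn.execute(
            "UPDATE parent_message SET child_count = child_count + 1 WHERE id = ?",
            (parent_id,),
        )

def move_child(conn, child_id, new_parent_id):
    """Hypothetical helper: reassign a child (e.g. a topic split) and fix both counters."""
    with conn:
        (old_parent_id,) = conn.execute(
            "SELECT parent_id FROM child_message WHERE id = ?", (child_id,)
        ).fetchone()
        conn.execute(
            "UPDATE child_message SET parent_id = ? WHERE id = ?",
            (new_parent_id, child_id),
        )
        conn.execute(
            "UPDATE parent_message SET child_count = child_count - 1 WHERE id = ?",
            (old_parent_id,),
        )
        conn.execute(
            "UPDATE parent_message SET child_count = child_count + 1 WHERE id = ?",
            (new_parent_id,),
        )

conn.executemany(
    "INSERT INTO parent_message (id, sender_id, date) VALUES (?, ?, ?)",
    [(66, 1, "2011-01-01"), (67, 1, "2011-01-02")],
)
add_child(conn, 66, "first", "2011-01-01")
add_child(conn, 66, "second", "2011-01-01")
move_child(conn, 1, 67)  # the first child now belongs to parent 67
counts = dict(conn.execute("SELECT id, child_count FROM parent_message"))
```

The page load then reads child_count straight from parent_message with no join, at the cost of every write path having to remember the counter.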
Another approach would be creating index tables that hold the information you need and updating them offline, asynchronously, if you don't need the information to be 100% accurate all the time, e.g.
table message_counts (parent_id, child_count)
Then schedule updates to these tables when a new message is added to the system, e.g. by using a trigger.
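A rough sketch of the message_counts idea, again in SQLite via Python's sqlite3 purely for illustration. The trigger names are made up, and trigger syntax differs between database engines. Note that triggers like these fire inside the writing transaction, so this variant keeps the counts current rather than eventually consistent; a truly offline version would rebuild the table from a scheduled job instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parent_message (id INTEGER PRIMARY KEY, sender_id INTEGER, date TEXT);
CREATE TABLE child_message (
    id INTEGER PRIMARY KEY, parent_id INTEGER, message TEXT, date_sent TEXT
);
-- Summary table from the answer: one row per parent message.
CREATE TABLE message_counts (
    parent_id INTEGER PRIMARY KEY,
    child_count INTEGER NOT NULL
);

-- Hypothetical trigger names; each new parent starts at zero.
CREATE TRIGGER parent_added AFTER INSERT ON parent_message
BEGIN
    INSERT INTO message_counts (parent_id, child_count) VALUES (NEW.id, 0);
END;

CREATE TRIGGER child_added AFTER INSERT ON child_message
BEGIN
    UPDATE message_counts SET child_count = child_count + 1
    WHERE parent_id = NEW.parent_id;
END;

CREATE TRIGGER child_removed AFTER DELETE ON child_message
BEGIN
    UPDATE message_counts SET child_count = child_count - 1
    WHERE parent_id = OLD.parent_id;
END;
""")

conn.execute("INSERT INTO parent_message (id, sender_id, date) VALUES (66, 1, '2011-01-01')")
conn.executemany(
    "INSERT INTO child_message (parent_id, message, date_sent) VALUES (?, ?, ?)",
    [(66, "hello", "2011-01-01"), (66, "again", "2011-01-02")],
)
counts = dict(conn.execute("SELECT parent_id, child_count FROM message_counts"))
```

The application code never touches message_counts directly; it only reads from it when rendering the page.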
So, bottom line: unless you encounter performance issues, keep your tables normalized, just as they are. If you expect millions of messages and replies, some de-normalization can help speed things up. Index tables can help create aggregated statistics offline, as long as you don't need them to be accurate and up-to-date at all times.
You're probably better off working it out in the query in both cases, but I would rewrite the queries:
SELECT
p.*,
count(c.id) childCount
FROM
parent_message p
LEFT JOIN child_message c
ON c.parent_id = p.id
GROUP BY
p.id, p.sender_id, p.date
and
SELECT DISTINCT
rm.*
FROM
reply_message rm
INNER JOIN child_message cm
ON rm.child_id = cm.id
WHERE
cm.parent_id = '66'
I would also list the fields explicitly instead of using SELECT *.
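As a quick sanity check of the join-based approach, the sketch below runs both queries with explicit column lists against a few made-up rows, using SQLite via Python's sqlite3. The sample data is invented for illustration. Parent 67 has no children, which is why the first query uses a LEFT JOIN and counts c.id rather than counting rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parent_message (id INTEGER PRIMARY KEY, sender_id INTEGER, date TEXT);
CREATE TABLE child_message (
    id INTEGER PRIMARY KEY, parent_id INTEGER, message TEXT, date_sent TEXT
);
CREATE TABLE reply_message (
    id INTEGER PRIMARY KEY, child_id INTEGER, message TEXT, date_received TEXT
);

-- Invented sample rows: parent 66 has two children, parent 67 has none.
INSERT INTO parent_message VALUES (66, 1, '2011-01-01'), (67, 1, '2011-01-02');
INSERT INTO child_message VALUES
    (1, 66, 'to alice', '2011-01-01'),
    (2, 66, 'to bob', '2011-01-01');
INSERT INTO reply_message VALUES
    (10, 1, 'thanks', '2011-01-03'),
    (11, 2, 'ok', '2011-01-03');
""")

# Child count per parent; COUNT(c.id) ignores the NULL row the LEFT JOIN
# produces for a parent without children, so that parent reports 0.
child_counts = conn.execute("""
    SELECT p.id, p.sender_id, p.date, COUNT(c.id) AS child_count
    FROM parent_message p
    LEFT JOIN child_message c ON c.parent_id = p.id
    GROUP BY p.id, p.sender_id, p.date
    ORDER BY p.id
""").fetchall()

# All replies under one parent, reached through child_message.
replies = conn.execute("""
    SELECT rm.id, rm.child_id, rm.message, rm.date_received
    FROM reply_message rm
    INNER JOIN child_message cm ON rm.child_id = cm.id
    WHERE cm.parent_id = ?
    ORDER BY rm.id
""", (66,)).fetchall()
```

With indexes on child_message.parent_id and reply_message.child_id, both queries should stay cheap for a long time before any de-normalization is worth considering.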