开发者

Need advice on denormalizing a database that deals with responses to polls

开发者 https://www.devze.com 2022-12-20 11:54 出处:网络
My web app deals with polls (surveys).Right now I have 2 tables as part of database schema. polls id question

My web app deals with polls (surveys). Right now I have 2 tables as part of database schema.

polls
    id
    question
    choices (ex: yes,no,maybe)
    created

polls_responses
    poll_id
    user_id
    tracker_id
    response

The problem with this is that on some polls I have alot of responses (>1000). People can view the results of the polls and it will show how many users voted for yes, no, or maybe and how many anonymous users voted yes, no, or maybe. The problem with this is that whenever a user views the results of the poll, it has to loop through all the responses and count the total # of responses, # of responses for each choice, # of responses for each choice made by users, and # of responses for each choice made by tracker_id (anonymous users) and calculate percentages and display it in a bar graph. This makes the page load really slow. I was thinking of denormalizing the database to increase performance so that we have something like this

polls
    id
    question
    choices (ex: yes,no,maybe)
    total_responses (ex: 10,3,3,4)
    user_responses (ex: 5,2,2,1)
    anon_responses (ex: 5,1,3,1)
    created

polls_responses
    poll_id
    user_id
    tracker_id
    response

That is, for the value in total_responses, the 10 is the total, 3 is the # of responses for choice yes, 3 is the # of responses for choice no, and 4 is the # of responses for choice maybe. The same format applies to user_responses and anon_r开发者_StackOverflowesponses fields. If I could get some opinions on this method, it would be greatly appreciated! Thanks for your time.

EDIT: I'm using MySQL


Firstly, I do not think you need to loop to count this.

Have a look at something like this

SELECT  poll_id,
        COUNT(response) Total,
        SUM(CASE WHEN response = 'Y' THEN 1 ELSE 0 END) TotalYes,
        SUM(CASE WHEN response = 'N' THEN 1 ELSE 0 END) TotalNo,
        SUM(CASE WHEN response = 'M' THEN 1 ELSE 0 END) TotalMaybe,
        SUM(CASE WHEN [user_id] IS NOT NULL AND response = 'Y' THEN 1 ELSE 0 END) UserYes,
        SUM(CASE WHEN [user_id] IS NOT NULL AND response = 'N' THEN 1 ELSE 0 END) UserNo,
        SUM(CASE WHEN [user_id] IS NOT NULL AND response = 'M' THEN 1 ELSE 0 END) UserMaybe,
        SUM(CASE WHEN tracker_id IS NOT NULL AND response = 'Y' THEN 1 ELSE 0 END) TrackerYes,
        SUM(CASE WHEN tracker_id IS NOT NULL AND response = 'N' THEN 1 ELSE 0 END) TrackerNo,
        SUM(CASE WHEN tracker_id IS NOT NULL AND response = 'M' THEN 1 ELSE 0 END) TrackerMaybe
FROM    polls_responses
GROUP BY poll_id

This should get you the respective results per poll_id, from where you can then join back to the table to retrieve the poll details.


Don't be afraid to use the database to do some heavy lifting for you. You don't have to loop through all the responses in your output.

You haven't specified your database platform, but you could probably handle this in the database using some joins and aggregate functions, or some sub-selects.

0

精彩评论

暂无评论...
验证码 换一张
取 消