开发者

Complicated COUNT query in MySQL

开发者 https://www.devze.com 2023-03-17 03:17 出处:网络
I am trying to find the number of video credits a particular user has. The following are the three tables that are relevant:

I am trying to find the number of video credits a particular user has.

The following are the three tables that are relevant:

CREATE TABLE `userprofile_userprofile` (
  `id` int(11) NOT NULL AUTO_I开发者_如何学运维NCREMENT,
  `full_name` varchar(100) NOT NULL,
   ...
 )

CREATE TABLE `userprofile_videoinfo` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `title` varchar(256) NOT NULL,
  `uploaded_by_id` int(11) NOT NULL,
  ...
  KEY `userprofile_videoinfo_e43a31e7` (`uploaded_by_id`),
  CONSTRAINT `uploaded_by_id_refs_id_492ba9396be0968c` FOREIGN KEY (`uploaded_by_id`) REFERENCES `userprofile_userprofile` (`id`)
)

CREATE TABLE `userprofile_videocredit` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `video_id` int(11) NOT NULL,
  `profile_id` int(11) DEFAULT NULL,
  KEY `userprofile_videocredit_fa26288c` (`video_id`),
  KEY `userprofile_videocredit_141c6eec` (`profile_id`),
  CONSTRAINT `profile_id_refs_id_31fc4a6405dffd9f` FOREIGN KEY (`profile_id`) REFERENCES `userprofile_userprofile` (`id`),
  CONSTRAINT `video_id_refs_id_4dcff2eeed362a80` FOREIGN KEY (`video_id`) REFERENCES `userprofile_videoinfo` (`id`)
)

The videoinfo table is when a user uploads a video, he will get an "uploaded_by" listing. The videocredit table are all the credits for a given film. It is entirely independent of uploading the film (i.e., a video can be uploaded by a user without crediting himself, and a user can be credited in a video he has not uploaded).

In trying to find the COUNT of videos a user has been credited in, I want to find:

# videos a user has uploaded + # of non duplicate-video credits uploaded by others

By way of example: if a user uploads 5 videos called:

VideoME1, VideoME2, VideoME3, VideoME4, and VideoME5 
(total = 5 videos [`videoinfo.uploaded_by_id`])

And has the following video credits:

VideoME1 (4 credits - director, writer, editor, choreographer)
VideoME2 (1 credit)
VideoOTHER1 (2 credits - writer, editor)
VideoOTHER2 (1 credit - writer)
(total = 8 video credits [`videocredit.profile_id`])

The COUNT should be 5 (videos uploaded) + 2 (non-duplicate video credits uploaded by others) = 7. If a user has no video credits, it should = 0 (i.e., LEFT OUTER JOIN).

I've been able to figure out the COUNTS for each of the uploads/credits, but can't figure out how to combine the two and get rid of duplicates. What SQL do I need to do this? Thank you.

By the way, this is what I currently have for each (individual) COUNT:

mysql> SELECT full_name, v.video_id, COUNT(DISTINCT v.video_id) as cnt
    -> FROM userprofile_userprofile u LEFT OUTER JOIN userprofile_videocredit v
    -> ON u.id = V.profile_id
    -> GROUP BY full_name
    -> ORDER BY cnt DESC;

mysql> SELECT full_name, v.id, COUNT(v.uploaded_by_id) as cnt
    -> FROM userprofile_userprofile u LEFT OUTER JOIN userprofile_videoinfo v
    -> ON u.id = v.uploaded_by_id
    -> GROUP BY full_name
    -> ORDER BY cnt DESC;


X-Zero's suggestion of adding an "uploader credit" to the data is the best way to keep the query simple. If that's not an option, do an inner join between userprofile_videoinfo and userprofile_videocredit to make it easy to eliminate duplicates:

SELECT u.id, u.full_name, COUNT(DISTINCT v.video_id) as credit_count
FROM userprofile_userprofile u
LEFT JOIN (SELECT vi.video_id, vi.uploaded_by_id, vc.profile_id as credited_to_id
    FROM userprofile_videoinfo vi
    JOIN userprofile_videocredit vc ON vi.id = vc.video_id
    ) v ON u.id = v.uploaded_by_id OR u.id = v.credited_to_id
GROUP BY u.id, u.full_name
ORDER BY credit_count DESC

The subquery may be useful to create as a view.


There are basically 2 ways to do this:
1) Add 'uploader' as something they can be credited for, and add a trigger to autopopulate the entry - only one table to go for, etc.
2) I believe the following query should also work (which will also take care of your full_name problem):

WITH credit_rollup (profile_id) as (SELECT profile_id
                                    FROM userprofile_videocredit 
                                    GROUP BY profile_id, video_id)
SELECT full_name, COALESCE((SELECT count(*) 
                            FROM credit_rollup as v
                            WHERE v.profile_id = u.id), 0) +
                  COALESCE((SELECT count(*)
                            FROM userprofile_videoinfo as v
                            WHERE v.uploaded_by_id = u.id), 0) as credits
FROM userprofile_userprofile
ORDER by credits DESC

Although, you may wish to remove 'userprofile' from in front of every table name, and just put them in a schema with that name.


Edited query to only count 1 credit per video.


After reviewing the post one more time, it has become apparent that I missed a key use of the word 'duplicate' - in that the user was only credited once for uploading the video AND having some sort of credit in the video (director, editor, etc.), but not twice (which is an OR).
As such, the following query is more in line (and thanks to @simon for making me think about it):

WITH credit_rollup (uploaded_by_id, credited_to_id) 
                   AS (SELECT info.uploaded_by_id, credit.profile_id
                       FROM userprofile_videoinfo as info
                       JOIN userprofile_videocredit as credit
                       ON info.id = credit.video_id
                       GROUP BY info.uploaded_by_id, credit.profile_id)
SELECT usr.full_name, COALSECE((SELECT count(*)
                                 FROM credit_rollup as rollup
                                 WHERE rollup.uploaded_by_id = usr.id
                                 OR rollup.credited_to_id = usr.id), 0) as credits
FROM userprofile as usr
ORDER BY credits DESC

As @simon said, the CTE (he used a subquery) may be useful to create as a view (basically, it lists everybody once for their involvment in a video, a handy thing to have).


If I haven't made any mistake:

SELECT u.id
     , u.full_name
     , ( SELECT COUNT(*) 
         FROM userprofile_videoinfo vi
         WHERE u.id = vi.uploaded_by_id
       ) AS cnt_VideosUploadedBy                         <---- 5
    , cnt_Credits_InMyUploads 
        + cnt_Videos_CreditedIn - cnt_Videos_CreditedIn_and_UploadBy
      AS cnt_Difficult                               <---- 5 + 4 - 2 = 7
    , cnt_Credits_Total                                  <---- 8
    , cnt_Credits_InMyUploads                            <---- 5
    , cnt_Videos_CreditedIn                              <---- 4 
    , cnt_Videos_CreditedIn_and_UploadBy                 <---- 2
FROM userprofile_userprofile u 
  LEFT JOIN
      ( SELECT u.id
             , COUNT(vc.video_id)
               AS cnt_Credits_Total
             , (COUNT(vi.profile)
               AS cnt_Credits_InMyUploads
             , COUNT(DISTINCT vc.video_id)
               AS cnt_Videos_CreditedIn
             , (COUNT(DISTINCT vi.id)
               AS cnt_Videos_CreditedIn_and_UploadBy
        FROM userprofile_userprofile u 
          JOIN userprofile_videocredit vc
            ON u.id = vc.profile_by_id
          LEFT JOIN userprofile_videoinfo vi
            ON vc.video_id = vi.id
            AND vi.profile = u.id
        GROUP BY u.id
      ) AS grp
    ON grp.id = u.id
0

精彩评论

暂无评论...
验证码 换一张
取 消