Select duplicates/repeating values, ungrouped_问答_开发者

Select duplicates/repeating values, ungrouped

开发者 https://www.devze.com 2023-03-07 00:25 出处：网络

I am trying to return a list of ungrouped values having a max defined number of repeating values.I have a list of values retrieved as such:

相关专题：duplicates

I am trying to return a list of ungrouped values having a max defined number of repeating values. I have a list of values retrieved as such:

select TagDirID from tags where id = '550'

The results would be:

As you can see there are 4 repeats of 65454. I would like to return a list that has a user defined max开发者_C百科 number of repeats for a TagDirID. For instance, selecting only 3 repeats or less:

All the methods I've found return a grouped list, i would like to maintain the individual items. Is this something that can be done in a query? There is a primary key available, TagID.

edit: what this does is select all the tags for an item, id = '550'. because it's user content sometimes people tag the same things multiple times and i'm trying to cut down the duplicates i show.

edit 2: so, while the accepted answer worked for me i found it was a little too slow for what i needed so i came up with a php solution:

function get_tags($ID = '', $tags_to_keep = 3)
{   
    // Select all tags.
    $query = "select TagDirID, Tag from tags where id = '$ID'";

    $tags_result = mysql_query($query); 

    $num_results = mysql_num_rows($tags_result);

    for ($i=0; $i<$num_results; $i++)
    {
        //Get tag topics
        $tags_row = mysql_fetch_array($tags_result);

        //build array of items already found with counts
        $tags_count = array_count_values($tags_filter);

        //if number of tags already found($tags_count/$tags_filter) is less than or equal to tags_to_keep then add to filtered array and return array.
        if($tags_count[$tags_row['TagDirID']] <= $tags_to_keep)
        {
            $tags_filter[$i] = $tags_row['TagDirID'];
            $tags[$i] = $tags_row['Tag'];
        }           
    }

    return $tags;
}

Select TagID, TagDirID
From    (
        Select T1.TagID, T1.TagDirID
             , (Select Count(*)
                From tags As T2
                Where T2.TagDirID = T1.TagDirID
                    And T2.TagID < T1.TagID) As Rnk
        Where T1.id = '550'
        From tags As T1
        ) As T
Where T.Rnk < 3

Another way of writing the same query:

Select TagID, TagDirID
From    (
        Select T1.TagID, T1.TagDirID, Count( T2.TagID ) As Rnk
        From tags As T1
            Left Join tags As T2
                On T2.TagDirID = T1.TagDirID
                    And T2.TagID < T1.TagID
        Where T1.id = '550'
        Group By T1.TagID, T1.TagDirID
        ) As T
Where T.Rnk < 3

The approach here is to mimic a ranking function which would sequence the rows for each grouping of TagDirID. Thus, the inner query in either of the two above solutions should give you something like:

TagID | TagDirID | Rnk
1     | 9508     | 0
2     | 10382    | 0
3     | 10672    | 0
4     | 65454    | 0
5     | 65454    | 1
6     | 65454    | 2
7     | 65454    | 3

With the rows numbered within each grouping, we can now filter down our results so that we only get a maximum number of rows in any given group. The ISO/ANSI solution would be to use the ROW_NUMBER ranking function which isn't yet supported by MySQL.

As the repeated values are all equal, you can query for the number of repetions of distinct items. It would be something like this, for 3 items or less:

SELECT T.tdid, T.cnt
FROM ( 
      SELECT distinct(TagDirID) as tdid, 
             (SELECT COUNT(*) FROM tags WHERE id = tdid) as cnt 
      FROM tags
) as T
WHERE T.tdid = '550' AND T.cnt < 3;

So the result would be a little different from what you had (not the duplicate items, but one item and the number of duplications), but I think it will do.