开发者

How do you select values that are NOT a superset to a list?

开发者 https://www.devze.com 2022-12-07 22:05 出处:网络
I am trying to fetch values based on an allowlist and denylist. The table assigns multiple values to an id.

I am trying to fetch values based on an allowlist and denylist. The table assigns multiple values to an id. The allowlist requires all values to be met, so an allowlist of ['a', 'b'] would only return ids that have every item in the values column. This part of the query is done and works as e开发者_JS百科xpected.

For simplicity, I return every id where one of the values equals 'a'

/*

        Table        Query result
       ┌────┬────┐   ┌───┐
       │id  │val ├──►│id │
       ├────┼────┤   ├───┤
       │1   │ a  │   │1  │
       │1   │ c  │   │3  │
       │1   │ b  │   └───┘
       │2   │ b  │
       │3   │ a  │
       │3   │ c  │
       └────┴────┘
 */

For the next step, I want to create a denylist, where the previously filtered ids must not have a corresponding value. To accomplish this, I wish to return a row of values where each value x meets the following properties:

  • x does not occur in the allowlist
  • if ids with x were banned, the allowlist must still return at least one row.
  • x is distinct

However, I have trouble figuring out how to approach the denylist query. It makes little sense to return denylist options that would nullify allowlist values.

How should I approach this problem? How do I only return values that are NOT a superset to a?

Consider the following example of what the query output should be:

/*
        ...WHERE val
           IN ('a')...

         Table        Query result
        ┌────┬────┐   ┌────┬─────┐
        │id  │val ├──►│val │count│
        ├────┼────┤   ├────┼─────┤
        │1   │ a  │   │b   │ 2   │
        │1   │ c  │   │c   │ 2   │
        │1   │ b  │   └────┴─────┘
        │2   │ b  │
        │3   │ a  │
        │3   │ c  │
        └────┴────┘

        a is always accompanied by c. If you denylist c,
        a will never return, despite being on the allowlist.
        c is a pointless option for a denylist.

        if b is on the denylist, id 3 is still valid. B is a
        purposeful option.



        How do I return this instead?


        ...WHERE val
           IN ('a')...

        Table         Query result
       ┌────┬────┐    ┌────┬─────┐
       │id  │val ├───►│val │count│
       ├────┼────┤    ├────┼─────┤
       │1   │ a  │    │b   │ 2   │
       │1   │ c  │    └────┴─────┘
       │1   │ b  │
       │2   │ b  │
       │3   │ a  │
       │3   │ c  │
       └────┴────┘
 */


After some tinkering, I have found a solution.

Assuming the query is based on this structure:

SELECT
    `value`,
    COUNT(*) as count
FROM testTable
    WHERE `id` IN (
    SELECT `id` FROM testTable
        WHERE `value` IN ('a')
        GROUP BY `id`
        HAVING COUNT(DISTINCT testTable.`value`) = 1
    )
GROUP BY `value` ORDER BY count DESC;

It will return every value that shares the same id as the allowlist, along with its count. Since all values in the allowlist must occur in the result, their count will respectively be the highest:

┌─────┬─────┐
│value│count│
├─────┼─────┤
│a    │ 2   │
│b    │ 1   │
│c    │ 2   │
└─────┴─────┘

Therefore, if a different value has the same count, it occurs in all of the returning rows. One way to obtain the maximum value is a subquery with LIMIT 1:

SELECT
    COUNT(*) as count
FROM testTable
    WHERE `id` IN (
        SELECT `id` FROM testTable
            WHERE `value` IN ('a')
            GROUP BY `id`
            HAVING COUNT(DISTINCT testTable.`value`) = 1
    )
GROUP BY `value`
ORDER BY count DESC
LIMIT 1;

Since I want no values that exist in all entries, I compare with < using the HAVING clause.

The end result would be:

SELECT
    `value`,
    COUNT(*) as count
FROM testTable
    WHERE `id` IN (
    SELECT `id` FROM testTable
        WHERE `value` IN ('a')
        GROUP BY `id`
        HAVING COUNT(DISTINCT testTable.`value`) = 1
    )
GROUP BY `value`
HAVING count < (
        SELECT
            COUNT(*) as count
        FROM testTable
            WHERE `id` IN (
                SELECT `id` FROM testTable
                    WHERE `value` IN ('a')
                    GROUP BY `id`
                    HAVING COUNT(DISTINCT testTable.`value`) = 1
            )
        GROUP BY `value`
        ORDER BY count DESC
        LIMIT 1
)
ORDER BY count DESC;

This will return:

┌─────┬─────┐
│value│count│
├─────┼─────┤
│b    │ 1   │
└─────┴─────┘
0

精彩评论

暂无评论...
验证码 换一张
取 消