开发者

Count values from one table with 2 WHERE conditions

开发者 https://www.devze.com 2023-03-02 04:25 出处:网络
I wonder what is better for performance and programming style for a simple thing: get the count 开发者_运维知识库for 2 values from one table (all below queries do the same job).

I wonder what is better for performance and programming style for a simple thing: get the count 开发者_运维知识库for 2 values from one table (all below queries do the same job).

Make 2 single queries:

SELECT count(*) FROM `a` WHERE  categories_id=2
SELECT count(*) FROM `a` WHERE  group_id=92

or use subquery

SELECT (SELECT count(*) FROM `a` WHERE  categories_id=2 AS categories)
,(SELECT count(*) FROM `a` WHERE  group_id=92) AS groups)

or union

SELECT count(*) FROM `a` WHERE  categories_id=2  
UNION 
SELECT count(*) FROM `a` WHERE  group_id=92


The main difference between the three is the handling of the result values, though that is not traumatic.

  1. The first example returns the two values in two separate fetch operations (on separate statements).
  2. The second example returns the two values as part of a single fetch operation.
  3. The third example returns the two values in two separate fetch operations (on the same statement).

Performance-wise, with just two rows of data, there is very little to choose between the three. The second (two sub-query) solution does the most with a single statement, and only requires a single fetch operation, so it might be the quickest. The first requires separate parsing of two statements, plus two sets of operations, so it should be the slowest. But whether you can truly measure that depends on lots of factors. If the client is in Australia and the server is in Europe, then the round-trip latency is likely to mean that the second or third solution is best (and the difference may depend on whether the DBMS returns multiple rows with a single client-server message exchange). If the client is on the same machine as the server, then the round-trip latency is much less critical.

For ease of understanding, the UNION version is probably sufficiently clean; it won't confuse anyone reading it. The first version might be slightly cleaner (one keyword less) but the difference is minimal.

If the number of alternatives increases (more than one group_id value, or more than one categories_value), then I think the UNION wins on clarity:

SELECT 'G' AS type, group_id, COUNT(*)
  FROM a
 WHERE group_id IN (92, 104, 137, 291)
 GROUP BY type, group_id
UNION
SELECT 'C' AS type, categories_id, COUNT(*)
  FROM a
 WHERE categories_id IN (2, 3, 13, 17, 19, 21)
 GROUP BY type, categories_id

The 'type' column allows you to distinguish between a group ID and a category ID that share the same ID number (albeit that they are two different sorts of ID).

Because it is easier to expand, I'd probably go with option 3 (UNION) unless there was compelling timing experiments on live data to show that option 2 (sub-queries) was in fact quicker.


The first option, doing two SELECTs, will always be slightly less efficient as it involves an extra round trip to the database. Between the second two, the union version will in theory be ever so slightly slower as the UNION will cause the database to have to sort the values and make the union. In practice, and for only two values, this isn't going be measurable against the time doing the two main parts of the query.

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号