开发者

Strange Discrepancy In Queries Using UNIX_TIMESTAMP

开发者 https://www.devze.com 2023-02-20 08:44 出处:网络
Can someone please explain the discrepancy among the following queries: Query 1 (returns 87 results): SELECT userId, COUNT(userId) as usercount

Can someone please explain the discrepancy among the following queries:

Query 1 (returns 87 results):

SELECT userId, COUNT(userId) as usercount
FROM `cpnc_PaymentOrder` 
WHERE  created >= UNIX_TIMESTAMP('2011-03-01')
GROUP BY userId
HAVING usercount > 1

Query 2 (returns 177 results):

SELECT userId, COUNT(userId) as usercount
FROM `cpnc_PaymentOrder` 
WHERE  created >= UNIX_TIMESTAMP('2011-02-01')
GROUP BY userId
HAVING usercount > 1

Query 3 (returns 55 results):

SELECT userId, COUNT(userId) as usercount
FROM `cpnc_PaymentOrder` 
WHERE  created >= UNIX_TIMESTAMP('2011-02-01')
AND created < UNIX_TIMESTAMP('2011-03-01')
GROUP BY userId
HAVING usercount > 1

Now I would think that the number of results from Query 2 minus the number of results from Query 1 would equal the number of results from Query 3. But this is not the case. Can someone please explain why?

Thanks, Jonah

EDIT:

for clarification, the query i want to write is:

SELECT userId
FROM `cpnc_PaymentOrder` 
WHERE  created >= UNIX_TIMESTAMP('2011-02-01')
AND created < UNIX_TIMESTAMP('2011-03-01')
AND userId "appears in开发者_运维问答 at least one other record from before '2011-03-01'"


Because you're comparing number of groups.

Let's see this on a small example:

user_id | created
      1 | 2011-02-02
      1 | 2011-02-02
      1 | 2011-02-02
      2 | 2011-02-02
      2 | 2011-02-02
      1 | 2011-03-02
      1 | 2011-03-02
      2 | 2011-03-02
      2 | 2011-03-02

First query for this dataset will return 2 rows, second will return also 2 rows, 3rd will return also 2 rows.

Just remove your GROUP BY and see the difference (without GROUP BY the math will match, of course)


Query 1 has: WHERE created >= UNIX_TIMESTAMP('2011-03-01')
ie. date is larger than …

Query 3 has: WHERE created < UNIX_TIMESTAMP('2011-03-01')
ie. date is less than …

So, Query 3 is not the intersection of Queries 1 and 2.


 user_id | created
       1 | 2011-02-02
       1 | 2011-03-02
       1 | 2011-03-02

Query1 would be 1 result, Query2 also 1 result but Query 3 has zero results, because you are missing the people who appeared only once in that time.

0

精彩评论

暂无评论...
验证码 换一张
取 消