开发者

Select query within a select query?

开发者 https://www.devze.com 2023-04-12 20:39 出处:网络
So I want to compare new users to returning users in a table by month.I have a table that contains each action with a username and a date stamp.

So I want to compare new users to returning users in a table by month. I have a table that contains each action with a username and a date stamp.

I can easily pull users that performed an action i开发者_JS百科n, for example, January 2011. To see see if each user is new I need to then run their username against all previous records (prior to January 2011).

In my fumblings I came up with the following:

  SELECT ini.username,
         MIN(ini.datetime) AS firstAction,
         COUNT(ini.datetime) AS numMonth,
         (SELECT COUNT(*) 
            FROM tableActions tot
           WHERE tot.username = ini.username
             AND tot.datetime < '201101%' 
             AND tot.datetime > '201001%') AS numTotal
    FROM tableActions ini
   WHERE DATETIME >= '201101%' 
     AND DATETIME < '201102%'
GROUP BY ini.username
ORDER BY firstAction

It doesn't error, but it doesn't finish either. Seems to be quite intense.


You can re-write the query to be (assuming tableactions.datetime is a DATETIME data type):

   SELECT ini.username,
          MIN(ini.datetime) AS firstAction,
          COUNT(ini.datetime) AS numMonth,
          x.numTotal
     FROM tableActions ini
LEFT JOIN (SELECT tot.username,
                  COUNT(*) AS numTotal
             FROM tableActions tot
            WHERE tot.datetime > '2010-01-01'
              AND tot.datetime < '2011-01-01'
         GROUP BY tot.username) x ON x.username = ini.username
    WHERE ini.datetime BETWEEN '2011-01-01' AND '2011-01-31'
 GROUP BY ini.username
 ORDER BY firstAction

Might help to have an index on username at a minimum, though a covering index using username, datetime is worth considering.

The datetime comparison looks suspect - LIKE is the only to support wildcards.


I think a simple table-to-itself join with a suitable where clause will be sufficient (this query is straight from my head, not tested):

SELECT    curr_activity.username, COUNT(prev_activity.username) AS did_something_in_the_past
FROM      tableActions AS curr_activity
LEFT JOIN tableActions AS prev_activity ON curr_activity.username = prev_activity.username 
WHERE     curr_activity.datetime >= '2011-01-01' AND curr_activity.datetime < '2011-02-01'
AND       prev_activity.datetime <  '2011-01-01' 
GROUP BY  curr_activity.username

Indexes do matter. You must index the username and datetime column, and the datetime column must be a datetime or a similar data type.


SELECT username,
MIN(datetime) AS firstAction,
MAX(datetime) AS numMonth,
COUNT(*) AS numTotal
GROUP BY ini.username
HAVING numTotal > 1 
WHERE DATETIME between '201001%' AND '201102%'
ORDER BY username
* I think this collapsed version is what you need ?  


I think you can replace

SELECT COUNT(*) 
        FROM tableActions tot
       WHERE tot.username = ini.username
         AND tot.datetime < '201101%' 
         AND tot.datetime > '201001%'

with

SELECT 1
        FROM tableActions tot
       WHERE tot.username = ini.username
         AND tot.datetime < '201101%' 
         AND tot.datetime > '201001%' LIMIT 1

, so it does not have to loop through all the records and count them.

0

精彩评论

暂无评论...
验证码 换一张
取 消