开发者

Set-theory way of determining a time difference between a "group" of records via SQL

开发者 https://www.devze.com 2022-12-09 23:40 出处:网络
I have a log file in SqlServer that stores the time an application started, the time the application is ready (i.e. finished loading), and the time that it\'s exited. Each of these occur as a separate

I have a log file in SqlServer that stores the time an application started, the time the application is ready (i.e. finished loading), and the time that it's exited. Each of these occur as a separate entry. The format (and sample data) is as follows:

Date/Time                  User    Type    Application  Message
2009-11-03 12:26:12.403 uname1  Info    app1         Started    
2009-11-03 12:26:22.403 uname1  Info    app1         Loaded 
2009-11-03 12:27:15.403 uname2  Info    app1         Started    
2009-11-03 12:27:16.401 uname1  Info    app1         Exited 
2009-11-03 12:27:18.403 uname2  Info    app1         Loaded 
2009-11-03 12:29:12.403 uname2  Info    app1         Exited 

I would like to find out, per application and per user, the amount of time it took for the application to get to a ready state and the amount of time the application was running. This would be a piece of cake if each date/time was in the same record, and it would also be easy (though tedious) to just load up each record as a cursor and sift through the dat开发者_StackOverflow社区a, but I thought that there must be some way to do this "correctly" in a set-theory manner.

So, to reiterate, the following output (from the sample data from above) would be expected (numbers are in seconds, rounded up):

User    Application    Ready    Uptime
uname1  app1           10       64
uname2  app1           3        117

Any suggestions?

EDIT: The good news is that the application can only be started once. BUT, the log does not take into account if the application crashed (though I suppose I could look for "exited" and "crashed" as final conditions).


I'd prefer to not join on the same table multiple times, especially if the table gets huge. This is kind of a two-pass approach. The first pass sorts the times into the right places, and the second pass folds them up by user and application:

SELECT 
  User,
  Application,
  MAX(StartTime) StartTime,
  MAX(ReadyTime) ReadyTime,
  MAX(ExitTime) ExitTime,
FROM (
  SELECT
    User,
    Application,
    CASE (
      WHEN Message = 'Started' THEN Date/Time
      ELSE NULL
    ) StartTime,
    CASE (
      WHEN Message = 'Loaded' THEN Date/Time
      ELSE NULL
    ) ReadyTime,
    CASE (
      WHEN Message = 'Exited' THEN Date/Time
      ELSE NULL
    ) ExitTime
  FROM Log
) Log
GROUP BY 
  User,
  Application

And from there it's trivial to calculate all you want on those different times.

It's not very "set-theory"-like, but grouping and aggregation never are. Like Eric's solution, it doesn't handle the situation when the same user has used the application multiple times. You'd need a third grouping column (like "session" or something) to handle that scenario.


How about

  Select S.user, S.Application,
        S.DateTime Started, L.DateTime Loaded, X.DateTime Exited,
        L.DateTime - S.DateTime LoadTime,
        X.DateTime - L.DateTime RunTime
  From LogFile S
      Full Join LogFile L
          On S.Message = 'Started'
             And L.Message = 'Loaded'
             And L.User = S.user
             And L.Application = S.Application
             And L.DateTime = (Select Min(DateTime)
                               From LogFile 
                               Where Message = 'Loaded'
                                 And application = S.Application
                                 And user = S.user
                                 And DateTime > S.DateTime)
      Full Join LogFile X
          On L.Message = 'Loaded'
             And X.Message = 'Exited'
             And X.User = L.user
             And X.Application = L.Application
             And  X.DateTime = (Select Min(DateTime)
                                From LogFile 
                                Where Message = 'Exited'
                                  And application = L.Application
                                  And user = L.user
                                  And DateTime > L.DateTime)

Then apply aggregate functions to this for what you want:

 Select user, Application, 
    Sum(LoadTime) TotLoadTime,
    Sum(RunTime) TotalRunTime
 From
      (Select S.user, S.Application,
        S.DateTime Started, L.DateTime Loaded, X.DateTime Exited,
        L.DateTime - S.DateTime LoadTime,
        X.DateTime - L.DateTime RunTime
       From LogFile S
          Full Join LogFile L
             On S.Message = 'Started'
                And X.Message = 'Loaded'
                 And L.User = S.user
                 And L.Application = S.Application
                 And L.DateTime =
                        (Select Min(DateTime)
                         From LogFile 
                         Where Message = 'Loaded'
                            And application = S.Application
                            And user = S.user
                            And DateTime > S.DateTime)
         Full Join LogFile X
             On L.Message = 'Loaded'
               And X.Message = 'Exited'
               And X.User = L.user
               And X.Application = L.Application
               And  X.DateTime =
                        (Select Min(DateTime)
                         From LogFile 
                         Where Message = 'Exited'
                             And application = L.Application
                             And user = L.user
                             And DateTime > L.DateTime)) Z
 Group By user, Application
0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号