(Apologies for the title of this question - I wasn't overly sure how to explain it)
Not sure whether this can be done in SQL. Below is a (somewhat truncated) sample of an event log table.
EVENT ID DATE TIME
--------- ---------- -------- ----
ONE_THING 0006241800 20091109 1719
ONE_THING 0006944800 20091109 1720
ANOTHER 0007517110 20091109 1721
ANOTHER 0007214240 20091109 1721
ANOTHER 0006907900 20091109 1725
ANOTHER 0006501580 20091109 1727
ONE_THING 0006944800 20091109 1737
ANOTHER 0005749820 20091109 1737
ANOTHER 0006810500 20091109 1738
ANOTHER 0007481970 20091109 1738
ANOTHER 0006331740 20091109 1739
ANOTHER 0007253840 20091109 1739
ANOTHER 0006929280 20091109 1747
ANOTHER 0007297950 20091109 1749
ANOTHER 0005055560 20091109 1751
ANOTHER 0006092320 20091109 1751
ONE_THING 0001668720 20091109 1753
ONE_THING 0007218000 20091109 1754
I'm looking to find groups of instances of ANOTHER where no other event has occurred in the group and for time periods in excess of 2 minutes.
So, in the above dataset, the first group would be:
ANOTHER 0007517110 20091109 1721
ANOTHER 0007214240 20091109 1721
ANOTHER 0006907900 20091109 1725
ANOTHER 0006501580 20091109 1727
and the second would be:
ANOTHER 0005749820 20091109 1737
ANOTHER 0006810500 20091109 1738
ANOTHER 0007481970 20091109 1738
ANOTHER 0006331740 20091109 1739
ANOTHER 0007253840 20091109 1739
ANOTHER 0006929280 20091109 1747
ANOTHER 0007297950 20091109 1749
ANOTHER 0005055560 20091109 1751
ANOTHER 0006092320 20091109 1751
And ideally I'd like to get:
ANOTHER 0007517110 20091109 1721
ANOTHER 0006501580 20091109 1727
and:
ANOTHER 0005749820 20091109 1737
ANOTHER 0006092320 20091109 1751
Or even better still:
EVENT 开发者_运维技巧 DATE TIME_START TIME_END
--------- -------- ---------- --------
ANOTHER 20091109 1721 1727
ANOTHER 20091109 1737 1751
I thought about comparing rows but maybe there's a better way? I'd appreciate any tips for this. The solution just needs to work - it doesn't need to be necessarily fancy or elegant.
PS> I'm using Oracle.
this should work:
SQL> SELECT event, MIN(dt), MAX(dt) FROM (
2 SELECT event, dt,
3 SUM(discontinuity) over(ORDER BY dt, event) continuous_group
4 FROM (SELECT event, dt,
5 CASE
6 WHEN lag(event) over(ORDER BY dt, event) = event THEN
7 0
8 ELSE
9 1
10 END discontinuity
11 FROM DATA)
12 )
13 WHERE event = 'ANOTHER'
14 GROUP BY event, continuous_group;
EVENT MIN(DT) MAX(DT)
--------- ------------- -------------
ANOTHER 20091109 1738 20091109 1751
ANOTHER 20091109 1721 20091109 1737
Note: the events at 17:37 are synchronous and my query put the ANOTHER event into the first set arbitrarily. You can control this behaviour with the ORDER BY clause of the analytic function.
This is extending Vincent's answer to include the requirement that the group must be at least 2 minutes in length:
select event, tm_start, tm_stop
from (select event, min(when) tm_start, max(when) tm_stop
from (select event,
when,
sum(discontinuity) over(order by when, event) continuous_group
from (select event,
when,
case
when lag(event)
over(order by when, event) = event then
0
else
1
end discontinuity
from temp_stack ts))
where event = 'ANOTHER'
group by event, continuous_group)
where tm_stop - numtodsinterval(2, 'MINUTE') > tm_start;
SELECT *
FROM (
SELECT m.*, LEAD(event) OVER (ORDER BY date, time) AS ne, LAG(event) OVER (ORDER BY date, time) AS pe
FROM mytable m
)
WHERE event = 'ANOTHER'
AND (ne <> event OR pe <> event)
This should get you started. Note that if you have multiple events with the same date and time (as you do in your sample) this is non-deterministic. You could add ID to the ORDER BY clauses as a tiebreaker if you think that makes sense.
SELECT * FROM (
SELECT event, id, date, time,
lag(event) over (order by date, time) previous_event,
lead(event) over (order by date, time) next_event
)
WHERE event='ANOTHER'
AND ( event <> previous_event OR event <> next_event )
ORDER BY date, time
may be a little late :)
SELECT
event,
min(dt) as dt_begin, max(dt) as dt_end
FROM
(
select
t.*,
row_number()over(order by dt,rownum) -
row_number()over(partition by event order by dt,rownum) as group_id
from vvp_tmp t
--order by dt
)
GROUP BY group_id,event
HAVING 24*60*(max(dt)-min(dt))>=2
ORDER BY dt_begin
精彩评论