开发者

Time interval calculation in time series using SQL

开发者 https://www.devze.com 2023-02-12 13:11 出处:网络
I have a MySQL table like this CREATE TABLE IF NOT EXISTS `vals` ( `DT` datetime NOT NULL, `value` INT(11) NOT NULL,

I have a MySQL table like this

CREATE TABLE IF NOT EXISTS `vals` (
  `DT` datetime NOT NULL,
  `value` INT(11) NOT NULL,
  PRIMARY KEY (`DT`)
);

the DT is unique date with time

data sample:

INSERT INTO `vals` (`DT`,`value`) VALUES
('2011-02-05 06:05:00', 300),
('2011-02-05 11:05:00', 250),
('2011-02-05 14:35:00', 145),
('2011-02-05 16:45:00', 100),
('2011-02-05 18:50:00', 125),
('2011-02-05 19:25:00', 100),
('2011-02-05 21:10:00', 125),
('2011-02-06 00:30:00', 150);

I need to get something like this:

start|end|value
NULL,'2011-02-05 06:05:00',300
'2011-02-05 06:05:00','2011-02-05 11:05:00',250
'2011-02-05 11:05:00','2011-02-05 14:35:00',145
'2011-02-05 14:35:00','2011-02-05 16:45:00',100
'2011-02-05 16:45:00','2011-02-05 18:50:00',125
'2011-02-05 18:50:00','2011-02-05 19:25:00',100
'2011-02-05 19:25:00','2011-02-05 21:10:00',125
'2011-02-05 21:10:00','2011-02-06 00:30:00',150
'2011-02-06 00:30:00',NULL,NULL

I tried the following query:

SELECT T1.DT AS `start`,T2.DT AS `stop`, T2.value AS value FROM (
  SELECT DT FROM vals
) T1
LEFT JOIN (
  SELECT DT,value FROM  vals
) T2
ON T2.DT > T1.DT ORDER BY T1.DT ASC

but it returns to many rows (29 instead of 9) in result and I cold 开发者_StackOverflownot find any way to limit this using SQL. Is it Possible in MySQL?


Use a subquery

SELECT
  (
     select max(T1.DT)
     from vals T1
     where T1.DT < T2.DT
  ) AS `start`,
  T2.DT AS `stop`,
  T2.value AS value
FROM vals T2
ORDER BY T2.DT ASC

You can also use a MySQL specific solution employing variables

SELECT CAST( @dt AS DATETIME ) AS `start` , @dt := DT AS `stop` , `value` 
FROM (SELECT @dt := NULL) dt, vals
ORDER BY dt ASC

But you need to do it precisely

  • the ORDER by must be present otherwise the variables don't roll properly
  • the variable needs to be NULLified within the query using a subquery to set it, otherwise if you run it twice in a row, the 2nd time it will not start with NULL


You can use a server-side variable to simulate it:

select @myvar as start, end, value, @myvar := end as next_rows_start
from vals

Variables are interpreted from left-right in sequence, so the two references to @myvar (start and next_rows_start) will output with two different values.

Just remember to reset @myvar to null before and/or after the query, otherwise the second and subsequent runs will have a wrong first row:

select @myvar := null


This would be easier if the table had a running ID column which corresponds to the times in DT (same order). If you don't want to change the table you can use a temp:

drop table if exists temp;

CREATE TABLE temp (
  `id` INT(11) AUTO_INCREMENT,
  `DT` datetime NOT NULL,
  `value` INT(11) NOT NULL,
  PRIMARY KEY (`id`)
);

insert into temp (DT,value) select * from vals order by DT asc;

select t1.DT as `start`, t2.DT as `end`, t2.value 
from  temp t2
left join temp t1 ON t2.id = t1.id + 1;
0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号