I have the following data set, which shows the start and end of each episode (date and time):
ep <- data.frame(start=c("2009-07-13 23:45:00", "2009-08-14 08:30:00",
"2009-09-16 15:30:00"),
end=c("2009-07-14 00:03:00", "2009-08-15 08:35:00",
"2009-09-19 07:30:00"))
I need to convert it into a data frame that shows, for each calendar day, how many minutes of episodes there were. For the above example it would be:
2009-07-13 15
2009-07-14 3
2009-08-14 930
2009-08-15 515
2009-09-16 510
2009-09-17 1440
2009-09-18 1440
2009-09-19 450
I appreciate any help
This works, but seems slightly inelegant. First, create a vector that is a sequence of times between each start and end time by minutes:
tmp <- do.call(c, apply(ep, 1,
function(x) head(seq(from = as.POSIXct(x[1]),
to = as.POSIXct(x[2]),by = "mins"),
-1)))
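We use head(...., -1) to remove the last minute from each sequence. seq() includes both end points, so without this each episode would be counted as one minute longer than it really is and the minutes would not match what you wanted.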
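To see why the last element has to go, here is a small check on the first episode alone. This is only an illustrative sketch; the variable names and the explicit tz = "UTC" are my own assumptions, not part of the data above.

```r
# First episode only; tz = "UTC" is an assumption to keep the example reproducible
s <- seq(from = as.POSIXct("2009-07-13 23:45:00", tz = "UTC"),
         to   = as.POSIXct("2009-07-14 00:03:00", tz = "UTC"),
         by   = "mins")

n_with_end    <- length(s)            # 19: both 23:45 and 00:03 are included
n_without_end <- length(head(s, -1))  # 18: the actual duration in minutes
```

The 18 remaining time stamps are the start of each minute in the episode, 15 of them on 2009-07-13 and 3 on 2009-07-14.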
Next, split this vector into minutes occurring on individual days, and count how many minutes there are per day:
tmp <- sapply(split(tmp, format(tmp, format = "%Y-%m-%d")), length)
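Note that we can't just use as.Date(tmp) to get a vector of dates. This is probably time-zone related: as far as I can tell, as.Date() on a date-time converts in UTC by default rather than in the time zone of the data, so times near midnight can land on the wrong day. Instead we need to explicitly format the times to show only the date parts.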
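A small sketch of the kind of mismatch I mean. The time zone "America/New_York" is purely an assumption for illustration (the question does not say where the data were recorded), and I pass tz explicitly to as.Date() so the behaviour does not depend on defaults:

```r
# Hypothetical time zone, chosen only to show the effect near midnight
x <- as.POSIXct("2009-07-13 23:45:00", tz = "America/New_York")

local_day <- format(x, "%Y-%m-%d")                     # "2009-07-13"
utc_day   <- format(as.Date(x, tz = "UTC"))            # "2009-07-14"
fixed_day <- format(as.Date(x, tz = "America/New_York"))  # "2009-07-13"
```

So if you prefer as.Date(), passing the matching tz should give the same days as format() does.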
The final step is to arrange the tmp object, which contains everything we need, into the format you requested:
mins <- data.frame(Date = names(tmp), Minutes = tmp, row.names = NULL)
This gives:
> mins
Date Minutes
1 2009-07-13 15
2 2009-07-14 3
3 2009-08-14 930
4 2009-08-15 515
5 2009-09-16 510
6 2009-09-17 1440
7 2009-09-18 1440
8 2009-09-19 450
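If you need this more than once, the steps can be wrapped into a function. This is just a sketch under my own assumptions: the name minutes_per_day and the tz argument are made up for illustration, and I use lapply() over the row indices and table() for the counting instead of apply() and sapply(split(...)).

```r
# Hypothetical helper; the name and the tz argument are assumptions for illustration
minutes_per_day <- function(ep, tz = "UTC") {
  start <- as.POSIXct(as.character(ep$start), tz = tz)
  end   <- as.POSIXct(as.character(ep$end), tz = tz)

  # One minute-by-minute sequence per episode, dropping the end point
  per_episode <- lapply(seq_along(start), function(i) {
    head(seq(from = start[i], to = end[i], by = "mins"), -1)
  })
  minutes <- do.call(c, per_episode)

  # Count the minutes falling on each calendar day (in time zone tz)
  counts <- table(format(minutes, "%Y-%m-%d", tz = tz))
  data.frame(Date = names(counts), Minutes = as.vector(counts))
}

ep <- data.frame(start = c("2009-07-13 23:45:00", "2009-08-14 08:30:00",
                           "2009-09-16 15:30:00"),
                 end   = c("2009-07-14 00:03:00", "2009-08-15 08:35:00",
                           "2009-09-19 07:30:00"))
mins <- minutes_per_day(ep)
```

Like the original, this enumerates every minute, so it is simple rather than fast. For very long episodes it would be cheaper to work out the overlap with each day arithmetically.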