
Converting irregular timestamp data to regularly spaced data using R

Posted on https://www.devze.com, 2023-02-17 09:15. Source: web.
In a database I have data with associated timestamps. The timestamps are irregularly spaced and have a resolution of up to one minute. I want to make this data uniform in R with respect to the timestamps (at seconds resolution), with each NA replaced by the previous value. Also, every timestamp should contain data for all the symbols. I have tried some time series packages for making the data uniform but have not been successful.

This is the code I have run so far:

library("RPostgreSQL")  
library(DBI)  
library(sqldf)  
drv <- dbDriver("PostgreSQL")  
ch <- dbConnect(drv, dbname="derivativesData", 
                     user="postgres", password="postgres")  
companyFrame <- dbGetQuery(ch, "select * from derData")  
companyFrame$trade_time  
  [1] "2011-06-01 09:00:00 IST" "2011-06-01 09:00:00 IST"  
  [3] "2011-06-01 09:00:00 IST" "2011-06-01 09:00:00 IST"  
  [5] "2011-06-01 09:00:00 IST" "2011-06-01 09:00:00 IST"  
  [7] "2011-06-01 09:00:00 IST" "2011-06-01 09:00:00 IST"  
  [9] "2011-06-01 09:00:00 IST" "2011-06-01 09:01:00 IST"  
 [11] "2011-06-01 09:01:00 IST" "2011-06-01 09:01:00 IST"  
 [13] "2011-06-01 09:02:00 IST" "2011-06-01 09:02:00 IST"  
 [15] "2011-06-01 09:02:00 IST" "2011-06-01 09:03:00 IST"  
 [17] "2011-06-01 09:04:00 IST" "2011-06-01 09:04:00 IST"  
 [19] "2011-06-01 09:05:00 IST" "2011-06-01 09:05:00 IST"  
 [21] "2011-06-01 09:06:00 IST" "2011-06-01 09:06:00 IST"  
 [23] "2011-06-01 09:06:00 IST" "2011-06-01 09:07:00 IST"  
 [25] "2011-06-01 09:08:00 IST" "2011-06-01 09:09:00 IST"  
 [27] "2011-06-01 09:10:00 IST" "2011-06-01 09:10:00 IST"  

I want to convert this data into a uniform format with, say, a 10-second resolution.


I will use a 10-minute resolution here, since your times don't have any seconds component.

With the following sample data :

R> time <- c("2011-06-01 09:00:00 IST", "2011-06-01 09:00:00 IST", "2011-06-01 09:01:00 IST", 
+            "2011-06-01 09:06:00 IST", "2011-06-01 09:10:00 IST", "2011-06-01 09:15:00 IST")

You can first convert the strings to a POSIXlt date format :

R> time2 <- strptime(time, format="%Y-%m-%d %X")
R> time2
[1] "2011-06-01 09:00:00" "2011-06-01 09:00:00" "2011-06-01 09:01:00"
[4] "2011-06-01 09:06:00" "2011-06-01 09:10:00" "2011-06-01 09:15:00"

Then you could use the minute function from the lubridate package to alter the minute component of your dates and round it down to a 10-minute resolution, for example:

R> library(lubridate)
R> minute(time2) <- minute(time2) %/% 10 * 10
R> time2
[1] "2011-06-01 09:00:00 CEST" "2011-06-01 09:00:00 CEST"
[3] "2011-06-01 09:00:00 CEST" "2011-06-01 09:00:00 CEST"
[5] "2011-06-01 09:10:00 CEST" "2011-06-01 09:10:00 CEST"
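
If you would rather not modify the minute component in place, the same rounding-down can be sketched in base R by flooring the epoch seconds to a multiple of the bin width. This is only a sketch: the names bin and floored are mine, and I am assuming the times can be treated as UTC so that the result does not depend on the machine's local time zone.

```r
# Sample timestamps from above, with the zone suffix dropped for parsing.
time <- c("2011-06-01 09:00:00", "2011-06-01 09:00:00", "2011-06-01 09:01:00",
          "2011-06-01 09:06:00", "2011-06-01 09:10:00", "2011-06-01 09:15:00")

# Assumption: interpret the times as UTC.
t <- as.POSIXct(time, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Bin width in seconds (600 = 10 minutes; use 10 for a 10-second grid).
bin <- 600

# Integer-divide the epoch seconds by the bin width, then scale back up.
floored <- as.POSIXct(floor(as.numeric(t) / bin) * bin,
                      origin = "1970-01-01", tz = "UTC")

format(floored, "%H:%M")
```

Changing bin is all that is needed to move to another resolution. lubridate also provides floor_date() for the same purpose.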


Try the data.table package and its roll=TRUE feature. See ?data.table and the vignettes, which discuss fast last observation carried forward.
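
A rough sketch of what that rolling join could look like is below. The table and column names (trades, symbol, trade_time, price) and the sample values are made up for illustration, since the question does not show the real schema. I am also assuming UTC and a 1-minute grid to keep the output short.

```r
library(data.table)

# Hypothetical sample data: two symbols traded at irregular times.
trades <- data.table(
  symbol     = c("A", "B", "A", "B", "A"),
  trade_time = as.POSIXct(c("2011-06-01 09:00:00", "2011-06-01 09:00:00",
                            "2011-06-01 09:01:00", "2011-06-01 09:02:00",
                            "2011-06-01 09:04:00"), tz = "UTC"),
  price      = c(100, 50, 101, 51, 102)
)

# Regular grid: every symbol at every time step.
# Use by = "10 sec" instead for a 10-second grid.
grid <- CJ(symbol     = unique(trades$symbol),
           trade_time = seq(min(trades$trade_time), max(trades$trade_time),
                            by = "1 min"))

# Rolling join: for each grid row, take the most recent trade of the same
# symbol at or before that time (last observation carried forward).
# mult = "last" keeps one row per grid point if a symbol has several
# trades with the same timestamp.
regular <- trades[grid, on = .(symbol, trade_time), roll = TRUE, mult = "last"]
regular
```

Each symbol then has a row at every grid time, and gaps are filled with the previous price. Grid times before a symbol's first trade would come back as NA, because roll=TRUE only carries values forward.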
