Apply ksmooth to time series

Developer https://www.devze.com 2023-01-27 08:55 Source: Internet
I have the following problem:

I have a data frame "test" that looks more or less like this:

Date         return     price      vol   
20100902     0.3        15         8.5
20100902     0.4        17         8.6
20100902     0.6        19         8.7
.....
20100903     0.2        13         8.2
20100903     0.4        17         8.6
20100903     0.8        21         9.0
.....

So I have given values for each date (10 per day). What I would like to do now is apply ksmooth() to each date separately, e.g. ksmooth(return, price, n.points = 50) for each date. This should give me 50 observations per date. In addition, I would like a time stamp for the interpolated values. So the resulting frame should look like this:

Date         return     price         
20100620     0.3        15  
20100620     0.31       15.2
20100620     0.32       15.3 
20100620     0.4        17         
20100620     0.6        19        
.....
20100621     0.2        13
20100621     0.21       13.1
20100621     0.22       13.2
20100621     0.4        17         
20100621     0.8        21     
etc.

with 50 observations per day. So here is what I'm looking for: take the first 10 observations (e.g. date 1 = 20100620), interpolate, and put a time stamp on the interpolated values (20100620). Then take the second 10 observations (date = 20100621), interpolate, and put a time stamp on the interpolated values (20100621), and so on.

I'm quite new to R, but this is what I tried. I thought of using the zoo() function to do that. Before implementing anything, I wanted to make my date entries unique, so I just added hours to each entry:

test <- read.zoo("test.txt", format = "%Y%m%d")
test <- zoo(test, as.POSIXct(time(test)) + 1:26)

There probably is something wrong with that, because R complained. Then I thought of using the rollapply() function.

roll.test <- rollapply(test, 10, FUN = function(x,y) ksmooth(test$return,
  test$price, "normal", bandwidth = 20, n.points = 50) )

Unfortunately, the result is very confusing, and the by.column = FALSE argument does not work.

I would very much appreciate some help. It does not have to build upon my "trial version" at all. Thank you very much, Dani

My data looks like this:

"date" "days" "return" "price" 
"66" 20100620 91 0.18 1389.373 
"67" 20100620 91 0.19 1370.57 
"68" 20100620 91 0.19 1353.122 
"69" 20100620 91 0.19 1336.291 
"70" 20100620 91 0.20 1319.774 
"71" 20100620 91 0.20 1303.341 
"72" 20100620 91 0.21 1286.656 
"326" 20100621 91 0.18 1386.28 
"327" 20100621 91 0.18 1367.694 
"328" 20100621 91 0.19 1350.375 
"329" 20100621 91 0.19 1333.615 
"330" 20100621 91 0.20 1317.164 
"331" 20100621 91 0.20 1300.783 
"332" 20100621 91 0.21 1284.113 


The problem is that ksmooth() returns a list, and rollapply() stores those lists as they are. By the way, I don't think you want rollapply() at all: it does not process each date separately but "rolls" a window over the data frame. From your explanation, I believe that is not the desired behaviour.
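To make the point about the return value concrete, here is a minimal sketch using only base R. The three (x, y) pairs are taken from the first date of the example data; the variable names `s` and `df` are just for illustration.

```r
# Three observations from one date of the example data
x <- c(0.3, 0.4, 0.6)   # return
y <- c(15, 17, 19)      # price

# ksmooth() returns a list with components x and y, not a vector or matrix
s <- ksmooth(x, y, kernel = "normal", bandwidth = 2, n.points = 50)
str(s)

# Converting the list to a data frame gives one row per evaluation point
df <- as.data.frame(s)
nrow(df)
```

Because each call yields a whole list of 50 points rather than a single number, a per-date split followed by a row-wise bind is a more natural fit than a rolling window.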

I couldn't really work it out using a zoo object, as that one is quite restrictive. Maybe somebody else will show you how. You can construct that data frame using the ddply() function from the plyr package:

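library(plyr)
# One smoothed data frame (columns x, y) per date; ddply adds the Date column back
tt <- ddply(test, .(Date),
  function(x) {
       as.data.frame(ksmooth(x$return, x$price, "normal", bandwidth=2, n.points=50))
  })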

tt can then be transformed to a zoo object, using

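library(zoo)
tt2 <- zoo(tt[, names(tt) != "Date"], as.POSIXct(tt$Date) + 1:50)  # numeric columns only; + 1:50 adds seconds so each index is unique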
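The `+ 1:50` offset relies on every date having exactly 50 rows. As a hedged alternative, the sketch below builds a within-date counter with base R's ave(), so it does not hard-code the group size. The vector `d` and the names `k` and `stamp` are made up for the illustration, and the time zone is fixed to UTC as an assumption.

```r
# Toy date vector: two dates, three rows each
d <- as.Date(rep(c("2010-09-02", "2010-09-03"), each = 3))

# Running counter 1, 2, 3, ... that restarts within each date
k <- ave(seq_along(d), format(d), FUN = seq_along)

# Adding a number to a POSIXct value adds seconds,
# so each row gets a distinct time stamp on its own day
stamp <- as.POSIXct(format(d), tz = "UTC") + k
stamp
```

The resulting `stamp` vector could then serve as the index when building the zoo object.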

Alternatively, you could do it by hand using a bit of list manipulation. Again, the resulting tt can be converted to a zoo object with the line above.

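# Split the data frame into a list with one element per date
tt <- split(test,test$Date)

# Smooth each date separately; each result has columns x and y
tt <- lapply(tt,function(x){
        as.data.frame(ksmooth(x$return,x$price,"normal",bandwidth=2,n.points=50))
      })

# Stack the pieces; row names become "<date>.<row number>"
tt <- do.call(rbind,tt)
names(tt) <- c("return","price")
# Recover the date by stripping the ".<row number>" suffix from the row names
tt$Date <- as.Date(gsub("\\.\\d+","",rownames(tt)))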
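A variant of the same split/lapply idea attaches the date inside the per-group function, which avoids parsing it back out of the row names. This is only a sketch: `smooth_one` and `out` are hypothetical names, the toy data frame reproduces the six example rows, and the bandwidth of 2 is carried over from the code above rather than tuned.

```r
# Toy data: the six example rows, with Date already converted
test <- data.frame(
  Date   = as.Date(rep(c("2010-09-02", "2010-09-03"), each = 3)),
  return = c(0.3, 0.4, 0.6, 0.2, 0.4, 0.8),
  price  = c(15, 17, 19, 13, 17, 21)
)

# Hypothetical helper: smooth one date's rows and keep the date alongside
smooth_one <- function(d) {
  s <- ksmooth(d$return, d$price, kernel = "normal",
               bandwidth = 2, n.points = 50)
  data.frame(Date = d$Date[1], return = s$x, price = s$y)
}

# Apply per date and stack the results
out <- do.call(rbind, lapply(split(test, test$Date), smooth_one))
rownames(out) <- NULL

# Each date should now contribute 50 rows
table(out$Date)
```

Keeping the date as a proper column from the start also means the result does not depend on how rbind() happens to name the rows.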

Mind you, I used read.table() to construct test:

zz <- textConnection(
"Date    ,     return ,    price  ,    vol
20100902 ,    0.3  ,      15   ,      8.5
20100902 ,    0.4  ,      17   ,      8.6
20100902 ,    0.6  ,      19   ,      8.7
20100903 ,    0.2  ,      13   ,      8.2
20100903 ,    0.4  ,      17   ,      8.6
20100903 ,    0.8  ,      21   ,      9.0"
)
test <- read.table(zz,header=TRUE,sep=",")
test$Date <- as.Date(as.character(test$Date),format="%Y%m%d")
close(zz)