R lag over missing data

Developer https://www.devze.com 2022-12-10 09:21 Source: web
Is there a variant of lag somewhere that keeps NAs in position? I want to compute returns of price data where data could be missing.

Col 1 is the price data.
Col 2 is the lag of price.
Col 3 shows p - lag(p). The return from 99 to 104 is effectively missed, so the path length of the computed returns will differ from the true one.
Col 4 shows the lag with the NA position preserved.
Col 5 shows the new difference. Now the return of 5 for 2009-11-07 is available.

Cheers, Dave

library(xts)

x <- xts(c(100, 101, 97, 95, 99, NA, 104, 103, 103, 100), as.Date("2009-11-01") + 0:9)

# fake the lag I want, with NA kept in position
x.pos.lag <- lag.xts(x)
x.pos.lag['2009-11-07'] <- 99
x.pos.lag['2009-11-06'] <- NA

cbind(x, lag.xts(x), x - lag.xts(x), x.pos.lag, x-x.pos.lag)
           ..1 ..2 ..3 ..4 ..5
2009-11-01 100  NA  NA  NA  NA
2009-11-02 101 100   1 100   1
2009-11-03  97 101  -4 101  -4
2009-11-04  95  97  -2  97  -2
2009-11-05  99  95   4  95   4
2009-11-06  NA  99  NA  NA  NA
2009-11-07 104  NA  NA  99   5
2009-11-08 103 104  -1 104  -1
2009-11-09 103 103   0 103   0
2009-11-10 100 103  -3 103  -3


There are no functions to do that natively in R, but you can create an index of the original NA positions and then swap the values there after the lag.

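library(xts)

x <- xts(c(100, 101, 97, 95, 99, NA, 104, 103, 103, 100), as.Date("2009-11-01") + 0:9)

# Lag x while keeping each NA on its original date.
# Assumes a one-period lag, isolated (non-consecutive) NAs, and no NA in the last row.
lag.xts.na <- function(x, ...) {
    na.idx <- which(is.na(x))          # rows where the price is missing
    x2 <- lag.xts(x, ...)              # plain lag: the gap moves down one row
    x2[na.idx + 1, ] <- x2[na.idx, ]   # row after the gap gets the last observed price
    x2[na.idx, ] <- NA                 # put the NA back on its original date
    return(x2)
}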

lag.xts.na(x)
           [,1]
2009-11-01   NA
2009-11-02  100
2009-11-03  101
2009-11-04   97
2009-11-05   95
2009-11-06   NA
2009-11-07   99
2009-11-08  104
2009-11-09  103
2009-11-10  103
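
The index-swap function above moves values by exactly one row, so it is aimed at isolated gaps. If runs of consecutive NAs are possible, one alternative is to carry the last observed price forward before lagging and then blank out the original NA rows. This is only a sketch of that different technique, not the answer's method: lag_keep_na is a hypothetical helper name, and it assumes a single-column series and a one-period lag.

```r
library(xts)  # also attaches zoo, which provides na.locf()

x <- xts(c(100, 101, 97, 95, 99, NA, 104, 103, 103, 100),
         as.Date("2009-11-01") + 0:9)

# Hypothetical helper (not part of xts): lag by one period, carrying the last
# observed value across gaps and keeping every NA on its original date.
# Assumes a single-column series.
lag_keep_na <- function(x) {
    filled <- na.locf(x, na.rm = FALSE)   # carry the last observed price into each gap
    lagged <- lag.xts(filled)             # ordinary one-period lag of the filled series
    na.idx <- which(is.na(x))             # rows that were missing in the original
    if (length(na.idx) > 0) {
        lagged[na.idx, ] <- NA            # restore the NAs at their original dates
    }
    lagged
}

x.lag <- lag_keep_na(x)
ret <- x - x.lag   # NA on 2009-11-06, 5 on 2009-11-07
```

Because the gap is filled before the lag, the row after a run of NAs still sees the last observed price, however long the run is.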

Incidentally, are you just trying to deal with weekends/holidays or something along that line? If so, you might consider dropping those positions from your series; that will dramatically simplify things for you. Alternatively, the timeSeries package in Rmetrics has a number of functions to deal with business days.
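
As a rough illustration of the "drop those positions" suggestion, the sketch below removes the missing rows, differences the remaining prices, and then merges the result back onto the original dates. The variable names x.obs, ret.obs and ret are made up for this example, and it assumes simple price differences are the returns you want.

```r
library(xts)

x <- xts(c(100, 101, 97, 95, 99, NA, 104, 103, 103, 100),
         as.Date("2009-11-01") + 0:9)

x.obs   <- na.omit(x)        # drop the rows with missing prices (2009-11-06 here)
ret.obs <- diff(x.obs)       # change between consecutive observed prices
ret     <- merge(x, ret.obs) # realign to the original dates; 2009-11-06 stays NA

ret.obs["2009-11-07"]        # the 99 -> 104 move shows up as 5
```

The merge step is optional. It only matters if the returns have to line up with the original index, including the dates where the price was missing.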
