
Scanning an array in R

Developer https://www.devze.com 2023-01-14 18:16 Source: Internet
I use R and I have a long numeric vector. I would like to find all the maximal continuous subranges in this vector where all values are lower than some threshold.

For example, if the given vector is

5 5 6 6 7 5 4 4 4 3 2 1 1 1 2 3 4 5 6 7 6 5 4 3 2 2 3 4 4

and my threshold is 4 (i.e., values <= 3), then the values that meet this condition are marked with x:

0 0 0 0 0 0 0 0 0 x x x x x x x 0 0 0 0 0 0 0 x x x x 0 0

I would also like to return something like (10,16), (24,27). How do I do that?


To get the ranges, you can use rle (run-length encoding).

First, create the encoding and derive the start and end index of each run:

x <- c(5,5,6,6,7,5,4,4,4,3,2,1,1,1,2,3,4,5,6,7,6,5,4,3,2,2,3,4,4)
enc <- rle(x <= 3)  # run lengths of consecutive TRUE/FALSE values

enc.endidx <- cumsum(enc$lengths)               # ending index of each run
enc.startidx <- c(0, head(enc.endidx, -1)) + 1  # starting index of each run (head() also works when there is only one run)

# keep only the runs where the condition holds
data.frame(startidx=enc.startidx[enc$values], endidx=enc.endidx[enc$values])

That should give you

  startidx endidx
1       10     16
2       24     27
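
To see why this works, it may help to look at what rle returns for the example vector. The following is a small sketch; the variable names ends and starts are chosen here for illustration and are not part of the answer above.

```r
x <- c(5,5,6,6,7,5,4,4,4,3,2,1,1,1,2,3,4,5,6,7,6,5,4,3,2,2,3,4,4)
enc <- rle(x <= 3)

enc$lengths  # 9 7 7 4 2  (length of each run)
enc$values   # FALSE TRUE FALSE TRUE FALSE  (value repeated in each run)

# The cumulative sum of the run lengths gives the last index of each run;
# subtracting the run length and adding 1 gives the first index.
ends   <- cumsum(enc$lengths)      # 9 16 23 27 29
starts <- ends - enc$lengths + 1   # 1 10 17 24 28
```

Only the second and fourth runs have the value TRUE, which is how the ranges (10,16) and (24,27) come out.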


The answer to your first question is pretty straightforward:

x <- c(5,5,6,6,7,5,4,4,4,3,2,1,1,1,2,3,4,5,6,7,6,5,4,3,2,2,3,4,4)
y <- x<=3

y
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE
[13]  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
[25]  TRUE  TRUE  TRUE FALSE FALSE

as.numeric(y)
[1] 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 0 0 0 1 1 1 1 0 0

Getting the indices the way you want them is more difficult.
You can try which, as proposed by whatnick.
Another possibility is to use match, which returns the position of the first element that matches. So match(1,y) returns 10, the start of the first run. match(0,y[10:length(y)]) - 1 returns 7, the length of that run, so the run ends at 10 + 7 - 1 = 16. If you put this into a while-loop, you can get all the indices you want.
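
The while-loop idea could look roughly like the sketch below. The function name match_ranges is hypothetical (it is not a base R function), and the sketch assumes y is a logical vector without NA values.

```r
x <- c(5,5,6,6,7,5,4,4,4,3,2,1,1,1,2,3,4,5,6,7,6,5,4,3,2,2,3,4,4)
y <- x <= 3

# match_ranges is a hypothetical helper name, assumed here for illustration
match_ranges <- function(y) {
  n <- length(y)
  starts <- integer(0)
  ends <- integer(0)
  pos <- 1                          # position where the next search begins
  while (pos <= n) {
    s <- match(TRUE, y[pos:n])      # offset of the next TRUE, NA if none
    if (is.na(s)) break
    start <- pos + s - 1
    e <- match(FALSE, y[start:n])   # offset of the first FALSE after start
    end <- if (is.na(e)) n else start + e - 2
    starts <- c(starts, start)
    ends <- c(ends, end)
    pos <- end + 1                  # continue after the run just found
  }
  data.frame(start = starts, end = ends)
}

match_ranges(y)
```

For the example vector this gives the rows (10, 16) and (24, 27). Growing starts and ends with c() inside the loop is simple but can be slow on very long vectors with many runs.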


The function you need is "which". The syntax is indices <- which(vector <= 3). This gives you a vector of the indices where the value meets the condition. To isolate the transitions, you can take the differences between consecutive indices (for example with diff). Wherever the difference exceeds 1, you have a boundary between two ranges.
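
A minimal sketch of this approach, assuming the vector has no NA values. The function name which_ranges and its limit parameter are made up for this example.

```r
x <- c(5,5,6,6,7,5,4,4,4,3,2,1,1,1,2,3,4,5,6,7,6,5,4,3,2,2,3,4,4)

# which_ranges is a hypothetical helper name, assumed here for illustration
which_ranges <- function(x, limit) {
  idx <- which(x <= limit)          # indices that meet the condition
  if (length(idx) == 0) {
    return(data.frame(start = integer(0), end = integer(0)))
  }
  # positions in idx after which there is a gap, i.e. where a range ends
  breaks <- which(diff(idx) > 1)
  data.frame(start = idx[c(1, breaks + 1)],
             end   = idx[c(breaks, length(idx))])
}

which_ranges(x, 3)
```

Here idx is 10 to 16 followed by 24 to 27, the only gap is after its 7th element, and the result is the two ranges (10, 16) and (24, 27).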


I needed to do this too and this is what I'm using:

ranges <- function(b){ # b must be boolean
    b <- c(FALSE,b,FALSE)
    d <- b[-1]-b[-length(b)]
    return(data.frame(start=which(d==1),end=which(d==-1)-1))
}

In your example

x <- c(5,5,6,6,7,5,4,4,4,3,2,1,1,1,2,3,4,5,6,7,6,5,4,3,2,2,3,4,4)
ranges(x<=3)

produces

  start end
1    10  16
2    24  27