Understanding array indexing in R

Developer https://www.devze.com 2023-01-16 18:28 Source: Internet


Related topics: arrays r

The R documentation on arrays states:

The values in the data vector give the values in the array in the same order as they would occur in FORTRAN, that is "column major order," with the first subscript moving fastest and the last subscript slowest.

It then later gives a clarifying example of this by loading data into a two dimensional array:

 > x <- array(1:20, dim=c(4,5))   # Generate a 4 by 5 array.
 > x
      [,1] [,2] [,3] [,4] [,5]
 [1,]    1    5    9   13   17
 [2,]    2    6   10   14   18
 [3,]    3    7   11   15   19
 [4,]    4    8   12   16   20

Coming from other languages, I would have expected x[1, 2], rather than x[2, 1], to be 2, but that is easy enough to adjust to. However, no sooner have I shifted my mental model than the next example smashes it apart:

 > i <- array(c(1:3,3:1), dim=c(3,2))
 > i                             # i is a 3 by 2 index array.
      [,1] [,2]
 [1,]    1    3
 [2,]    2    2
 [3,]    3    1
 > x[i]                          # Extract those elements
 [1] 9 6 3

So, I can see that what happened here is that we extracted elements x[1, 3], x[2, 2], and x[3, 1]. Okay, but doesn't this run completely counter to the above claim of "column major order"?

From what I had understood, i should have been a 2 by 3 array, and R should have interpreted x[i] as x[i[1, 1], i[2, 1]], x[i[1, 2], i[2, 2]], .... However, what we observe is that, instead, R did x[i[1, 1], i[1, 2]], x[i[2, 1], i[2, 2]], ...

Is this a fundamental inconsistency in R, or have I completely misunderstood the documentation?

"Column major order" only means that, internally, matrices are vectors stored column by column; note that

x[1:20]

is 1, 2, 3, ..., 19, 20. Dimensions are ordered as usual in science: first rows, then columns, then depth, then hyperdepth, and so on. The third example is tricky: if i is a matrix with as many columns as x has dimensions, each row of i is interpreted as the coordinates of one element (vectorized selection). If not, i is flattened to a by-column vector, x is treated as its underlying by-column vector, and simple vector indexing rules apply; for instance, x[t(i)] is (1:20)[c(1,3,2,2,3,1)]. (The parentheses matter: [ binds more tightly than : in R.)
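
Both cases can be checked directly in a session. This is a minimal sketch that reuses x and i from the question; the names by_coords and by_vector are introduced here only for illustration.

```r
x <- array(1:20, dim = c(4, 5))          # filled column by column
i <- array(c(1:3, 3:1), dim = c(3, 2))   # 3 rows, 2 columns

# Plain vector indexing walks the underlying data in column-major order.
identical(x[1:20], 1:20)                 # TRUE

# i has 2 columns and x has 2 dimensions, so each row of i is one
# (row, column) coordinate: x[1, 3], x[2, 2], x[3, 1].
by_coords <- x[i]
by_coords                                # 9 6 3

# t(i) has 3 columns, which does not match the 2 dimensions of x, so it is
# treated as the plain vector c(1, 3, 2, 2, 3, 1).
by_vector <- x[t(i)]
by_vector                                # 1 3 2 2 3 1
```

Only the shape of the index changes between the last two calls, which is what switches R from coordinate lookup to ordinary vector indexing.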

It's a different thing altogether. As you note, the 2-column matrix is used as coordinates to get elements of x. It's an almost arbitrary design choice whether you use 2-column or 2-row matrices to do this; it's not related to the way matrices are filled.

The R guys chose 2-column.
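
Column-major storage does show up once each coordinate row is converted to a position in the underlying vector by hand. The sketch below assumes the usual column-major offset formula for a matrix, row + (col - 1) * nrow; the name linear is hypothetical, and x and i are the arrays from the question.

```r
x <- array(1:20, dim = c(4, 5))
i <- array(c(1:3, 3:1), dim = c(3, 2))

# Column-major position of element (row, col) in a matrix with nrow(x) rows.
linear <- i[, 1] + (i[, 2] - 1) * nrow(x)
linear                                   # 9 6 3

# Indexing with those positions picks the same elements as the index matrix.
identical(x[linear], x[i])               # TRUE
```

Because x holds 1:20, each element equals its own position in the underlying vector, which is why the positions and the extracted values are both 9, 6, 3 here.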