
Getting the row index for a 2D NumPy array when multiple column values are known

Developer https://www.devze.com 2023-02-08 14:37 Source: Internet

Suppose I have a 2D NumPy array such as:

a = [ [1, 2, 3], [4, 5, 6], [7, 8, 9] ]

How do I find the index of the row for which I know multiple values? For example, if it is known that the 0th column is 2 and the 1st column is 5, I would like to know the row index where this condition is met (row 1 in this case).

In my application, the first two columns are (x,y) coordinates, and the third column is information about that coordinate. I am trying to find particular coordinates in a list so I can change the value in the third column.

EDIT: To clarify, here is a non-square example:

a = [ [1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12], [13, 14, 15, 16, 17, 18] ]

Suppose I know the row I am looking for has 13 in the 0th column, and 14 in the 1st column. I would like to return the index of that row. In this case, I would like to return the index 2 (2nd row).

Or better yet, I would like to edit the 4th column of the row that has 13 in the 0th column and 14 in the 1st column. Here is a solution I found to the case I have described (changing the value to 999):

a[(a[:,0]==13) & (a[:,1]==14), 3] = 999

gives:

a = [ [1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12], [13, 14, 15, 999, 17, 18] ]
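If the row index itself is wanted rather than an in-place assignment, the same boolean mask can be handed to np.flatnonzero. This is only a minimal sketch: it assumes a is a real np.ndarray (the nested list shown above would need np.array(...) first), and the names mask and rows are illustrative, not part of the original post.

```python
import numpy as np

a = np.array([[1, 2, 3, 4, 5, 6],
              [7, 8, 9, 10, 11, 12],
              [13, 14, 15, 16, 17, 18]])

# True for rows whose column 0 is 13 and whose column 1 is 14.
mask = (a[:, 0] == 13) & (a[:, 1] == 14)

# Indices of the matching rows (an empty array if nothing matches).
rows = np.flatnonzero(mask)
print(rows.tolist())  # [2]

# The same mask drives the assignment shown above.
a[mask, 3] = 999
print(a[2].tolist())  # [13, 14, 15, 999, 17, 18]
```

The parentheses around each comparison matter, because & binds more tightly than == in Python.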

I'm sorry if this was unclear. Could someone point out in my original post (above the edit) how this could be interpreted differently, because I am having trouble seeing it.

Thanks.

EDIT 2: Fixed mistake in first edit (shown in bold)

I can now see how I made this whole thing confusing for everyone. The solution to my problem is well described in condition b) of eat's solution. Thank you.
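For the (x, y) coordinate use case described earlier, one possible variant compares the first two columns against the target pair in a single step. A rough sketch under assumptions: the array pts and its values are made up for illustration, and the third column stands in for the "information about that coordinate".

```python
import numpy as np

# Hypothetical data: columns are x, y, value.
pts = np.array([[0, 0, 10],
                [2, 5, 20],
                [5, 2, 30]])

# Compare the (x, y) columns with the target pair; a row matches only
# if both of its coordinates are equal to the target.
match = (pts[:, :2] == [2, 5]).all(axis=1)

print(np.flatnonzero(match).tolist())  # [1]

# Update the third column of the matching row(s).
pts[match, 2] = 99
print(pts[:, 2].tolist())  # [10, 99, 30]
```

Note that the row [5, 2, 30] is not matched, because the comparison is positional.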


In [80]: a = np.array([ [1, 2, 3], [4, 5, 6], [7, 8, 9] ])
In [81]: a
Out[81]: 
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

a==2 returns a boolean numpy array, showing where the condition is True:

In [82]: a==2
Out[82]: 
array([[False,  True, False],
       [False, False, False],
       [False, False, False]], dtype=bool)

You can find any columns where this is True by using np.any(...,axis=0):

In [83]: np.any(a==2,axis=0)
Out[83]: array([False,  True, False], dtype=bool)

In [84]: np.any(a==5,axis=0)
Out[84]: array([False,  True, False], dtype=bool)

You can find where both conditions are simultaneously true by using &:

In [85]: np.any(a==2,axis=0) & np.any(a==5,axis=0)
Out[85]: array([False,  True, False], dtype=bool)

Finally, you can find the index of the columns where the conditions are simultaneously True using np.where:

In [86]: np.where(np.any(a==2,axis=0) & np.any(a==5,axis=0))
Out[86]: (array([1]),)
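The choice of axis decides whether the result has one flag per column or one per row, which is easy to mix up. A small sketch of the difference, using the same array as above (the variable names are illustrative only):

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# axis=0 reduces over the rows: one flag per column.
cols_with_2 = np.any(a == 2, axis=0)
# axis=1 reduces over the columns: one flag per row.
rows_with_2 = np.any(a == 2, axis=1)

print(cols_with_2.tolist())  # [False, True, False]
print(rows_with_2.tolist())  # [True, False, False]

# Columns that contain a 2 somewhere and a 5 somewhere.
both = np.where(np.any(a == 2, axis=0) & np.any(a == 5, axis=0))[0]
print(both.tolist())  # [1]
```

This approach only checks that the values occur somewhere in a column; it does not check which position they occupy.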


Here are ways to handle conditions on columns or rows, inspired by the Zen of Python.

In []: import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
...

So following the second advice:
a) conditions on column(s), applied to row(s):

In []: from numpy import arange, logical_and
In []: a= arange(12).reshape(3, 4)
In []: a
Out[]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
In []: a[2, logical_and(1== a[0, :], 5== a[1, :])]+= 12
In []: a
Out[]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8, 21, 10, 11]])

b) conditions on row(s), applied to column(s):

In []: a= a.T
In []: a
Out[]:
array([[ 0,  4,  8],
       [ 1,  5, 21],
       [ 2,  6, 10],
       [ 3,  7, 11]])
In []: a[logical_and(1== a[:, 0], 5== a[:, 1]), 2]+= 12
In []: a
Out[]:
array([[ 0,  4,  8],
       [ 1,  5, 33],
       [ 2,  6, 10],
       [ 3,  7, 11]])

So I hope it makes sense to always be explicit when accessing columns and rows. Code is typically read by people with various backgrounds.
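For reference, case b) can also be written as a standalone script with the np. prefix instead of bare names. This is a sketch rather than the exact session above: it starts from a fresh array, so the updated entry becomes 21 here (9 + 12) instead of 33.

```python
import numpy as np

# Same layout as in case b): shape (4, 3) after the transpose.
a = np.arange(12).reshape(3, 4).T

# Rows where column 0 equals 1 and column 1 equals 5.
mask = np.logical_and(a[:, 0] == 1, a[:, 1] == 5)

# Add 12 to column 2 of every matching row.
a[mask, 2] += 12
print(a[1].tolist())  # [1, 5, 21]
```

np.logical_and(x, y) and (x) & (y) give the same result for boolean arrays; the function form avoids the operator-precedence pitfall.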


Doing

np.where(np.any(a==2,axis=0) & np.any(a==5,axis=0))

as unutbu suggested will not use the information that 2 is in the 0th column, and 5 is in the 1st. So, for a = np.array([[5, 2, 3], [2, 5, 6], [7, 8, 9]]), it will mistakenly return (array([0, 1]),)

Instead, you can use

np.where((a[0]==2) & (a[1]==5))

to get the correct result (array([1]),).

Furthermore, if you want to edit the entry at index 2 for that match, you can skip the np.where and just reference it with: a[2][(a[0]==2) & (a[1]==5)]. This also works for assignments, for example a[2][(a[0]==2) & (a[1]==5)] = 11. Note that a[0] and a[1] select rows here; if the known values are in columns, use a[:, 0] and a[:, 1] instead.
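The chained form a[2][mask] = 11 works because a[2] is a view into a. Reversing the order does not, since boolean indexing returns a copy. A short sketch of the difference, with illustrative names (mask, b, row_mask) that are not from the answer:

```python
import numpy as np

a = np.array([[5, 2, 3], [2, 5, 6], [7, 8, 9]])
mask = (a[0] == 2) & (a[1] == 5)  # one flag per column

# a[2] is a view of row 2, so assigning through it modifies a.
a[2][mask] = 11
print(a[2].tolist())  # [7, 11, 9]

# A single indexing operation does the same without relying on views.
a[2, mask] = 12
print(a[2].tolist())  # [7, 12, 9]

# Mask first, then index: b[row_mask] is a copy, so the write is lost.
b = np.array([[5, 2, 3], [2, 5, 6], [7, 8, 9]])
row_mask = (b[:, 0] == 2) & (b[:, 1] == 5)
b[row_mask][:, 2] = 11
print(b[1].tolist())  # [2, 5, 6]
```

Putting both indices inside one pair of brackets, as in a[2, mask] or b[row_mask, 2], is the safer habit for assignments.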
