Iterating over a numpy array_问答_开发者_运维开发者技术经验分享

开发者 https://www.devze.com 2023-03-26 03:42 出处：网络

Is there a less verbose alternative to this: for x in xrange(array.shape[0]): for y in xrange(array.shape[1]):

相关专题：numpy python

Is there a less verbose alternative to this:

for x in xrange(array.shape[0]):
    for y in xrange(array.shape[1]):
        do_stuff(x, y)

I came up with this:

for x, y in itertools.product(map(xrange, array.shape)):
    do_stuff(x, y)

Which saves one indentation, but is still pretty ugly.

I'm hoping for something that looks like this pseudocode:

for x, y in array.indices:
    do_stuff(x, y)

Does anything like th开发者_StackOverflow社区at exist?

I think you're looking for the ndenumerate.

>>> a =numpy.array([[1,2],[3,4],[5,6]])
>>> for (x,y), value in numpy.ndenumerate(a):
...  print x,y
... 
0 0
0 1
1 0
1 1
2 0
2 1

Regarding the performance. It is a bit slower than a list comprehension.

X = np.zeros((100, 100, 100))

%timeit list([((i,j,k), X[i,j,k]) for i in range(X.shape[0]) for j in range(X.shape[1]) for k in range(X.shape[2])])
1 loop, best of 3: 376 ms per loop

%timeit list(np.ndenumerate(X))
1 loop, best of 3: 570 ms per loop

If you are worried about the performance you could optimise a bit further by looking at the implementation of ndenumerate, which does 2 things, converting to an array and looping. If you know you have an array, you can call the .coords attribute of the flat iterator.

a = X.flat
%timeit list([(a.coords, x) for x in a.flat])
1 loop, best of 3: 305 ms per loop

If you only need the indices, you could try numpy.ndindex:

>>> a = numpy.arange(9).reshape(3, 3)
>>> [(x, y) for x, y in numpy.ndindex(a.shape)]
[(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]

see nditer

import numpy as np
Y = np.array([3,4,5,6])
for y in np.nditer(Y, op_flags=['readwrite']):
    y += 3

Y == np.array([6, 7, 8, 9])

y = 3 would not work, use y *= 0 and y += 3 instead.

I see that no good desciption for using numpy.nditer() is here. So, I am gonna go with one. According to NumPy v1.21 dev0 manual, The iterator object nditer, introduced in NumPy 1.6, provides many flexible ways to visit all the elements of one or more arrays in a systematic fashion.

I have to calculate mean_squared_error and I have already calculate y_predicted and I have y_actual from the boston dataset, available with sklearn.

def cal_mse(y_actual, y_predicted):
    """ this function will return mean squared error
       args:
           y_actual (ndarray): np array containing target variable
           y_predicted (ndarray): np array containing predictions from DecisionTreeRegressor
       returns:
           mse (integer)
    """
    sq_error = 0
    for i in np.nditer(np.arange(y_pred.shape[0])):
        sq_error += (y_actual[i] - y_predicted[i])**2
    mse = 1/y_actual.shape[0] * sq_error
    
    return mse

Hope this helps :). for further explaination visit