开发者

numpy: efficient execution of a complex reshape of an array

开发者 https://www.devze.com 2023-02-19 06:56 出处:网络
I am reading a vendor-provided large binary array into a 2D numpy array tempfid(M, N) # load data data=numpy.fromfile(file=dirname+\'/fid\', dtype=numpy.dtype(\'i4\'))

I am reading a vendor-provided large binary array into a 2D numpy array tempfid(M, N)

# load data
data=numpy.fromfile(file=dirname+'/fid', dtype=numpy.dtype('i4'))

# convert to complex data
fid=data[::2]+1j*data[1::2]

tempfid=fid.reshape(I*J*K, N)

and then I need to reshape it into a 4D array useful4d(N,I,J,K) using non-trivial mappings for the indices. I do this with a for loop along the following lines:

for idx in range(M):
    i=f1(idx) # f1, f2, and f3 are functions involving / and % as well as some lookups
    j=f2(idx)
    k=f3(idx)
    newfid[:,i,j,k] = tempfid[idx,:] #SLOW! CAN WE IMPROVE THIS?

Converting to complex takes 33% of the time while the copying of these slices M slices takes the remaining 66%. Calculating the indices is fast irrespective of whether I do this one by one in a loop as shown or by numpy.vectorizing the operation and applying it to an arange(M).

Is there a way to speed this up? Any help on more efficient slicing, copying (or not) etc appreciated.

EDIT: As learned in the开发者_运维问答 answer to question "What's the fastest way to convert an interleaved NumPy integer array to complex64?" the conversion to complex can be sped up by a factor of 6 if a view is used instead:

 fid = data.astype(numpy.float32).view(numpy.complex64)


idx = numpy.arange(M)
i = numpy.vectorize(f1)(idx)
j = numpy.vectorize(f2)(idx)
k = numpy.vectorize(f3)(idx)

# you can index arrays with other arrays
# that lets you specify this operation in one line.    
newfid[:, i,j,k] = tempfid.T

I've never used numpy's vectorize. Vectorize just means that numpy will call your python function multiple times. In order to get speed, you need use array operations like the one I showed here and you used to get complex numbers.

EDIT

The problem is that the dimension of size 128 was first in newfid, but last in tempfid. This is easily by using .T which takes the transpose.


How about this. Set us your indicies using the vectorized versions of f1,f2,f3 (not necessarily using np.vectorize, but perhaps just writing a function that takes an array and returns an array), then use np.ix_:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.ix_.html

to get the index arrays. Then reshape tempfid to the same shape as newfid and then use the results of np.ix_ to set the values. For example:

tempfid = np.arange(10)
i = f1(idx) # i = [4,3,2,1,0]
j = f2(idx) # j = [1,0]
ii = np.ix_(i,j)
newfid = tempfid.reshape((5,2))[ii]

This maps the elements of tempfid onto a new shape with a different ordering.

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号