In-place type conversion of a NumPy array_问答_开发者

Given a NumPy array of int32, how do I convert it to float32 in place? So basically, I would like to do

a = a.astype(numpy.float32)

without copying the array. It is big.

The reason for doing this is that I have two algorithms for the computation of a. One of them returns an array of int32, the other returns an array of float32 (and this is inherent to the two different algorithms). All further computations assume that a is an array of float32.

Currently I do the conversion in a C function called via ctypes. Is there a way to do this in Python开发者_如何学C?

Update: This function only avoids copy if it can, hence this is not the correct answer for this question. unutbu's answer is the right one.

a = a.astype(numpy.float32, copy=False)

numpy astype has a copy flag. Why shouldn't we use it ?

You can make a view with a different dtype, and then copy in-place into the view:

import numpy as np
x = np.arange(10, dtype='int32')
y = x.view('float32')
y[:] = x

print(y)

yields

array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.], dtype=float32)

To show the conversion was in-place, note that copying from x to y altered x:

print(x)

prints

array([         0, 1065353216, 1073741824, 1077936128, 1082130432,
       1084227584, 1086324736, 1088421888, 1090519040, 1091567616])

You can change the array type without converting like this:

a.dtype = numpy.float32

but first you have to change all the integers to something that will be interpreted as the corresponding float. A very slow way to do this would be to use python's struct module like this:

def toi(i):
    return struct.unpack('i',struct.pack('f',float(i)))[0]

...applied to each member of your array.

But perhaps a faster way would be to utilize numpy's ctypeslib tools (which I am unfamiliar with)

- edit -

Since ctypeslib doesnt seem to work, then I would proceed with the conversion with the typical numpy.astype method, but proceed in block sizes that are within your memory limits:

a[0:10000] = a[0:10000].astype('float32').view('int32')

...then change the dtype when done.

Here is a function that accomplishes the task for any compatible dtypes (only works for dtypes with same-sized items) and handles arbitrarily-shaped arrays with user-control over block size:

import numpy

def astype_inplace(a, dtype, blocksize=10000):
    oldtype = a.dtype
    newtype = numpy.dtype(dtype)
    assert oldtype.itemsize is newtype.itemsize
    for idx in xrange(0, a.size, blocksize):
        a.flat[idx:idx + blocksize] = \
            a.flat[idx:idx + blocksize].astype(newtype).view(oldtype)
    a.dtype = newtype

a = numpy.random.randint(100,size=100).reshape((10,10))
print a
astype_inplace(a, 'float32')
print a

Time spent reading data

t1=time.time() ; V=np.load ('udata.npy');t2=time.time()-t1 ; print( t2 )

95.7923333644867

V.dtype

dtype('>f8')

V.shape

(3072, 1024, 4096)

**Creating new array **

t1=time.time() ; V64=np.array( V, dtype=np.double); t2=time.time()-t1 ; print( t2 )

1291.669689655304

Simple in-place numpy conversion

t1=time.time() ; V64=np.array( V, dtype=np.double); t2=time.time()-t1 ; print( t2 )

205.64322113990784

Using astype

t1=time.time() ; V = V.astype(np.double) ; t2=time.time()-t1 ; print( t2 )

400.6731758117676

Using view

t1=time.time() ; x=V.view(np.double);V[:,:,:]=x ;t2=time.time()-t1 ; print( t2 )

556.5982494354248

Note that each time I cleared the variables. Thus simply let python handle the conversion is the most efficient.

import numpy as np
arr_float = np.arange(10, dtype=np.float32)
arr_int = arr_float.view(np.float32)

use view() and parameter 'dtype' to change the array in place.

Use this:

In [105]: a
Out[105]: 
array([[15, 30, 88, 31, 33],
       [53, 38, 54, 47, 56],
       [67,  2, 74, 10, 16],
       [86, 33, 15, 51, 32],
       [32, 47, 76, 15, 81]], dtype=int32)

In [106]: float32(a)
Out[106]: 
array([[ 15.,  30.,  88.,  31.,  33.],
       [ 53.,  38.,  54.,  47.,  56.],
       [ 67.,   2.,  74.,  10.,  16.],
       [ 86.,  33.,  15.,  51.,  32.],
       [ 32.,  47.,  76.,  15.,  81.]], dtype=float32)

a = np.subtract(a, 0., dtype=np.float32)