
How to transform negative elements to zero without a loop?

Developer https://www.devze.com 2023-01-09 18:29 Source: Internet

If I have an array like

a = np.array([2, 3, -1, -4, 3])

I want to set all the negative elements to zero: [2, 3, 0, 0, 3]. How can I do this with NumPy without an explicit for loop? I need to use the modified a in a computation, for example

c = a * b

where b is another array of the same length as the original a.

Conclusion

import numpy as np
from time import time

a = np.random.uniform(-1, 1, 20000000)
t = time(); b = np.where(a>0, a, 0); print ("1. ", time() - t)
a = np.random.uniform(-1, 1, 20000000)
t = time(); b = a.clip(min=0); print ("2. ", time() - t)
a = np.random.uniform(-1, 1, 20000000)
t = time(); a[a < 0] = 0; print ("3. ", time() - t)
a = np.random.uniform(-1, 1, 20000000)
t = time(); a[np.where(a<0)] = 0; print ("4. ", time() - t)
a = np.random.uniform(-1, 1, 20000000)
t = time(); b = [max(x, 0) for x in a]; print ("5. ", time() - t)
  1. 1.38629984856
  2. 0.516846179962 <- fastest: a.clip(min=0)
  3. 0.615426063538
  4. 0.944557905197
  5. 51.7364809513


a = a.clip(min=0)
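
A minimal sketch of how this fits the computation from the question. The values of `b` are made up for illustration; only `a` comes from the question.

```python
import numpy as np

a = np.array([2, 3, -1, -4, 3])
b = np.array([10, 20, 30, 40, 50])  # hypothetical second array, same length as a

# clip returns a new array; rebinding the name a does not touch the
# original data if something else still references it
a = a.clip(min=0)
c = a * b

print(a.tolist())  # [2, 3, 0, 0, 3]
print(c.tolist())  # [20, 60, 0, 0, 150]
```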


I would do this:

a[a < 0] = 0

If you want to keep the original a and only set the negative elements to zero in a copy, you can copy the array first:

c = a.copy()
c[c < 0] = 0


Another trick is to use multiplication. This actually seems to be much faster than every other method here. For example

b = a*(a>0) # copies data

or

a *= (a>0) # in-place zero-ing
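
A small sketch of why the multiplication works, plus a caveat that is my own addition rather than part of this answer: the comparison yields a boolean array that behaves as 0 or 1 in the product, and for floats a product with zero is not always a plain zero. The float values below are illustrative only.

```python
import numpy as np

a = np.array([2, 3, -1, -4, 3])
mask = a > 0                  # boolean array: True where the element is positive
b = a * mask                  # True/False act as 1/0, so negatives become 0
print(b.tolist())             # [2, 3, 0, 0, 3]

# Caveat with floats: multiplying by zero does not always give 0.0
f = np.array([1.5, -2.0, -np.inf, np.nan])
with np.errstate(invalid="ignore"):    # silence the warning from -inf * 0
    g = f * (f > 0)

print(np.signbit(g[1]))       # True: -2.0 * 0 is negative zero
print(np.isnan(g[2]))         # True: -inf * 0 is nan
print(f.clip(min=0)[2])       # 0.0: clip maps -inf to 0 instead
```

If the data may contain infinities, clip or np.maximum is probably the safer choice.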

I ran tests with timeit, precomputing the < and > comparisons, because some of these methods modify a in place and that would greatly affect the results. In all cases a was np.random.uniform(-1, 1, 20000000) with its negatives already set to 0, while L = a < 0 and G = a > 0 were computed before a was changed. This puts clip at a relative disadvantage, since it cannot use L or G (however, computing those on the same machine took only 17ms each, so that is not the main cause of the speed difference).

%timeit b = np.where(G, a, 0)  # 132ms  copies
%timeit b = a.clip(min=0)      # 165ms  copies
%timeit a[L] = 0               # 158ms  in-place
%timeit a[np.where(L)] = 0     # 122ms  in-place
%timeit b = a*G                # 87.4ms copies
%timeit np.multiply(a,G,a)     # 40.1ms in-place (normal code would use `a*=G`)

When choosing to penalize the in-place methods instead of clip, the following timings come up:

%timeit b = np.where(a>0, a, 0)             # 152ms
%timeit b = a.clip(min=0)                   # 165ms
%timeit b = a.copy(); b[a<0] = 0            # 231ms
%timeit b = a.copy(); b[np.where(a<0)] = 0  # 205ms
%timeit b = a*(a>0)                         # 108ms
%timeit b = a.copy(); b*=a>0                # 121ms

Non in-place methods are penalized by 20ms (the time required to calculate a>0 or a<0) and the in-place methods are penalized by 73-83ms (so a.copy() takes about 53-63ms).

Overall, the multiplication methods are much faster than clip: about 1.5x faster when not in place, and about 2.75x faster if you can do it in place.
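
To rerun this kind of comparison outside IPython, here is a rough sketch using the standard timeit module instead of the %timeit magic. The array size, the seed and the choice of candidates are my assumptions, and the numbers will vary with machine and NumPy version, so no timings are shown.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)        # seeded for repeatability (my choice)
a = rng.uniform(-1, 1, 1_000_000)     # smaller than the 20,000,000 used above

# Every candidate returns a new array, so a is never modified between runs.
candidates = {
    "np.where(a > 0, a, 0)": lambda: np.where(a > 0, a, 0),
    "a.clip(min=0)": lambda: a.clip(min=0),
    "a * (a > 0)": lambda: a * (a > 0),
    "np.maximum(a, 0)": lambda: np.maximum(a, 0),
}

expected = a.clip(min=0)
for label, fn in candidates.items():
    # -0.0 == 0.0 compares equal, so the multiplication result passes too
    assert np.array_equal(fn(), expected)
    best = min(timeit.repeat(fn, number=5, repeat=3)) / 5   # seconds per call
    print(f"{label:24s} {best * 1e3:8.2f} ms")
```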


Use where

a[np.where(a < 0)] = 0
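
A short sketch of what happens here, using the array from the question: called with a single argument, np.where returns a tuple of index arrays, and assigning through those indices modifies a in place.

```python
import numpy as np

a = np.array([2, 3, -1, -4, 3])

idx = np.where(a < 0)      # one-argument form: tuple of index arrays
print(idx[0].tolist())     # [2, 3]

a[idx] = 0                 # integer-index assignment, modifies a in place
print(a.tolist())          # [2, 3, 0, 0, 3]
```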


Based on my answer here, np.maximum is the fastest way I have found.

a = np.random.random(1000) - 0.5

%%timeit
a_ = a.copy()
a_ = np.maximum(a_,0)
# 15.6 µs ± 2.14 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%%timeit
a_ = a.copy()
a_ = a_.clip(min=0)
# 54.2 µs ± 10.4 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
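
As a usage sketch on the array from the question: np.maximum broadcasts the scalar 0 against a and returns a new array, or writes back into a when given the out argument. The in-place variant is my addition and was not timed above.

```python
import numpy as np

a = np.array([2, 3, -1, -4, 3])

b = np.maximum(a, 0)        # new array; a is unchanged
print(b.tolist())           # [2, 3, 0, 0, 3]
print(a.tolist())           # [2, 3, -1, -4, 3]

np.maximum(a, 0, out=a)     # out= stores the result in a itself
print(a.tolist())           # [2, 3, 0, 0, 3]
```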


And just for the sake of comprehensiveness, I would like to add the use of the Heaviside function (or a step function) to achieve a similar outcome as follows:

For continuity, let's say we have

a = np.array([2, 3, -1, -4, 3])

Then, using the step function np.heaviside(), one can try

b = a * np.heaviside(a, 0)

Note something interesting about this operation: the negative signs are preserved, so the zeroed elements come out as -0. rather than 0. Not ideal for most situations, I would say.

This can then be corrected for by

b = abs(b)

So this is probably a rather roundabout way to do it without a loop.
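
A minimal sketch that makes the sign issue visible. Using np.signbit to inspect the result is my choice for illustration; the rest follows the steps above.

```python
import numpy as np

a = np.array([2, 3, -1, -4, 3])

step = np.heaviside(a, 0)        # 0.0 for negatives, 1.0 for positives;
                                 # the second argument is the value used where a == 0
b = a * step                     # float result: negative * 0.0 gives -0.0

print((b == 0).tolist())         # [False, False, True, True, False]
print(np.signbit(b).tolist())    # [False, False, True, True, False]

b = np.abs(b)                    # drops the sign of the negative zeros
print(np.signbit(b).tolist())    # [False, False, False, False, False]
print(b.tolist())                # [2.0, 3.0, 0.0, 0.0, 3.0]
```

Since -0.0 == 0.0 compares as equal, the sign mostly matters for printing and for operations that inspect it, such as division.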
