
Using numpy.apply

Source: 开发者 (https://www.devze.com), 2022-12-24 18:36, reposted from the web
What's wrong with this snippet of code?

import numpy as np
from scipy import stats

d = np.arange(10.0)
cutoffs = [stats.scoreatpercentile(d, pct) for pct in range(0, 100, 20)]
f = lambda x: np.sum(x > cutoffs)
fv = np.vectorize(f)

# why don't these two lines output the same values?
[f(x) for x in d] # => [0, 1, 2, 2, 3, 3, 4, 4, 5, 5]
fv(d)             # => array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

Any ideas?


cutoffs is a list. The numbers that numpy.vectorize pulls out of d are all turned into plain Python floats before f is applied to them. (It's actually rather odd: it looks like it first tries NumPy floats, which work the way you want, and then it switches to normal Python floats.) Under Python 2's rather odd rule for ordering values of mismatched types, a float always compares less than a list (Python 3 raises a TypeError for this comparison instead), so instead of getting things like

>>> # Here is a vectorized array operation, like you get from numpy. It won't
>>> # happen if you just use a float and a list.
>>> 2.0 > [0.0, 1.8, 3.6, 5.4, 7.2]
[True, True, False, False, False] # not real

you get

>>> # This is an actual copy-paste from a Python interpreter
>>> 2.0 > [0.0, 1.8, 3.6, 5.4, 7.2]
False
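To see the difference between the two kinds of float side by side, here is a minimal sketch. It assumes the cutoff values shown above and only checks how each comparison behaves. The names `elementwise` and `plain` are just for illustration.

```python
import numpy as np

cutoffs = [0.0, 1.8, 3.6, 5.4, 7.2]

# A NumPy scalar converts the list to an array and compares element by element.
elementwise = np.float64(2.0) > cutoffs

# A plain Python float does not broadcast. Python 2 falls back to its
# cross-type ordering and returns a single bool; Python 3 raises TypeError.
try:
    plain = 2.0 > cutoffs
except TypeError:
    plain = None  # marker meaning "the comparison is not supported"

print(elementwise)
print(plain)
```

Either way, the plain float never produces the per-cutoff result that f relies on.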

To solve the problem, you can make cutoffs a numpy array instead of a list. (You could probably also move the comparison into numpy operations entirely instead of faking it with numpy.vectorize, but I do not know offhand.)
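A rough sketch of both suggestions follows. To keep it NumPy-only it swaps stats.scoreatpercentile for np.percentile, on the assumption that the two give the same cutoffs for this data. The names `via_vectorize` and `via_broadcast` are made up for the example.

```python
import numpy as np

d = np.arange(10.0)

# Assumption: np.percentile stands in for stats.scoreatpercentile here.
# The result is already a NumPy array rather than a list.
cutoffs = np.percentile(d, np.arange(0, 100, 20))

# Option 1: keep np.vectorize. Because cutoffs is now an array, the
# comparison is elementwise even when x arrives as a plain Python float.
f = lambda x: np.sum(x > cutoffs)
fv = np.vectorize(f)
via_vectorize = fv(d)

# Option 2: drop np.vectorize and broadcast. d[:, None] has shape (10, 1),
# so comparing it with cutoffs (shape (5,)) yields a (10, 5) boolean table;
# summing each row counts the cutoffs that the value exceeds.
via_broadcast = (d[:, None] > cutoffs).sum(axis=1)

print(via_vectorize)
print(via_broadcast)
```

Both should agree with the list comprehension from the question. The broadcasting version avoids calling a Python function once per element, which tends to matter on larger arrays.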
