I find myself often wanting to write Python list comprehensions like this:
nearbyPoints = [(n, delta(n,x)) for n in allPoints if delta(n,x)<=radius]
That hopefully gives some context as to why I would want to do this, but there are also cases where multiple values need to be computed/compared per element:
newlist = [(x,f(x),g(f(x))) for x in bigList if f(x)<p and g(f(x))<q]
So I have two questions:
- will all those functions be evaluated multiple times or is the result cached? Does the language specify or is it implementation-specific? I'm using 2.6 now, but would 3.x be different?
- is there a neater way to write it? Sometimes f and g are long expressions and duplication is error prone and looks messy. I would really like to be able to write this:
newList = [(x,a=f(x),b=g(a)) for x in bigList if a<p and b<q]
but that doesn't work. Is there a good reason for not supporting this syntax? Can it be done via something like this? Or would I just have to use multiple listcomps or a for-loop?
Update: The walrus operator := was introduced in Python 3.8; it assigns a variable and also evaluates to the assigned value, as per @MartijnVanAttekum's answer. I'd recommend waiting a year or so before using it in projects, because Python 3.6 and 3.7 are still quite mainstream, but it's a nicer solution than my alias suggestion below.
I have a hack to create aliases inside list/dict comprehensions. You can use the for alias_name in [alias_value] trick. For example, say you have this expensive function:
def expensive_function(x):
    print("called the very expensive function, that will be $2")
    return x*x + x
And some data:
data = [4, 7, 3, 7, 2, 3, 4, 7, 3, 1, 1 ,1]
And then you want to apply the expensive function over each element, and also filter based on it. What you do is:
result = [
    (x, expensive)
    for x in data
    for expensive in [expensive_function(x)] #alias
    if expensive > 3
]
print(result)
The second for will only iterate over a list of size 1, effectively making it an alias. The output will show that the expensive function is called 12 times, exactly once for each data element. Nevertheless, the result of the function is used (at most) twice: once for the filter and possibly once for the output.
Please always lay out such comprehensions over multiple lines like I did, and append #alias to the line where the alias is. If you use an alias, the comprehension gets quite complicated, and you should help future readers of your code understand what you're doing. This is not Perl, you know ;).
For completeness, the output:
called the very expensive function, that will be $2
called the very expensive function, that will be $2
called the very expensive function, that will be $2
called the very expensive function, that will be $2
called the very expensive function, that will be $2
called the very expensive function, that will be $2
called the very expensive function, that will be $2
called the very expensive function, that will be $2
called the very expensive function, that will be $2
called the very expensive function, that will be $2
called the very expensive function, that will be $2
called the very expensive function, that will be $2
[(4, 20), (7, 56), (3, 12), (7, 56), (2, 6), (3, 12), (4, 20), (7, 56), (3, 12)]
Code: http://ideone.com/7mUQUt
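The same alias trick carries over to a dict comprehension, which the answer mentions but does not show. A minimal sketch, assuming a hypothetical slow_square function (not from the original answer) that records its calls so the single evaluation per element can be checked:

```python
calls = []  # records every argument slow_square is called with

def slow_square(x):
    # Hypothetical stand-in for an expensive computation.
    calls.append(x)
    return x * x

data = [1, 2, 3, 4]

squares_over_three = {
    x: squared
    for x in data
    for squared in [slow_square(x)]  #alias
    if squared > 3
}

print(squares_over_three)  # {2: 4, 3: 9, 4: 16}
print(len(calls))          # 4: one call per element
```

The alias is used both in the filter and as the dict value, yet the function runs only once per element.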
In regards to #1, yes, they will be evaluated multiple times.
In regards to #2, the way to do it is to calculate and filter in separate comprehensions:
Condensed version:
[(x,fx,gx) for (x,fx,gx) in ((x,fx,g(fx)) for (x,fx) in ((x,f(x)) for x in bigList) if fx < p) if gx<q]
Longer version expanded to make it easier to follow:
[(x,fx,gx) for (x,fx,gx) in
    ((x,fx,g(fx)) for (x,fx) in
        ((x,f(x)) for x in bigList)
    if fx < p)
if gx<q]
This will call f and g as few times as possible (values for which f(x) is not < p will never call g, and f will only be called once for each value in bigList).
If you prefer, you can also get neater code by using intermediate variables:
a = ( (x,f(x)) for x in bigList )
b = ( (x,fx,g(fx)) for (x,fx) in a if fx<p )
results = [ c for c in b if c[2] < q ] # faster than writing out full tuples
a and b use generator expressions so that they don't have to actually instantiate lists, and are simply evaluated when necessary.
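To see the lazy evaluation in action, here is a small sketch of the same three-step pipeline. The functions f and g, the data, and the thresholds p and q are made up for illustration; each function just records the arguments it receives:

```python
f_calls = []  # arguments passed to f
g_calls = []  # arguments passed to g

def f(x):
    # Hypothetical example function.
    f_calls.append(x)
    return x + 1

def g(y):
    # Hypothetical example function.
    g_calls.append(y)
    return y * 2

bigList = [1, 5, 2, 8]
p, q = 4, 5

a = ((x, f(x)) for x in bigList)
b = ((x, fx, g(fx)) for (x, fx) in a if fx < p)
calls_before = len(f_calls) + len(g_calls)  # still 0: nothing has run yet

results = [c for c in b if c[2] < q]

print(results)  # [(1, 2, 4)]
print(f_calls)  # [1, 5, 2, 8]: f runs once per element
print(g_calls)  # [2, 3]: g runs only where f(x) < p
```

Nothing is computed until the final list comprehension pulls values through b, and g is skipped entirely for elements that fail the first filter.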
As list comprehensions become more complicated, they also start to become really hard to read. In such cases, it is often better to turn their internals into generator functions and give them a (hopefully) meaningful name.
# First example
def getNearbyPoints(x, radius, points):
    """Yields points where 'delta(x, point) <= radius'"""
    for p in points:
        distance = delta(p, x)
        if distance <= radius:
            yield p, distance
nearbyPoints = list(getNearbyPoints(x, radius, allPoints))
# Second example
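def xfg(data, p, q):
    """Yield 3-tuples of x, f(x), g(f(x))"""
    for x in data:
        # Use fx/gx for the results: assigning to f or g here would make
        # them local names and raise UnboundLocalError on the call.
        fx = f(x)
        if fx < p:
            gx = g(fx)
            if gx < q:
                yield x, fx, gx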
newList = list(xfg(bigList, p, q))
2021 update
- Using aliases is now possible with the walrus operator (assignment expression) introduced in Python 3.8. For example, using difference as an alias for what is calculated by delta():
nearbyPoints = [(n, difference) for n in allPoints if (difference := delta(n,x)) <= radius]
Reference: PEP 572
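The second example from the question, with two dependent values, can be handled the same way. A minimal sketch, assuming Python 3.8+ and made-up definitions of f, g, bigList, p and q (none of these come from the original question):

```python
def f(x):
    # Hypothetical stand-in for a long expression.
    return x * 2

def g(y):
    # Hypothetical stand-in for a second long expression.
    return y + 1

bigList = [1, 2, 3, 4, 5]
p, q = 8, 6

# a is bound first; because `and` short-circuits, b is only computed when a < p.
newList = [(x, a, b) for x in bigList if (a := f(x)) < p and (b := g(a)) < q]

print(newList)  # [(1, 2, 3), (2, 4, 5)]
```

Note that names bound with := inside a comprehension live in the enclosing scope, so a and b remain defined after the comprehension finishes.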
If you invoke a function twice in an expression (including in a list comprehension), it will really be called twice. Python has no way of knowing if your function is a pure function or a procedural function. It invokes it when you tell it to, in this case, twice.
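This is easy to check by counting calls. A small sketch using the first comprehension from the question, with a hypothetical delta that records each call and made-up points:

```python
calls = []  # one entry per call to delta

def delta(n, x):
    # Hypothetical distance function that records each call.
    calls.append(n)
    return abs(n - x)

allPoints = [1, 4, 6, 9]
x, radius = 5, 2

nearbyPoints = [(n, delta(n, x)) for n in allPoints if delta(n, x) <= radius]

print(nearbyPoints)  # [(4, 1), (6, 1)]
print(len(calls))    # 6: four calls for the filter, two more for the output
```

Every element pays for one call in the filter, and every element that passes pays for a second call in the output expression.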
Before Python 3.8 there was no way to assign to a variable in a list comprehension, because assignment in Python was only a statement, not an expression.
It sounds like you should use a full loop, not a list comprehension.