Here I asked a question about the izip_longest function from the itertools module.
Its code:
from itertools import chain, izip, repeat

def izip_longest_from_docs(*args, **kwds):
    # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
    fillvalue = kwds.get('fillvalue')
    def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
        yield counter() # yields the fillvalue, or raises IndexError
    fillers = repeat(fillvalue)
    iters = [chain(it, sentinel(), fillers) for it in args]
    try:
        for tup in izip(*iters):
            yield tup
    except IndexError:
        pass
There turned out to be an error in the pure Python equivalent of that function given in the documentation: the real function propagates IndexError exceptions raised inside the generators passed to it as arguments, while the equivalent above swallows them.
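The difference can be reproduced with a minimal sketch. It assumes Python 3, where izip is the built-in zip and izip_longest is itertools.zip_longest; the names docs_equivalent, flaky and raises_index_error are made up for illustration.

```python
from itertools import chain, repeat, zip_longest

def docs_equivalent(*args, fillvalue=None):
    # Python 3 port of the documented pure Python equivalent shown above.
    def sentinel(counter=([fillvalue] * (len(args) - 1)).pop):
        yield counter()  # yields the fillvalue, or raises IndexError
    fillers = repeat(fillvalue)
    iters = [chain(it, sentinel(), fillers) for it in args]
    try:
        for tup in zip(*iters):
            yield tup
    except IndexError:
        pass

def flaky():
    # Hypothetical input generator that fails with IndexError mid-stream.
    yield 1
    raise IndexError("raised inside an input generator")

def raises_index_error(make_iterator):
    # Helper: True if consuming the iterator raises IndexError.
    try:
        list(make_iterator())
    except IndexError:
        return True
    return False

# The equivalent silently stops instead of reporting the error.
swallowed = list(docs_equivalent(flaky(), "ab"))
```

Here raises_index_error(lambda: zip_longest(flaky(), "ab")) is true for the real function, while the ported equivalent just returns a truncated result.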
@agf solved the problem and gave a corrected version of the pure Python equivalent.
While he was writing his solution, though, I wrote my own, and in doing so I ran into a problem that I hope this question will resolve.
The code that I came up with is this:
def izip_longest_modified_my(*args, **kwds):
    # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
    fillvalue = kwds.get('fillvalue')
    class LongestExhausted(Exception):
        pass
    def sentinel(fillvalue = fillvalue, counter = [0]):
        def ret():
            counter[0] += 1
            if counter[0] == len(args):
                raise LongestExhausted
            yield fillvalue
        return ret()
    fillers = repeat(fillvalue)
    iters = [chain(it, sentinel(), fillers) for it in args]
    try:
        for tup in izip(*iters):
            yield tup
    except LongestExhausted:
        pass
In the original code, sentinel is a generator, so it is evaluated lazily: counter() is called only when the iterator created by the chain function actually needs it.
In my code I added a counter which holds a one-element list, [0]. The reason was to put a mutable object somewhere it can be accessed and changed by all the returned iterators ret(). The only suitable place I found was the default arguments (func_defaults) of sentinel.
If I put it inside the sentinel function, then counter would be bound to a new [0] on every call of sentinel, so each ret() would get a different list:
def sentinel(fillvalue = fillvalue):
    counter = [0]
    def ret():
        counter[0] += 1
        if counter[0] == len(args):
            raise LongestExhausted
        yield fillvalue
    return ret()
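The difference between the two placements can be seen in a minimal sketch (the function names make_shared and make_fresh are made up for illustration):

```python
def make_shared(counter=[0]):
    # The default list is created once, when the def statement runs,
    # so every call that omits the argument sees the same list.
    counter[0] += 1
    return counter[0]

def make_fresh():
    # A new list is created on every call, so nothing is shared.
    counter = [0]
    counter[0] += 1
    return counter[0]

shared_counts = [make_shared() for _ in range(3)]  # counts keep growing
fresh_counts = [make_fresh() for _ in range(3)]    # always starts from 0
```

Because sentinel is defined inside izip_longest_modified_my, its def statement runs again on every outer call, so each call of the outer function gets its own default list.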
I tried to put it outside of the sentinel function:
counter = 0
def sentinel(fillvalue = fillvalue):
    def ret():
        counter += 1
        if counter == len(args):
            raise LongestExhausted
        yield fillvalue
    return ret()
But an exception was raised: UnboundLocalError: local variable 'counter' referenced before assignment.
I added the global keyword, but it didn't help (I think because counter is not actually in the global scope):
counter = 0
def sentinel(fillvalue = fillvalue):
    global counter
    def ret():
        counter += 1
        if counter == len(args):
            raise LongestExhausted
        yield fillvalue
    return ret()
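Both failures can be reproduced in isolation. The following is only a sketch with made-up function names; it shows that an augmented assignment makes counter local to ret, and that a global declaration in sentinel applies to sentinel's own body only, not to the nested ret:

```python
def outer_without_declaration():
    counter = 0
    def ret():
        # The assignment below makes counter a local name of ret,
        # so reading it first fails with UnboundLocalError.
        counter += 1
        return counter
    return ret()

def outer_with_misplaced_global():
    counter = 0
    def sentinel():
        global counter  # affects only sentinel's own body, not ret's
        def ret():
            counter += 1  # still a local name of ret
            return counter
        return ret()
    return sentinel()

def error_name(func):
    # Helper: name of the exception raised by func(), or None.
    try:
        func()
    except Exception as exc:
        return type(exc).__name__
    return None

first_error = error_name(outer_without_declaration)
second_error = error_name(outer_with_misplaced_global)
```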
So, my question is:
Is the approach I used (putting a mutable list, counter = [0], into the default arguments (func_defaults) of sentinel) the best in this case, or is there a better way to solve this problem?
This has been asked many times in many forms. Read any number of other questions about mutable default arguments and the new Python 3 nonlocal keyword. On Python 2, you could use a function attribute:
def sentinel(fillvalue = fillvalue):
    def ret():
        sentinel.counter += 1
        if sentinel.counter == len(args):
            raise LongestExhausted
        yield fillvalue
    return ret()
sentinel.counter = 0
or use global both inside ret and inside izip_longest so you're always referencing a global variable:
global counter
counter = 0
def sentinel(fillvalue = fillvalue):
    def ret():
        global counter
        counter += 1
        if counter == len(args):
            raise LongestExhausted
        yield fillvalue
    return ret()
However, using global restricts you to only one izip_longest at a time; see the comments on the other answer.
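For comparison, here is a minimal sketch of the Python 3 nonlocal variant mentioned above. It assumes Python 3 (zip instead of izip, keyword-only fillvalue), and the name zip_longest_nonlocal is made up for illustration. The counter lives in the enclosing call, so every call gets its own:

```python
from itertools import chain, repeat

class LongestExhausted(Exception):
    pass

def zip_longest_nonlocal(*args, fillvalue=None):
    counter = 0  # one counter per call, shared by that call's sentinels
    def sentinel():
        nonlocal counter  # rebind the enclosing function's variable
        counter += 1
        if counter == len(args):
            raise LongestExhausted
        yield fillvalue
    fillers = repeat(fillvalue)
    iters = [chain(it, sentinel(), fillers) for it in args]
    try:
        for tup in zip(*iters):
            yield tup
    except LongestExhausted:
        pass

result = list(zip_longest_nonlocal("ABCD", "xy", fillvalue="-"))
```

Since the state is per call, several of these generators can be consumed at the same time without interfering with each other.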
You're also defining a new ret every time sentinel is called (once per iterator). You could instead do something like:
global counter
counter = 0
arglen = len(args)
def ret():
    global counter
    counter += 1
    if counter == arglen:
        raise LongestExhausted
    return fillvalue
def sentinel():
    yield ret()
Example code for having sentinel outside izip_longest, in reply to your question from the comments:
# Defined at module level so that both sentinel and the generator can see it.
class LongestExhausted(Exception):
    pass

def sentinel(counter, arglen, fillvalue):
    def ret():
        counter[0] += 1
        if counter[0] == arglen:
            raise LongestExhausted
        yield fillvalue
    return ret()

def izip_longest_modified_my(*args, **kwds):
    # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
    fillvalue = kwds.get('fillvalue')
    fillers = repeat(fillvalue)
    counter = [0]
    arglen = len(args)
    iters = [chain(it, sentinel(counter, arglen, fillvalue), fillers) for it in args]
    try:
        for tup in izip(*iters):
            yield tup
    except LongestExhausted:
        pass
Here you're again using a list just as a container, to get around the fact that Python 2 can't rebind names from outer scopes.
Using a global is a bad idea, IMHO. You need to make sure to reset the counter properly between calls. But more seriously, this is a generator; you don't even need threading to have multiple calls to a generator in flight at the same time, which will wreak havoc with any attempt to sanely use a global to keep track of state.
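To make the problem concrete, here is a small sketch of what can go wrong. It assumes Python 3, and zip_longest_global is a made-up name for a global-based variant; two generators are advanced alternately in a single thread:

```python
from itertools import chain, repeat

class LongestExhausted(Exception):
    pass

counter = 0  # module-level state shared by every call (the problem)

def zip_longest_global(*args, fillvalue=None):
    global counter
    counter = 0  # reset when this generator is first advanced
    def sentinel():
        global counter
        counter += 1
        if counter == len(args):
            raise LongestExhausted
        yield fillvalue
    iters = [chain(it, sentinel(), repeat(fillvalue)) for it in args]
    try:
        for tup in zip(*iters):
            yield tup
    except LongestExhausted:
        pass

# Used alone, the generator behaves as expected.
alone = list(zip_longest_global("abc", "d"))

# Two generators in flight: b's sentinel bumps the counter a relies on.
a = zip_longest_global("abc", "d")
b = zip_longest_global("x", "yz")
got_a = [next(a)]
got_b = [next(b), next(b)]  # b exhausts "x" and increments the counter
got_a += list(a)            # a now stops early and loses its last rows
```

In this interleaving a ends after its first tuple, although the same call run on its own produces three. Other interleavings can instead leave a generator that never terminates.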
You could just explicitly pass a reference to a mutable object into sentinel, and then into ret. It looks like your code controls all the calls to them. Function parameters are the original and boring way of transferring references between scopes!