What is the best way to share a value among all the generators created by a function?

Developer https://www.devze.com 2023-04-04 04:29 Source: Internet

Here I asked a question about the izip_longest function from the itertools module.

The code of it:

from itertools import chain, izip, repeat  # Python 2 names

def izip_longest_from_docs(*args, **kwds):
    # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
    fillvalue = kwds.get('fillvalue')
    def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
        yield counter()         # yields the fillvalue, or raises IndexError
    fillers = repeat(fillvalue)
    iters = [chain(it, sentinel(), fillers) for it in args]
    try:
        for tup in izip(*iters):
            yield tup
    except IndexError:
        pass

There turned out to be an error in the documentation's pure Python equivalent of that function. The real function propagates IndexError exceptions raised inside the generators passed as its arguments, whereas the equivalent above swallows them.
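The discrepancy can be reproduced with a small experiment. This is only a sketch, assuming Python 3, where zip and itertools.zip_longest stand in for izip and izip_longest; zip_longest_from_docs is a port of the documented code above, and faulty is a hypothetical argument generator made up for the demonstration.

```python
from itertools import chain, repeat, zip_longest

def zip_longest_from_docs(*args, fillvalue=None):
    # Python 3 port of the documented equivalent (zip replaces izip).
    def sentinel(counter=([fillvalue] * (len(args) - 1)).pop):
        yield counter()  # yields the fillvalue, or raises IndexError
    fillers = repeat(fillvalue)
    iters = [chain(it, sentinel(), fillers) for it in args]
    try:
        for tup in zip(*iters):
            yield tup
    except IndexError:
        pass

def faulty():
    # Hypothetical argument generator that fails with its own IndexError.
    yield 1
    raise IndexError("raised by the argument, not by the sentinel")

# The documented equivalent catches the error and silently stops early.
swallowed = list(zip_longest_from_docs(faulty(), "abc"))

# The real function lets the same error reach the caller.
try:
    list(zip_longest(faulty(), "abc"))
    propagated = False
except IndexError:
    propagated = True
```

Here swallowed ends up as [(1, 'a')] with no error reported, while propagated is True for the built-in function.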

@agf solved the problem and gave a corrected version of the pure Python equivalent.

While he was writing his solution, I wrote my own. In doing so I ran into a problem that I hope this question will resolve.

The code that I came up with is this:

def izip_longest_modified_my(*args, **kwds):
    # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
    fillvalue = kwds.get('fillvalue')

    class LongestExhausted(Exception):
        pass

    def sentinel(fillvalue = fillvalue, counter = [0]):
        def ret():
            counter[0] += 1
            if counter[0] == len(args):
                raise LongestExhausted
            yield fillvalue
        return ret()

    fillers = repeat(fillvalue)
    iters = [chain(it, sentinel(), fillers) for it in args]
    try:
        for tup in izip(*iters):
            yield tup
    except LongestExhausted:
        pass 

In the original code sentinel is a generator, so it is evaluated lazily: counter() is called only when the iterator created by chain actually needs a value from it.

In my code I added counter, a one-element list [0]. The point was to put a mutable object somewhere that every returned ret() iterator can both read and modify. The only suitable place I found was the default arguments of sentinel (its func_defaults).

If I put it inside the sentinel function, counter would be bound to a new [0] on every call of sentinel, so each ret() would get its own list:

def sentinel(fillvalue = fillvalue):
    counter = [0]
    def ret():
        counter[0] += 1
        if counter[0] == len(args):
            raise LongestExhausted
        yield fillvalue
    return ret()

I tried to put it outside of the sentinel function:

counter = 0
def sentinel(fillvalue = fillvalue):
    def ret():
        counter += 1
        if counter == len(args):
            raise LongestExhausted
        yield fillvalue
    return ret()

But this raised an exception: UnboundLocalError: local variable 'counter' referenced before assignment.

I added the global keyword, but it didn't help (I think because counter is not really in the global scope):

counter = 0
def sentinel(fillvalue = fillvalue):
    global counter
    def ret():
        counter += 1
        if counter == len(args):
            raise LongestExhausted
        yield fillvalue
    return ret()

So, my question is:

Is the approach I used (putting the mutable list counter = [0] into the default arguments of sentinel) the best one in this case, or is there a better way to solve this problem?


This has been asked many times in many forms. Read any number of other questions about mutable default arguments and the new Python 3 nonlocal keyword. On Python 2, you could use a function attribute:

def sentinel(fillvalue = fillvalue):
    def ret():
        sentinel.counter += 1
        if sentinel.counter == len(args):
            raise LongestExhausted
        yield fillvalue
    return ret()
sentinel.counter = 0

or use global both inside ret and inside izip_longest so you're always referencing a global variable:

global counter
counter = 0
def sentinel(fillvalue = fillvalue):
    def ret():
        global counter
        counter += 1
        if counter == len(args):
            raise LongestExhausted
        yield fillvalue
    return ret()

However, using global restricts you to only one izip_longest at a time -- see the comments on the other answer.
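For comparison, the Python 3 nonlocal keyword mentioned above avoids both the mutable default and the global. The following is a minimal sketch, assuming Python 3 (so zip replaces izip); the name zip_longest_nonlocal is made up for illustration and is not part of itertools.

```python
from itertools import chain, repeat

def zip_longest_nonlocal(*args, fillvalue=None):
    # Hypothetical Python 3 variant; zip stands in for Python 2's izip.
    class LongestExhausted(Exception):
        pass

    counter = 0           # one counter per call, shared by every sentinel
    arglen = len(args)

    def sentinel():
        nonlocal counter  # rebind the enclosing function's variable
        counter += 1
        if counter == arglen:
            raise LongestExhausted
        yield fillvalue

    fillers = repeat(fillvalue)
    iters = [chain(it, sentinel(), fillers) for it in args]
    try:
        for tup in zip(*iters):
            yield tup
    except LongestExhausted:
        pass

pairs = list(zip_longest_nonlocal("ABCD", "xy", fillvalue="-"))
# pairs == [('A', 'x'), ('B', 'y'), ('C', '-'), ('D', '-')]
```

Because counter lives in the frame of each call, several of these generators can be consumed at the same time without interfering with each other.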

You're also defining a new ret every time sentinel is called (once per iterator) -- you could instead do something like

global counter
counter = 0
arglen = len(args)

def ret():
    global counter
    counter += 1
    if counter == arglen:
        raise LongestExhausted
    return fillvalue

def sentinel():
    yield ret()

Example code with sentinel defined outside izip_longest, in reply to your question from the comments:

# Defined at module level so that sentinel, which now lives outside
# izip_longest_modified_my, can see it.
class LongestExhausted(Exception):
    pass


def sentinel(counter, arglen, fillvalue):
    def ret():
        counter[0] += 1
        if counter[0] == arglen:
            raise LongestExhausted
        yield fillvalue
    return ret()


def izip_longest_modified_my(*args, **kwds):
    # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
    fillvalue = kwds.get('fillvalue')

    fillers = repeat(fillvalue)
    counter = [0]
    arglen = len(args)
    iters = [chain(it, sentinel(counter, arglen, fillvalue), fillers) for it in args]
    try:
        for tup in izip(*iters):
            yield tup
    except LongestExhausted:
        pass

Here you're again using a list just as a container, to get around the fact that Python 2 cannot rebind a name in an enclosing scope.


Using a global is a bad idea, IMHO. You need to make sure to reset the counter properly between calls. More seriously, this is a generator: you don't even need threading to have multiple calls to a generator in flight at the same time, which will wreak havoc with any attempt to sanely use a global to keep track of state.
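To make the hazard concrete, here is a rough sketch of two such generators being consumed in an interleaved fashion. It assumes Python 3 (zip instead of izip), and zip_longest_global is a made-up name for a port of the global-based variant, not code from the question.

```python
from itertools import chain, repeat

class LongestExhausted(Exception):
    pass

counter = 0  # module-level state shared by every call: this is the problem

def zip_longest_global(*args, fillvalue=None):
    # Hypothetical Python 3 port of the global-based variant.
    global counter
    counter = 0
    arglen = len(args)

    def sentinel():
        global counter
        counter += 1
        if counter == arglen:
            raise LongestExhausted
        yield fillvalue

    iters = [chain(it, sentinel(), repeat(fillvalue)) for it in args]
    try:
        for tup in zip(*iters):
            yield tup
    except LongestExhausted:
        pass

first = zip_longest_global("ab", "wxyz", fillvalue="-")
second = zip_longest_global("c", "de", fillvalue="-")

head = next(first)   # resets counter to 0, then yields ('a', 'w')
next(second)         # resets counter again, yields ('c', 'd')
next(second)         # "c" runs out, so counter becomes 1
rest = list(first)   # "ab" runs out, counter hits 2 and first stops early
# rest == [('b', 'x')], although 'y' and 'z' were never paired with a fill value
```

The first generator is cut short because the second one already advanced the shared counter, which is the kind of interference described above.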

You could just explicitly pass a reference to a mutable object into sentinel, and then into ret. It looks like your code controls all the calls to them. Function parameters are the original and boring way of transferring references between scopes!
