Python izip that cycles through all iterables until the longest finishes

Developer (开发者) https://www.devze.com 2023-01-24 04:52 Source: the web

This turned out not to be a trivial task for me, and I couldn't find a recipe for it, so maybe you can point me to one, or you have a ready, proper and well-tuned solution? "Proper" means it also works for iterators that do not know their own length (no __len__) and for exhaustible iterators (e.g. chained iterators); "well-tuned" means fast.

Note: an in-place solution is not possible in general, because the iterators' outputs have to be cached in order to re-iterate them (Glenn Maynard pointed that out).
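
To illustrate why the cache is needed, here is a minimal sketch. It assumes Python 3 syntax (the question itself targets Python 2's izip), and the variable names are only illustrative. Once an exhaustible iterator such as a chain object has been consumed, a second pass yields nothing, so its values have to be stored during the first pass.

```python
from itertools import chain

it = chain(range(1), range(2))   # exhaustible iterator, no __len__
first_pass = list(it)            # consumes the iterator
second_pass = list(it)           # nothing is left to re-iterate

print(first_pass)                # [0, 0, 1]
print(second_pass)               # []
print(hasattr(it, '__len__'))    # False
```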

Example usage:

>>> list(izip_cycle(range(2), range(5), range(3)))
[(0, 0, 0), (1, 1, 1), (0, 2, 2), (1, 3, 0), (0, 4, 1)]
>>> from itertools import islice, cycle, chain
>>> list(islice(izip_cycle(cycle(range(1)), chain(range(1), range(2))), 6))
[(0, 0), (0, 0), (0, 1), (0, 0), (0, 0), (0, 1)]


Here is something inspired by itertools.tee and itertools.cycle. It works for any kind of iterable:

try:
    from itertools import izip  # Python 2
except ImportError:
    izip = zip                  # Python 3: the built-in zip is already lazy

class izip_cycle(object):
    def __init__(self, *iterables):
        self.remains = len(iterables)
        self.items = izip(*[self._gen(it) for it in iterables])

    def __iter__(self):
        return self.items

    def _gen(self, src):
        q = []
        for item in src:
            yield item
            q.append(item)

        # done with this src
        self.remains -= 1
        # if there are any other sources then cycle this one
        # the last source remaining stops here and thus stops the izip
        # (an empty source also stops here: there is nothing to cycle)
        if self.remains and q:
            while True:
                for item in q:
                    yield item


A simple approach that might work for you, depending on your requirements, is:

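import itertools

try:
    izip = itertools.izip  # Python 2
except AttributeError:
    izip = zip             # Python 3: the built-in zip is already lazy

def izip_cycle(*colls):
    # inputs without __len__ count as length 0
    maxlen = max(len(c) if hasattr(c, '__len__') else 0 for c in colls)
    g = izip(*[itertools.cycle(c) for c in colls])

    # take exactly maxlen rows from the endless zip
    for row in itertools.islice(g, maxlen):
        yield row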

The first thing this does is find the length of the longest sequence, so it knows how many times to repeat. Sequences without __len__ are counted as having length 0; this might be what you want: if you have an unending sequence, you probably want to repeat over the finite sequences. However, this doesn't handle finite iterators with no length.

Next we use itertools.cycle to create a cycling version of each iterator and then use itertools.izip to zip them together.

Finally we yield each entry from our zip until we've given our desired number of results.
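
The limitation with unsized inputs can be seen in a small sketch. This is a Python 3 port of the len()-based idea, written only for illustration; the name izip_cycle_by_len is hypothetical, and the built-in zip stands in for izip.

```python
import itertools

def izip_cycle_by_len(*colls):
    # Hypothetical Python 3 port of the len()-based approach.
    # Inputs without __len__ are counted as length 0.
    maxlen = max(len(c) if hasattr(c, '__len__') else 0 for c in colls)
    cycled = zip(*[itertools.cycle(c) for c in colls])
    return itertools.islice(cycled, maxlen)

sized = list(izip_cycle_by_len(range(2), range(5), range(3)))
print(sized)    # [(0, 0, 0), (1, 1, 1), (0, 2, 2), (1, 3, 0), (0, 4, 1)]

# A generator has no __len__, so the result is cut short after
# len(range(2)) == 2 rows even though the generator holds 5 items.
unsized = list(izip_cycle_by_len(range(2), (x for x in range(5))))
print(unsized)  # [(0, 0), (1, 1)]
```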

If you want this to work for finite iterators with no len, we need to do more of the work ourselves:

import itertools

def izip_cycle(*colls):
    iters = [iter(c) for c in colls]
    count = len(colls)
    saved = [[] for _ in range(count)]  # items seen so far, per input
    exhausted = [False] * count         # which inputs have run out

    while True:
        r = []

        for i in range(count):
            if not exhausted[i]:
                try:
                    n = next(iters[i])
                    saved[i].append(n)
                    r.append(n)
                except StopIteration:
                    exhausted[i] = True
                    # stop when every input has run out, or when this
                    # input was empty and there is nothing to cycle
                    if all(exhausted) or not saved[i]:
                        return
                    saved[i] = itertools.cycle(saved[i])
            if exhausted[i]:
                r.append(next(saved[i]))

        yield tuple(r)

This is basically an extension of the Python implementation of itertools.cycle in the documentation to run over multiple sequences. We save up the items we've seen in saved so that we can repeat them, and track which sequences have run out in exhausted.

As this version waits for all the sequences to run out, if you pass in something infinite the cycling will run on forever.
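
With an infinite input, the consumer therefore has to bound the result itself, for example with itertools.islice. The following is a compact sketch of the caching approach, assuming Python 3; the name izip_cycle_py3 is hypothetical and the code is a rewrite for illustration, not the answer's exact code.

```python
from itertools import cycle, islice

def izip_cycle_py3(*colls):
    # Hypothetical Python 3 sketch of the caching approach.
    iters = [iter(c) for c in colls]
    saved = [[] for _ in iters]      # items seen so far, per input
    done = [False] * len(iters)      # which inputs have run out
    while iters:
        row = []
        for i in range(len(iters)):
            if not done[i]:
                try:
                    item = next(iters[i])
                    saved[i].append(item)
                except StopIteration:
                    done[i] = True
                    # stop when all inputs ran out or this one was empty
                    if all(done) or not saved[i]:
                        return
                    # from now on, replay the cached items forever
                    iters[i] = cycle(saved[i])
                    item = next(iters[i])
            else:
                item = next(iters[i])
            row.append(item)
        yield tuple(row)

# The first input never ends, so islice bounds the output.
endless = izip_cycle_py3(cycle(range(1)), iter([0, 0, 1]))
first_six = list(islice(endless, 6))
print(first_six)  # [(0, 0), (0, 0), (0, 1), (0, 0), (0, 0), (0, 1)]
```

Note that in this sketch the cache for the infinite input keeps growing for as long as rows are requested, since its items are saved just like those of any other input.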


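def izip_cycle_inplace(*iterables):
    # Caches nothing, so every input must be re-iterable (list, tuple, ...).
    def wrap(it):
        empty = True
        for x in it: empty = yield x  # when driven by next(), yield gives None
        if empty: return              # empty input: nothing to cycle
        try:
            next(counter)             # record that this input finished once
        except StopIteration:
            return                    # it was the last one: end the zip
        while True:
            empty = True
            for x in it: empty = yield x
            if empty: raise ValueError('cannot cycle iterator in-place')
    iterators = [wrap(i) for i in iterables]
    counter = iter(iterators)
    try:
        next(counter)  # pre-consume one slot so the last finisher stops us
        while True:
            yield [next(i) for i in iterators]
    except StopIteration:
        # caught explicitly: since Python 3.7 (PEP 479) a StopIteration
        # escaping a generator becomes a RuntimeError
        return

def izip_cycle(*iterables):
    # Caches each input's elements, so it works for any iterable.
    def wrap(it):
        elements = []
        for x in it:
            yield x
            elements.append(x)
        if not elements: return       # empty input: nothing to cycle
        try:
            next(counter)             # record that this input finished once
        except StopIteration:
            return                    # it was the last one: end the zip
        while True:
            for x in elements: yield x
    iterators = [wrap(i) for i in iterables]
    counter = iter(iterators)
    try:
        next(counter)  # pre-consume one slot so the last finisher stops us
        while True:
            yield [next(i) for i in iterators]
    except StopIteration:
        return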