Python generator pre-fetch?

I have a generator that takes a long time for each iteration to run. Is there a standard way to have it yield a value, then generate the next value while waiting to be called again?

The generator would be called each time a button is pressed in a GUI, and the user would be expected to consider the result after each button press.

EDIT: a workaround might be:

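def initialize():
    global res
    res = next(gen, None)      # pre-fetch the first value (None once exhausted)

def btn_callback():
    global res
    if res is None:            # nothing left to show
        return
    display(res)
    res = next(gen, None)      # pre-fetch the value for the next press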

If I wanted to do something like your workaround, I'd write a class like this:

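class PrefetchedGenerator(object):
    def __init__(self, generator):
        self._generator = generator
        self._data = next(generator)  # fetch the first item eagerly
        self._ready = True

    def __iter__(self):
        return self

    def next(self):
        if not self._ready:
            self.prefetch()  # nothing was pre-fetched, so fetch it now
        self._ready = False
        return self._data

    __next__ = next  # Python 3 iterator protocol

    def prefetch(self):
        # Safe to call any number of times: only fetches when the slot is empty
        if not self._ready:
            self._data = next(self._generator)
            self._ready = True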

It is more complicated than your version because it copes both with prefetch never being called and with prefetch being called too many times. The basic idea is that you call .next() when you want the next item, and you call prefetch() whenever you have time to kill.
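One thing the class above leaves open is what happens once the generator runs out. The following is only a sketch of one way to handle that, keeping the same .next() / prefetch() call pattern; the name `PrefetchingIterator` and the two marker objects are made up for illustration and are not part of the answer above.

```python
_EMPTY = object()      # hypothetical marker: nothing has been pre-fetched
_EXHAUSTED = object()  # hypothetical marker: the source has no more items


class PrefetchingIterator(object):
    """Sketch: one-item look-ahead that remembers when the source is done."""

    def __init__(self, iterable):
        self._source = iter(iterable)
        self._slot = _EMPTY
        self.prefetch()  # fetch the first item eagerly

    def prefetch(self):
        # Harmless to call repeatedly: only fetches when the slot is empty.
        if self._slot is _EMPTY:
            self._slot = next(self._source, _EXHAUSTED)

    def __iter__(self):
        return self

    def __next__(self):
        self.prefetch()  # covers the case where prefetch() was never called
        if self._slot is _EXHAUSTED:
            raise StopIteration
        item, self._slot = self._slot, _EMPTY
        return item

    next = __next__  # Python 2 spelling, so .next() also works
```

In a button callback you would display `it.next()` and then call `it.prefetch()`. Note that prefetch() still runs in the calling thread, so it only helps if the callback can afford to block after the result has been shown.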

Your other option is a thread:

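import threading
try:
    import Queue            # Python 2
except ImportError:
    import queue as Queue   # Python 3

class BackgroundGenerator(threading.Thread):
    def __init__(self, generator):
        threading.Thread.__init__(self)
        self.queue = Queue.Queue(1)   # holds at most one pre-fetched item
        self.generator = generator
        self.daemon = True            # don't keep the program alive on exit
        self.start()

    def run(self):
        # Runs in the background thread: compute items and queue them up
        for item in self.generator:
            self.queue.put(item)
        self.queue.put(None)          # end marker; assumes None is never a real item

    def __iter__(self):
        return self

    def next(self):
        next_item = self.queue.get()  # blocks until an item is available
        if next_item is None:
            raise StopIteration
        return next_item

    __next__ = next                   # Python 3 iterator protocol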

This runs the generator in a separate thread from your main application, so your GUI stays responsive while each item is being computed. A call to next() only blocks if the next item is not ready yet.
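The thread-and-queue idea can also be sketched as a plain function. This variant is an assumption-laden illustration rather than the answer's own code: the names `background`, `lookahead` and `_DONE` are invented here. It uses a unique marker object instead of None, so a generator that legitimately yields None is not cut short, and it passes an exception raised by the generator on to the consumer.

```python
import threading
try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2

_DONE = object()  # hypothetical end marker; cannot collide with a real item


def background(iterable, lookahead=1):
    """Return an iterator over `iterable` whose items are computed in a thread."""
    items = queue.Queue(lookahead)

    def produce():
        try:
            for item in iterable:
                items.put((item, None))
            items.put((_DONE, None))
        except Exception as exc:
            items.put((_DONE, exc))  # hand the error to the consumer

    worker = threading.Thread(target=produce)
    worker.daemon = True
    worker.start()  # starts computing now, before the first item is requested

    def consume():
        while True:
            item, error = items.get()  # blocks only if nothing is ready yet
            if error is not None:
                raise error
            if item is _DONE:
                return
            yield item

    return consume()
```

The thread is started in the outer function on purpose: if `background` itself were a generator function, nothing would run until the first next() call, and the first item would not be pre-fetched.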

No. A generator is not asynchronous. This isn't multiprocessing.

If you want to avoid waiting for the calculation, you should use the multiprocessing package so that an independent process can do your expensive calculation.

You want a separate process which is calculating and enqueueing results.

Your "generator" can then simply dequeue the available results.
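A minimal sketch of that arrangement, assuming the expensive work can be expressed as a module-level function: `slow_squares`, `_worker` and `prefetched_squares` are hypothetical names, and None is used as the end marker on the assumption that it is never a real result.

```python
import multiprocessing
import time


def slow_squares(n):
    """Hypothetical stand-in for the expensive generator."""
    for i in range(1, n + 1):
        time.sleep(0.1)  # pretend this is the long computation
        yield i * i


def _worker(results, n):
    # Runs in the child process: compute each value and enqueue it.
    for value in slow_squares(n):
        results.put(value)
    results.put(None)  # end marker (assumes None is never a real result)


def prefetched_squares(n, lookahead=1):
    """Return an iterator that dequeues results computed by a child process."""
    results = multiprocessing.Queue(lookahead)
    child = multiprocessing.Process(target=_worker, args=(results, n))
    child.daemon = True
    child.start()  # the child starts calculating right away

    def dequeue():
        while True:
            value = results.get()  # blocks only if the child is still busy
            if value is None:
                break
            yield value
        child.join()

    return dequeue()


if __name__ == "__main__":
    for value in prefetched_squares(3):
        print(value)
```

The `if __name__ == "__main__":` guard matters on platforms where multiprocessing starts children by re-importing the main module. Results must be picklable to travel through the queue.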

You can definitely do this with generators: write your generator with two yield statements per value, so that successive next calls alternate between computing the next value and returning it. Here is an example:

import itertools, time

def quick_gen():
    counter = itertools.count()
    def long_running_func():
        time.sleep(2)  # stand-in for the expensive computation
        return next(counter)
    while True:
        x = long_running_func()
        yield    # setup call ends here: the work is done, nothing returned yet
        yield x  # the following call hands back the pre-computed value

>>> itr = quick_gen()
>>> next(itr)   # setup call, takes two seconds
>>> next(itr)   # returns immediately
0
>>> next(itr)   # setup call, takes two seconds
>>> next(itr)   # returns immediately
1

Note that the generator does not automatically do the processing to get the next value; it is up to the caller to call next twice for each value. For your use case you would call next once as a setup step, and then each time the user clicks the button you would display the next value generated, then call next again for the pre-fetch.
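The call pattern described above could be wired up roughly like this. It is a sketch only: `ButtonHandler`, its `display` callback, and the `compute` parameter on `quick_gen` are hypothetical, and a list's append method stands in for the GUI.

```python
import itertools


def quick_gen(compute):
    """Two-yield generator: calls alternate between computing and returning."""
    for i in itertools.count():
        x = compute(i)  # the slow part happens here
        yield           # setup/pre-fetch call ends here
        yield x         # value call ends here, with no further work


class ButtonHandler(object):
    def __init__(self, itr, display):
        self._itr = itr
        self._display = display
        next(self._itr)  # one-time setup call: computes the first value

    def on_click(self):
        self._display(next(self._itr))  # already computed, returns at once
        next(self._itr)                 # pre-fetch the value for the next click


shown = []
handler = ButtonHandler(quick_gen(lambda i: i * i), shown.append)
handler.on_click()
handler.on_click()
print(shown)  # [0, 1]
```

The pre-fetch call still runs in the caller's thread, so in a real GUI the display would have to be repainted before that second next call starts blocking.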

I was after something similar. I wanted the iterator to return a value quickly (if it could) while a background thread computed the one after it.

import threading
import time
try:
    import Queue            # Python 2
except ImportError:
    import queue as Queue   # Python 3

class MyGen(object):
    def __init__(self):
        self.queue = Queue.Queue()
        # Start a thread that computes the first element and queues it
        self.i = 1
        self.t = threading.Thread(target=self.worker, args=(self.queue, self.i))
        self.t.start()

    def __iter__(self):
        return self

    def worker(self, queue, i):
        time.sleep(1) # Take a while to process
        queue.put(i**2)

    def __del__(self):
        self.stop()

    def stop(self):
        while True: # Flush the queue
            try:
                self.queue.get(False)
            except Queue.Empty:
                break
        self.t.join()

    def next(self):
        # Wait for the pending computation, then start a thread for the one after it
        self.t.join()
        self.i += 1
        self.t = threading.Thread(target=self.worker, args=(self.queue, self.i))
        self.t.start()

        # Now deliver the already-queued element
        while True:
            try:
                print("request at %s" % time.time())
                obj = self.queue.get(False)
                self.queue.task_done()
                return obj
            except Queue.Empty:
                pass
            time.sleep(.001)

    __next__ = next  # Python 3 iterator protocol

if __name__ == '__main__':
    f = MyGen()
    for i in range(5):
        # time.sleep(2)  # Uncomment to simulate a consumer that pauses between requests
        print("*********")
        print(f.next())
        print("returned at %s" % time.time())

The code above gave the following results:

*********
request at 1342462505.96
1
returned at 1342462505.96
*********
request at 1342462506.96
4
returned at 1342462506.96
*********
request at 1342462507.96
9
returned at 1342462507.96
*********
request at 1342462508.96
16
returned at 1342462508.96
*********
request at 1342462509.96
25
returned at 1342462509.96
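For comparison, the same one-ahead pattern can be sketched with concurrent.futures from the Python 3 standard library, which takes care of the thread and the hand-off. This is an alternative illustration, not the code that produced the output above; `OneAheadSquares` and its `delay` parameter are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
import time


class OneAheadSquares:
    """Sketch: always keeps exactly one computation running ahead."""

    def __init__(self, delay=1.0):
        self._delay = delay
        self._i = 1
        self._pool = ThreadPoolExecutor(max_workers=1)
        # Start computing the first value immediately
        self._pending = self._pool.submit(self._work, self._i)

    def _work(self, i):
        time.sleep(self._delay)  # take a while to process
        return i ** 2

    def __iter__(self):
        return self

    def __next__(self):
        result = self._pending.result()  # waits only if it is not ready yet
        self._i += 1
        self._pending = self._pool.submit(self._work, self._i)
        return result

    def close(self):
        self._pool.shutdown(wait=True)  # lets the in-flight computation finish
```

Calling next() on an instance returns 1, 4, 9 and so on. If the caller pauses for longer than `delay` between requests, each call should return almost immediately because the Future has already completed.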