I have been trying all kinds of solutions to the following problem, to no avail.
I have a large number of (Python) modules/scripts and a distinguished script, K.py.
When K.py is executed, it generates some information, say a country name. Now, amongst the other modules (hundreds) there will be modules that can be executed with the information (the country name, in this example) generated by K.py passed to them as input. Recursively, each of those modules will generate some information (town names, street numbers, etc.) that can serve as input for other modules, and so on. This will of course result in a binary tree of executed scripts.
Points to note.
- the modules/scripts (hundreds) above may run independently (they don't inter-depend in any way whatsoever)
- I should be able to deliver a verdict once all modules have finished execution (that is, the running K.py must block until the triggered binary tree of executing modules is 'joined').
- If, for each piece of information I and each runnable script S (that is, S may run with I as input), I decide to create a new thread, I might end up with an exponential number of threads (no?).
How can I use Python threads (any of the APIs) to 'safely' implement a solution? (Pseudo-code?)
Thanks in advance for your wisdom.
The usual way to solve this is to create a worker queue and store the individual jobs there. So you need some kind of programmatic representation of the work that should be done by one thread.
If you have that, you can use the multiprocessing package, which offers a "thread" pool (see section 16.3.1.5 of the docs) and a thread-/process-safe queue to store the jobs.
Now every process takes one job from the queue, executes it (possibly adding new jobs to the queue), and when it finishes takes the next one. You're finished when the queue is empty.
Note that this uses the multiprocessing package because, at least in CPython with the GIL, a multithreaded Python program is only advantageous in the case of heavy I/O or other blocking activities.
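A minimal sketch of that pattern, using multiprocessing.Process workers and a JoinableQueue rather than a pool; the worker function, the number of workers, and example_job are illustrative assumptions standing in for "run script S with input I":

```python
import multiprocessing

def worker(queue):
    # Each worker repeatedly pulls a job, runs it, and puts any
    # follow-up jobs it produces back onto the same queue.
    while True:
        job = queue.get()
        if job is None:              # sentinel: no more work
            queue.task_done()
            break
        for follow_up in job():      # a job returns the jobs it spawns
            queue.put(follow_up)
        queue.task_done()

def example_job():
    # Hypothetical placeholder for "run script S with input I";
    # a real job would return new jobs built from its output.
    return []

if __name__ == "__main__":
    queue = multiprocessing.JoinableQueue()
    workers = [multiprocessing.Process(target=worker, args=(queue,))
               for _ in range(4)]
    for w in workers:
        w.start()

    queue.put(example_job)           # seed with the job K.py produces
    queue.join()                     # blocks until every job, and every
                                     # job it spawned, has been processed

    for _ in workers:                # tell the workers to stop
        queue.put(None)
    for w in workers:
        w.join()
```

The queue.join() call gives the blocking behaviour you asked for: the main script (K.py in your setup) does not proceed until the whole tree of spawned jobs has been worked off.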