How do I run two python loops concurrently?_问答_开发者

Suppose I have the following in Python

# A loop
for i in range(10000):
    Do Task开发者_StackOverflow社区 A

# B loop
for i in range(10000):
    Do Task B

How do I run these loops simultaneously in Python?

If you want concurrency, here's a very simple example:

from multiprocessing import Process

def loop_a():
    while 1:
        print("a")

def loop_b():
    while 1:
        print("b")

if __name__ == '__main__':
    Process(target=loop_a).start()
    Process(target=loop_b).start()

This is just the most basic example I could think of. Be sure to read http://docs.python.org/library/multiprocessing.html to understand what's happening.

If you want to send data back to the program, I'd recommend using a Queue (which in my experience is easiest to use).

You can use a thread instead if you don't mind the global interpreter lock. Processes are more expensive to instantiate but they offer true concurrency.

There are many possible options for what you wanted:

use loop

As many people have pointed out, this is the simplest way.

for i in xrange(10000):
    # use xrange instead of range
    taskA()
    taskB()

Merits: easy to understand and use, no extra library needed.

Drawbacks: taskB must be done after taskA, or otherwise. They can't be running simultaneously.

multiprocess

Another thought would be: run two processes at the same time, python provides multiprocess library, the following is a simple example:

from multiprocessing import Process


p1 = Process(target=taskA, args=(*args, **kwargs))
p2 = Process(target=taskB, args=(*args, **kwargs))

p1.start()
p2.start()

merits: task can be run simultaneously in the background, you can control tasks(end, stop them etc), tasks can exchange data, can be synchronized if they compete the same resources etc.

drawbacks: too heavy!OS will frequently switch between them, they have their own data space even if data is redundant. If you have a lot tasks (say 100 or more), it's not what you want.

threading

threading is like process, just lightweight. check out this post. Their usage is quite similar:

import threading 


p1 = threading.Thread(target=taskA, args=(*args, **kwargs))
p2 = threading.Thread(target=taskB, args=(*args, **kwargs))

p1.start()
p2.start()

coroutines

libraries like greenlet and gevent provides something called coroutines, which is supposed to be faster than threading. No examples provided, please google how to use them if you're interested.

merits: more flexible and lightweight

drawbacks: extra library needed, learning curve.

Why do you want to run the two processes at the same time? Is it because you think they will go faster (there is a good chance that they wont). Why not run the tasks in the same loop, e.g.

for i in range(10000):
    doTaskA()
    doTaskB()

The obvious answer to your question is to use threads - see the python threading module. However threading is a big subject and has many pitfalls, so read up on it before you go down that route.

Alternatively you could run the tasks in separate proccesses, using the python multiprocessing module. If both tasks are CPU intensive this will make better use of multiple cores on your computer.

There are other options such as coroutines, stackless tasklets, greenlets, CSP etc, but Without knowing more about Task A and Task B and why they need to be run at the same time it is impossible to give a more specific answer.

from threading import Thread
def loopA():
    for i in range(10000):
        #Do task A
def loopB():
    for i in range(10000):
        #Do task B
threadA = Thread(target = loopA)
threadB = Thread(target = loobB)
threadA.run()
threadB.run()
# Do work indepedent of loopA and loopB 
threadA.join()
threadB.join()

You could use threading or multiprocessing.

How about: A loop for i in range(10000): Do Task A, Do Task B ? Without more information i dont have a better answer.

I find that using the "pool" submodule within "multiprocessing" works amazingly for executing multiple processes at once within a Python Script.

See Section: Using a pool of workers

Look carefully at "# launching multiple evaluations asynchronously may use more processes" in the example. Once you understand what those lines are doing, the following example I constructed will make a lot of sense.

import numpy as np
from multiprocessing import Pool

def desired_function(option, processes, data, etc...):
    # your code will go here. option allows you to make choices within your script
    # to execute desired sections of code for each pool or subprocess.

    return result_array   # "for example"


result_array = np.zeros("some shape")  # This is normally populated by 1 loop, lets try 4.
processes = 4
pool = Pool(processes=processes)
args = (processes, data, etc...)    # Arguments to be passed into desired function.

multiple_results = []
for i in range(processes):          # Executes each pool w/ option (1-4 in this case).
    multiple_results.append(pool.apply_async(param_process, (i+1,)+args)) # Syncs each.

results = np.array(res.get() for res in multiple_results)  # Retrieves results after
                                                           # every pool is finished!

for i in range(processes):
    result_array = result_array + results[i]  # Combines all datasets!

The code will basically run the desired function for a set number of processes. You will have to carefully make sure your function can distinguish between each process (hence why I added the variable "option".) Additionally, it doesn't have to be an array that is being populated in the end, but for my example, that's how I used it. Hope this simplifies or helps you better understand the power of multiprocessing in Python!