Currently, I have a driver program that runs several thousand instances of a "payload" program and does some post-processing of the output. The driver currently calls the payload program directly, using a shell()
function, from multiple threads. The shell()
function executes a command in the current working directory, blocks until the command is finished running, and returns the data that was sent to stdout by the command. This works well on a single multicore machine. I want to modify the driver to submit qsub
jobs to a large compute cluster instead, for more parallelism.
Is there a way to make the qsub
command output its results to stdout instead of a file and block until the job is finished? Basically, I want it to act as much 开发者_StackOverflow社区like "normal" execution of a command as possible, so that I can parallelize to the cluster with as little modification of my driver program as possible.
Edit: I thought all the grid engines were pretty much standardized. If they're not and it matters, I'm using Torque.
You don't mention what queuing system you're using, but SGE supports the '-sync y' option to qsub which will cause it to block until the job completes or exits.
In TORQUE this is done using the -x and -I options. qsub -I specifies that it should be interactive and -x says run only the command specified. For example:
qsub -I -x myscript.sh
will not return until myscript.sh finishes execution.
In PBS you can use qsub -Wblock=true <command>
精彩评论