I have a Java program that does individual jobs, e.g. it takes in a file, does some processing on it and creates a new file. To run it I have to type the following on the command line:
java myprogram.jar -input myfile1.txt -output output/myfile1.txt
However, I wish to batch process a few thousand files, so I would like to increment the number at the end of the myfile part of the string. Once the first job has finished, the second job should start, and so on, rather than having thousands of instances of the Java program running at the same time.
Any help would be appreciated.
Jon
I would use bash or something similar, but if you need to use Python, you can use subprocess.call to do this:
from subprocess import call
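for i in range(1, 1000):
    # call() waits for each job to finish before the loop starts the next one
    call(["java", "myprogram.jar", "-input", "myfile%d.txt" % i,
          "-output", "output/myfile%d.txt" % i])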
This is a perfect use for a bash script (if you're in a *nix environment) or a .bat file if you are in Windows. Bash example:
#!/bin/bash
for i in {1..5}
do
    java myprogram.jar -input "myfile$i.txt" -output "output/myfile$i.txt"
done
I would suggest modifying your Java program to handle a whole directory: instead of handing over individual files, pass it a directory, and have the program process every file in it and write out one output file per input. Use a simple name-mapping scheme for the output. That way you could use threads to handle several files at once, should you want to boost speed on multi-core boxes. It also keeps your overhead low, because only one JVM is running.
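To make the directory idea concrete, here is a rough sketch rather than a drop-in solution. It assumes Java 11 or later, that the inputs are the *.txt files directly inside one directory, and that each output keeps its input's file name. The class name DirectoryBatch and the method processFile are made up for illustration; processFile only upper-cases the text and stands in for whatever your program really does.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class DirectoryBatch {

    // Placeholder for the real per-file work (an assumption for this sketch);
    // here it only upper-cases the text.
    static void processFile(Path input, Path output) throws IOException {
        String text = Files.readString(input);
        Files.writeString(output, text.toUpperCase());
    }

    // Processes every *.txt file directly inside inputDir and writes a file
    // of the same name into outputDir, using at most `threads` workers.
    // Returns the number of files processed.
    static int processDirectory(Path inputDir, Path outputDir, int threads)
            throws IOException, InterruptedException, ExecutionException {
        Files.createDirectories(outputDir);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<?>> pending = new ArrayList<>();
        try (DirectoryStream<Path> files =
                 Files.newDirectoryStream(inputDir, "*.txt")) {
            for (Path input : files) {
                Path output = outputDir.resolve(input.getFileName());
                pending.add(pool.submit(() -> {
                    try {
                        processFile(input, output);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                }));
            }
            for (Future<?> job : pending) {
                job.get(); // wait for the job; rethrows its failure, if any
            }
        } finally {
            pool.shutdown();
        }
        return pending.size();
    }

    public static void main(String[] args) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();
        int n = processDirectory(Path.of(args[0]), Path.of(args[1]), cores);
        System.out.println("Processed " + n + " files");
    }
}
```

Passing 1 as the thread count gives you the strictly one-after-another behaviour from the question, still inside a single JVM.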
You don't have to modify your Java program to do this. You could write a new program that calls the existing code from within a single JVM.
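A wrapper along those lines might look roughly like this. It is only a sketch: BatchDriver, argsFor and runAll are names invented here, and the lambda in main is a stand-in for the real entry point, since the actual class name of the program is not given in the question.

```java
import java.util.function.Consumer;

public class BatchDriver {

    // Builds the same arguments the command line would pass for file number i.
    static String[] argsFor(int i) {
        return new String[] {
            "-input", "myfile" + i + ".txt",
            "-output", "output/myfile" + i + ".txt"
        };
    }

    // Calls the entry point once per file number, one after another,
    // so job i+1 only starts after job i has returned.
    static void runAll(int first, int last, Consumer<String[]> entryPoint) {
        for (int i = first; i <= last; i++) {
            entryPoint.accept(argsFor(i));
        }
    }

    public static void main(String[] args) {
        // Stand-in for the existing program: it just prints the arguments.
        // Swap in something like MyProgram::main (hypothetical class name)
        // once the existing code is on the classpath.
        Consumer<String[]> existingMain =
            a -> System.out.println(String.join(" ", a));
        runAll(1, 3, existingMain);
    }
}
```

Two caveats: a method reference to the existing main only fits Consumer<String[]> if that main declares no checked exceptions, and if the existing code calls System.exit the whole batch ends with it.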