script to time limit process on posix

I need to call system() and popen() to run a command cmd such that the process is time-limited. If the process completes, the result code must be the same as it would be without the limit; otherwise it must be possible to detect that it timed out. It must run on POSIX systems (at least Linux and OS X).

At least on OS X, sh -c "ulimit -t n; cmd" does not work from an interactive prompt. (ulimit -t n; cmd) works, but limits the command process too (making that shell useless). That might not matter.
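A minimal sketch of the sub-shell form, assuming a Bourne-style shell whose ulimit builtin accepts -t (bash and dash do). The name cpu_limited is hypothetical. Note that ulimit -t caps CPU seconds, not elapsed time, so a command that sleeps or blocks on I/O is not stopped by it. The exec replaces the sub-shell with the command, so no limited shell is left over, but it means only an executable can be run this way.

```shell
#!/bin/sh
# cpu_limited SECONDS CMD [ARG ...]   (hypothetical helper name)
# Run CMD with a CPU-time limit; the exit status is that of CMD.
cpu_limited() {
    cpu_limit=$1
    shift
    # The parentheses confine the limit to a sub-shell; exec then replaces
    # that sub-shell with the command, so only the command is limited.
    ( ulimit -t "$cpu_limit" && exec "$@" )
}
```

For example, cpu_limited 10 cmd arg runs cmd with at most 10 seconds of CPU time and leaves the calling shell unrestricted.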

An external script is undesirable because it makes the program that uses it hard to move; if I were willing to put up with that, I could just write a C program.

My alternative is to use fork()/exec() inside my program: I can do that but the code is rather ugly (requires at least two forks, messing with file descriptors), and will only run an executable.

Linux has a timed-process program but it isn't available on OSX (and I can't find the source code). A similar question was asked on SO with slightly different requirements: I need the correct return code and it has to work with popen().

There's a Perl script here that preserves the exit code of a command that completes or returns 255 if it times out. It works similarly to timeout which was written by Wietse Venema (author of Postfix) and first released as a part of SATAN.

Here is the C source to a different version which is part of GNU coreutils as of version 7 (2008-10-05) [beta].

There is a program 'timeout' in GNU coreutils which can easily be compiled for MacOS X (I found it on my machine, looking for my own program of the same name).
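For reference, a hedged sketch of calling that program from a shell function. It assumes the GNU implementation, which exits with status 124 when the time limit is reached and otherwise with the command's own status. The fallback name gtimeout is an assumption about how the coreutils build may be installed on OS X, and run_limited is a hypothetical wrapper name.

```shell
#!/bin/sh
# run_limited SECONDS CMD [ARG ...]   (hypothetical wrapper name)
# Delegates to GNU timeout; assumes "timeout" on the PATH is the GNU one.
run_limited() {
    rl_limit=$1
    shift
    if command -v timeout >/dev/null 2>&1; then
        timeout "$rl_limit" "$@"
    elif command -v gtimeout >/dev/null 2>&1; then
        # Assumed alternative name for a g-prefixed coreutils install.
        gtimeout "$rl_limit" "$@"
    else
        echo "run_limited: no timeout program found" >&2
        return 127
    fi
}
```

A caller can then compare $? with 124 to tell a timeout from a normal exit, with the caveat that a command which itself exits with 124 is indistinguishable.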

Alternatively, I have a version that I originally wrote circa 1989. There are a couple of tricks that were needed to make it work properly on MacOS X. (Contact me if you need the code - see my profile.)

Here is a shell script which approximates what you want. It cannot meet all your requirements: specifically, it cannot relay the exit status back reliably. It is already contorted enough to make the C program trivial. The shell (bash, at any rate) does not help by not re-evaluating '$$' (the current process ID) for sub-shells; it persists in thinking that the process ID of sub-shells is the same as the parent process.
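The '$$' behaviour can be seen with a small sketch. The function names are hypothetical, and BASHPID is assumed to be available (bash 4.0 or later; as far as I know the bash 3.2 shipped with OS X lacks it).

```shell
#!/bin/bash
# Hypothetical demo functions; each body runs in a ( ... ) sub-shell.

# Prints $$ as seen inside a sub-shell: still the PID of the top-level shell.
top_pid_seen_in_subshell() {
    ( echo "$$" )
}

# Prints BASHPID inside a sub-shell: the PID of the sub-shell process itself.
own_pid_in_subshell() {
    ( echo "$BASHPID" )
}
```

Run from a script, the first function prints the same number as echo $$ at top level, while the second prints a different one.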

#!/bin/bash
# timeout -t time [-s signal] cmd [arg ...]

usage()
{
    echo "Usage: $(basename "$0" .sh) -t time [-s signal] cmd [arg ...]" 1>&2
    exit 1
}

signal=15
while getopts t:s: opt "$@"
do
    case $opt in
    (t) time=$OPTARG;;
    (s) signal=$OPTARG;;
    (*)  usage;;
    esac
done
shift $((OPTIND - 1))
[ $# -ne 0 ]   || usage
[ -n "$time" ] || usage

pid_top=$$
# Run the command
(
    ("$@") &
    pid_sub=$!
    echo $pid_sub > .pid.$pid_top
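    # On SIGTERM: kill the command with the requested signal, then tell the parent.
    # (kill needs a leading '-' on the signal; a bare number is taken as a PID.)
    trap "kill -$signal $pid_sub 2>/dev/null;
          kill -15 $pid_top 2>/dev/null; exit 1" 15
    wait
    # Command finished by itself: send SIGTERM to the parent so it cleans up
    kill -15 $pid_top 2>/dev/null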
) &
pid_shell=$!
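# Wait until the sub-shell has written the PID file, then read and remove it
while [ ! -s .pid.$pid_top ]; do sleep 1; done
pid_cmd=$(cat .pid.$pid_top; rm -f .pid.$pid_top)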
# Watchdog timer
(sleep $time; kill -$signal $pid_cmd 2>/dev/null;
 kill -15 $pid_shell $pid_top) &
pid_watch=$!
# Cleanup!
trap "kill -$signal $pid_cmd 2>/dev/null;
      kill -15 $pid_shell $pid_watch 2>/dev/null; exit 1" 15
wait

The argument processing logic is basically standard - you must specify a time, you may specify a signal, you must specify a command, you may specify arguments for the command.

The main shell script notes its own process ID in variable pid_top for convenience of the sub-shells (though '$$' probably would work).

The main shell then runs a background sub-shell to run the command, but there are shenanigans necessary here. First, the sub-shell runs the actual command in background, capturing the sub-process ID in pid_sub. Then it echoes that PID to the file '.pid.$pid_top', so that the parent shell can read it and arrange to send death threats to the actual command if the watchdog times out. It then arranges to trap signal 15 (SIGTERM); on receipt of such a signal, it kills the actual command with the signal requested on the command line, and also sends a terminate signal to the parent process. Then it goes into a wait. If the command completes, it sends a terminate signal to the parent process, letting the parent know that all is OK.

Back in the main shell, it captures the process ID of the background sub-shell in pid_shell and that of the actual command in pid_cmd. Then it runs another sub-shell, the watchdog process. The watchdog sleeps for the timeout period and, when that time is up, sends the requested signal to the actual command and a terminate signal to both the command's sub-shell and the main shell.

So, now there are three processes running in the background (ouch - this is confusing); the actual command, the sub-shell waiting for the command to complete, and the watchdog timer.

The main shell captures the PID of the watchdog and arranges to trap signal 15. On receipt of the signal, it sends the requested signal to the actual command, and a terminate signal to both the sub-shell and the watchdog, even though at least one of these no longer exists by then.

The main shell finally goes into a wait for its children to die. It never actually returns normally from that wait; a signal wakes it and the trap executes, but the net result is that it hangs around until the processes have died.
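The way a trapped signal interrupts wait can be shown in isolation. This is a small sketch with hypothetical names and an arbitrary exit status of 7; it runs the experiment in a fresh sh -c so that '$$' really is the PID of the waiting shell.

```shell
#!/bin/sh
# demo_trap_wait (hypothetical name): a shell blocked in wait receives
# SIGTERM after one second; wait returns early and the trap action runs.
demo_trap_wait() {
    sh -c '
        # A long-running child for wait to block on.
        sleep 30 >/dev/null 2>&1 &
        sleeper=$!
        # $sleeper is expanded now, while the trap is being set.
        trap "echo trapped; kill $sleeper; exit 7" TERM
        # Stand-in for the watchdog: signal this shell after one second.
        ( sleep 1; kill -TERM $$ ) >/dev/null 2>&1 &
        wait
        echo "not reached"
    '
}
```

The line after wait is never executed, because the trap action exits first; that is the same reason the main shell in the script above never returns normally from its wait.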

Then it exits...

To get an accurate status from the executed command is hard. The trouble is that if the shell runs a process in the background, you can't get its exit status; but if you run it synchronously (so you can get the exit status), then the watchdog and the main process don't know its PID and can't ensure it terminates when the timer runs out. I may just be missing the obvious - I certainly hope so because that script is ghastly! For myself, I'll stick with the C program; it uses just two processes and gets the actual command status back cleanly.
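One possibility, offered as a sketch rather than a tested replacement for the script above: POSIX specifies that wait with a PID operand returns the exit status of that background process, so the command can run in the background (making its PID known) and still have its status collected. The names run_with_timeout and rwt_* are hypothetical, the 124 returned on timeout is an assumption borrowed from GNU timeout, and the signal is fixed at SIGTERM for brevity. Because the command runs in the background, a non-interactive shell gives it /dev/null as standard input, and a command that ignores SIGTERM will not be stopped.

```shell
#!/bin/sh
# run_with_timeout SECONDS CMD [ARG ...]   (hypothetical helper name)
# Returns CMD's own exit status, or 124 if the watchdog had to kill it.
run_with_timeout() {
    rwt_limit=$1
    shift

    # Run the command in the background so that its PID is known.
    "$@" &
    rwt_cmd=$!

    # Watchdog: after the sleep, ignore SIGTERM so the cleanup below cannot
    # interrupt it, then signal the command. It exits 0 only if the kill was
    # delivered. Its output is discarded so that a leftover sleep does not
    # hold a popen() pipe open.
    (
        sleep "$rwt_limit"
        trap '' TERM
        kill -TERM "$rwt_cmd"
    ) >/dev/null 2>&1 &
    rwt_watch=$!

    # wait with an explicit PID reports that background job's exit status.
    rwt_status=0
    wait "$rwt_cmd" || rwt_status=$?

    # Stop the watchdog if it is still sleeping, then collect its status.
    kill -TERM "$rwt_watch" 2>/dev/null || true
    if wait "$rwt_watch" 2>/dev/null; then
        return 124      # watchdog fired: the command timed out
    fi
    return "$rwt_status"
}
```

For example, run_with_timeout 5 sh -c 'exit 3' returns 3, while run_with_timeout 1 sleep 10 returns 124 after about a second. There is still a small window at the exact moment of the timeout in which a command that finished by itself can be reported as timed out.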