
How can I debug a Perl program that suddenly exits?

开发者 https://www.devze.com 2023-01-02 15:27 Source: web

I have a Perl program based on IO::Async, and it sometimes just exits after a few hours or days without printing any error message whatsoever. There's nothing in dmesg or /var/log either. STDOUT and STDERR are both set to autoflush(1), so data shouldn't be lost in buffers. It doesn't actually return from IO::Async::Loop->loop_forever: a print I put right after that call, just to make sure, never gets triggered.

Now, one way would be to keep peppering the program with more and more prints and hope one of them gives me a clue. Is there a better way to find out what was going on in a program that made it exit or silently crash?


One trick I've used is to run the program under strace or ltrace (or attach to the process using strace). Naturally that was under Linux. Under other operating systems you'd use ktrace or dtrace or whatever is appropriate.

A trick I've used for programs that only exhibit sparse issues over days or weeks, and then only on a handful among hundreds of systems, is to direct the output from my tracer to a FIFO and have a custom program keep only the last 10K lines in a ring buffer, with a handler on SIGPIPE and SIGHUP to dump the current buffer contents into a file. (It's a simple program, but I don't have a copy handy and I'm not going to rewrite it tonight; my copy was written for internal use and is owned by a former employer.)

The ring buffer allows the program to run indefinitely without fear of running systems out of disk space. We usually only need a few hundred, or at most a couple of thousand, lines of the trace in such matters.
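Since the original ring-buffer program isn't available, here is a minimal sketch of what such a tool might look like in Perl. It is not the author's program: the names ring_push, ring_dump and main, the command-line arguments, and the choice to also dump when the input reaches end-of-file are all assumptions made for illustration.

```perl
#!/usr/bin/perl
# Sketch of a trace "tail keeper": reads lines from STDIN (for example a FIFO
# that a tracer writes to) and keeps only the most recent ones in memory.
# ring_push, ring_dump and main are hypothetical names for this example.
use strict;
use warnings;

# Append one line and drop the oldest lines beyond $max.
sub ring_push {
    my ($ring, $max, $line) = @_;
    push @$ring, $line;
    shift @$ring while @$ring > $max;
    return scalar @$ring;
}

# Write the current buffer contents to $path, oldest line first.
sub ring_dump {
    my ($ring, $path) = @_;
    open(my $out, '>', $path)
        or do { warn "cannot open $path: $!\n"; return 0 };
    print {$out} @$ring;
    close($out)
        or do { warn "cannot close $path: $!\n"; return 0 };
    return 1;
}

sub main {
    my ($dump_file, $max) = @_;
    $max ||= 10_000;    # the 10K lines mentioned above
    my @ring;

    # Perl runs these handlers between operations ("safe signals"),
    # so @ring is never seen half-updated.
    $SIG{HUP}  = sub { ring_dump(\@ring, $dump_file) };
    $SIG{PIPE} = sub { ring_dump(\@ring, $dump_file) };

    while (defined(my $line = <STDIN>)) {
        ring_push(\@ring, $max, $line);
    }

    # Assumption: EOF means the tracer (and so the traced program) is gone,
    # which is the moment whose history we want to keep.
    ring_dump(\@ring, $dump_file);
    return 0;
}

# Run only when invoked directly with a dump-file argument.
exit main(@ARGV) if !caller && @ARGV;
```

Assuming the script is saved as ringbuf.pl (a made-up name), one way to wire it up would be to create a FIFO with mkfifo /tmp/trace.fifo, start perl ringbuf.pl /tmp/trace.tail < /tmp/trace.fifo in the background, and then attach the tracer with strace -f -tt -o /tmp/trace.fifo -p PID. Sending the script a SIGHUP dumps the buffer on demand; the paths here are placeholders.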


If you are capturing STDERR, you could start the program as perl -MCarp::Always foo_prog. Carp::Always (a CPAN module) makes every die and warn print a full stack trace.
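If Carp::Always isn't installed, a rough approximation of the same idea for die can be sketched with the core Carp module alone. This is a different mechanism from the one-liner above, not a copy of what Carp::Always does internally, and parse_packet, on_read and error_with_trace are hypothetical names standing in for real application code.

```perl
use strict;
use warnings;
use Carp ();

# Hypothetical stand-ins for real application code.
sub parse_packet { die "bad packet\n" }
sub on_read      { parse_packet() }

# Run on_read() with a __DIE__ hook that rethrows through Carp::confess,
# and return the resulting error text (message plus call stack).
sub error_with_trace {
    local $SIG{__DIE__} = sub { Carp::confess(@_) };
    my $ok = eval { on_read(); 1 };
    return $ok ? '' : "$@";
}

print STDERR error_with_trace();
```

In a long-running program the hook would more likely be installed once near the top of the main file, without local, so that any fatal error reaching STDERR carries its call stack.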


A sudden exit without any error message may well be caused by SIGPIPE. Traditionally, SIGPIPE is what stops commands like cat in the following pipeline once head has read its ten lines and exited:

cat file | head -10

It doesn't usually result in anything being printed either by libc or perl to indicate what happened.

Since in an IO::Async-based program you'd not want to silently exit on SIGPIPE, my suggestion would be to put somewhere in the main file of the program a line something like

$SIG{PIPE} = sub { die "Aborting on SIGPIPE\n" };

which will at least alert you to this fact. If you instead use Carp::croak and leave off the trailing \n, you might even be lucky enough to get the file and line number of the syswrite (or similar call) that caused the SIGPIPE.
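To see the handler in action without waiting days for the real failure, the situation can be reproduced by writing to a pipe whose read end is already closed. The sketch below is a variation on the suggestion above: it uses Carp::confess rather than croak so that a full call stack is always printed, and write_to_closed_pipe is a hypothetical helper name.

```perl
use strict;
use warnings;
use Carp ();

# Write to a pipe whose read end is already closed, with a SIGPIPE handler
# installed, and return the error text the handler produced.
sub write_to_closed_pipe {
    pipe(my $reader, my $writer) or die "pipe failed: $!";
    close $reader;    # nobody will ever read from this pipe

    local $SIG{PIPE} = sub { Carp::confess("Aborting on SIGPIPE") };

    my $ok = eval {
        my $written = syswrite($writer, "hello\n");    # raises SIGPIPE
        1;
    };
    return $ok ? '' : "$@";
}

print STDERR write_to_closed_pipe();
```

Because Perl defers signal handlers until it is safe to run them, the reported location should be at or just after the write rather than inside it, which is usually close enough to identify the offending filehandle.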
