The Signal Handling Blues

Unix/POSIX signal handling is tricky at the best of times. At worst, it drives people to write articles about it.

In Perl, signal handling feels particularly messy: There can only be one signal handler per signal type at any given time. The handlers are stored in a global hash, %SIG, so in order to catch SIGINT, you do

$SIG{INT} = sub {
    my $signame = shift;
    # cleanup here
    die "Caught a $signame, throwing exception.";
};

There are multiple problems with this. Arguably the worst is related to when signals fire. Perl now has "safe" signals that get handled only at OP boundaries. They will no longer crash your perl-VM hard, but the flip side is that a long-running OP (such as a very slow regular expression) will not be interrupted. System calls, on the other hand, will be liberally interrupted by signals (failing with EINTR) and cannot always be resumed. Few code bases actually handle the particular failure mode in which system calls need to be retried after such a "soft" failure. In short: Signals are fraught with peril.
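To illustrate the retry problem, here is a minimal sketch of a read that survives being interrupted by a signal. The helper name sysread_retry is made up for this example. It assumes that a failed sysread with $! set to EINTR is safe to simply call again, which holds for a plain blocking read but not for every system call.

```perl
use strict;
use warnings;
use Errno qw(EINTR);

# Hypothetical helper: read up to $len bytes from $fh, retrying
# whenever a signal interrupts the underlying read(2) call.
sub sysread_retry {
    my ($fh, $len) = @_;
    my $buf = '';
    while (1) {
        my $n = sysread($fh, $buf, $len);
        return $buf if defined $n;   # data was read, or EOF when $n == 0
        next if $! == EINTR;         # "soft" failure: a signal arrived, try again
        die "sysread failed: $!";    # any other error is a real failure
    }
}
```

The point is less the helper itself than the fact that every call site doing raw I/O would need something like it, and few do.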

Let's go back to how signals are exposed to the Perl programmer: A single global hash contains the signal names as hash keys and the handlers as values. Perl also allows you to define a couple of special-purpose, non-POSIX signals: $SIG{__DIE__} and $SIG{__WARN__} that can catch Perl exceptions and warnings.
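As a small sketch of how $SIG{__WARN__} intercepts warnings, the following helper collects warnings instead of letting them reach STDERR. The name capture_warnings is hypothetical. Note the use of local, which restores whatever handler was installed before once the sub returns; a handler assigned without local stays in place for the rest of the program.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code and return every warning it emitted.
sub capture_warnings {
    my ($code) = @_;
    my @captured;
    # The handler receives the warning text as its first argument.
    # local limits the override to the dynamic scope of this sub.
    local $SIG{__WARN__} = sub { push @captured, $_[0] };
    $code->();
    return @captured;
}

my @warnings = capture_warnings(sub { warn "Foo\n" });
print "captured: $_" for @warnings;
```

While the handler is active, the warning never reaches the screen. That is exactly the effect that becomes a debugging headache when the handler lives somewhere else in a large code base.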

At Booking.com, we have a few million lines of application source code. Some of the lower layers of our infrastructure necessarily have to catch exceptions in order to inject them into the proper logging and monitoring tools. Other handlers might have been accidentally left in after debugging. Some of the hardest-to-debug problems arise when your exception or warning messages are eaten up by some $SIG{__DIE__} or $SIG{__WARN__} handler. Almost by definition, signal handlers are a gnarly form of action-at-a-distance. At this point, and in a code base this large, you're pretty much down to guessing where your warning message was swallowed. Thankfully, this happens rarely, but when it does, you need to bring out a sledgehammer.

Enter Devel::TrackSIG!

Devel::TrackSIG is a small module that, when loaded early (ideally via perl -MDevel::TrackSIG yourprogram.pl), will record all modifications to the global list of signal handlers. Thus, when you are debugging the following code:

warn "Foo"; # WHY DOESN'T THIS GO TO MY SCREEN?

you can find out where the signal handler that is swallowing your warning was defined:

print STDERR tied(%SIG)->get_source('__WARN__');
warn "Foo"; # WHY DOESN'T THIS GO TO MY SCREEN?

Or even find the locations of all signal handlers that were defined:

tied(%SIG)->dump_all_sources;

The module also allows you to print a stack trace whenever %SIG is modified. Simply load it as follows:

perl -MDevel::TrackSIG=report_write_access,1 yourapp.pl
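To give a feel for how this kind of tracking can work, here is a minimal sketch built on Perl's tie mechanism. It is not Devel::TrackSIG's actual implementation: the package name TrackedHash and the method source_of are invented for illustration, and the sketch wraps an ordinary hash rather than %SIG itself.

```perl
use strict;
use warnings;

package TrackedHash;   # hypothetical name, for illustration only

sub TIEHASH {
    my ($class) = @_;
    return bless { data => {}, source => {} }, $class;
}

# Every assignment to the tied hash ends up here, so this is where
# the file and line of the code doing the assignment get recorded.
sub STORE {
    my ($self, $key, $value) = @_;
    my (undef, $file, $line) = caller;
    $self->{source}{$key} = "$file line $line";
    $self->{data}{$key}   = $value;
}

sub FETCH    { $_[0]{data}{ $_[1] } }
sub EXISTS   { exists $_[0]{data}{ $_[1] } }
sub DELETE   { delete $_[0]{data}{ $_[1] } }
sub CLEAR    { %{ $_[0]{data} } = () }
sub FIRSTKEY { my $reset = keys %{ $_[0]{data} }; each %{ $_[0]{data} } }
sub NEXTKEY  { each %{ $_[0]{data} } }

# Hypothetical accessor: where was $key last assigned?
sub source_of { $_[0]{source}{ $_[1] } }

package main;

tie my %handlers, 'TrackedHash';
$handlers{__WARN__} = sub { };   # this file and line get recorded
print tied(%handlers)->source_of('__WARN__'), "\n";
```

The tied object sits between the program and the hash, which is also why loading such a module early matters: assignments made before the tie is in place cannot be recorded.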

In the same CPAN distribution as Devel::TrackSIG, there is also Devel::TrackGlobalScalar for applying the same tracking to one of Perl's special global variables. For example, if you wonder why $/ was suddenly changed to a different record separator with global effect, then this module (and git blame) can tell you who the culprit is.
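The same idea carries over to scalars. The sketch below ties an ordinary lexical variable and remembers where it was last written. As before, this is an illustration under assumptions, not Devel::TrackGlobalScalar's real code: the package name TrackedScalar and the method last_write are hypothetical.

```perl
use strict;
use warnings;

package TrackedScalar;   # hypothetical name, for illustration only

sub TIESCALAR {
    my ($class, $initial) = @_;
    return bless { value => $initial, source => undef }, $class;
}

sub FETCH { $_[0]{value} }

# Record the location of every assignment before storing the value.
sub STORE {
    my ($self, $value) = @_;
    my (undef, $file, $line) = caller;
    $self->{source} = "$file line $line";
    $self->{value}  = $value;
}

# Hypothetical accessor: where was the scalar last assigned?
sub last_write { $_[0]{source} }

package main;

tie my $separator, 'TrackedScalar', "\n";
$separator = "\r\n";   # this file and line get recorded
print tied($separator)->last_write, "\n";
```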

In summary, this kind of debugging module is rarely needed, but when signals are almost driving you to abolish programming altogether, it can be indispensable to avoid the worst.