
IO::Socket::Timeout: socket timeout made easy

Without network operations, running a website for booking accommodation online would be nearly impossible. Network operations can be anything from simple actions like talking to a browser with a user at the other end, to providing an API for our affiliates, or even writing the internal services that help maintain our systems.

Network operations are everywhere, and these are only a few examples of where we use them.

What is a network socket

Network communication typically happens through the use of sockets.

A network socket is one of the core software components of all modern operating systems. It is a software interface to the network services provided by the operating system. It provides a uniform way of opening a connection to these services and of sending and receiving data.

Network sockets are used everywhere in the IT world, and they allow us to communicate between different hosts or different programs on the same host. Despite having different kinds of network services (TCP and UDP being prominent examples), network sockets provide a common way to interact with them.

Here is an example of interacting with the Google.com website using IO::Socket::INET, the standard Perl socket module. (IO means Input/Output, and INET means Internet.)

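# example 1
use IO::Socket::INET;

my $socket = IO::Socket::INET->new('google.com:80')
    or die "cannot connect: $@";
# send a minimal HTTP request, then read the whole response
print {$socket} "GET / HTTP/1.0\r\n\r\n";
my $html = join '', <$socket>;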

Interestingly, IO::Socket::INET is mostly used through its object-oriented interface. The following example performs the same operations as the previous one (against a different host), but in an object-oriented way:

# example 2
my $socket = IO::Socket::INET->new(
    PeerHost => 'www.booking.com',
    PeerPort => 80,
) or die "cannot connect: $@";

$socket->print("GET / HTTP/1.0\r\n\r\n");
my $html = join '', $socket->getlines();

Why socket timeouts are important

At Booking.com, handling requests in a timely manner is critical to the user experience, to operations, and ultimately to our business. Our platform involves many different subsystems, and resources are constantly requested from them, so speed and low latency matter at every step.

It is essential that these systems reply quickly. It is also vital that we detect when one of them isn't replying fast enough, so that it can be flagged and a mitigating strategy can be found (such as using an alternative subsystem).

We use Redis in a lot of places: as a cache layer, as a queue system, and for storage. Redis offers very low latency, and we rely on it. However, with sockets, we don't always know immediately when a connection has been lost. Learning that a Redis server is unreachable 30 seconds after the fact is 30 seconds too late. Our goal is to know this in under a second. In other cases it might be acceptable (or even mandatory) to allow a longer timeout. It really depends on the subsystems involved and on how they are used.

Most of the time these subsystems are queried using a network socket. So being able to detect that a subsystem is not reachable implies that the sockets provide a way to specify timeouts.

This is why a fast and reliable platform depends on sockets that support timeouts. Using a socket involves three main steps: connecting to the external server, reading and writing data from and to it, and, at some point, closing the connection. At the very least, a socket timeout implementation should allow setting a timeout on the connection step and on both the reading and writing steps.

Connection timeout

IO::Socket provides a timeout method, and IO::Socket::INET provides a Timeout option. The Timeout option can be used to set a timeout on the connection to the server. For example, this is how we connect to a local HTTP server on port 80 with a connection timeout of 3 seconds:

my $socket = IO::Socket::INET->new(
    PeerHost => '127.0.0.1',
    PeerPort => 80,
    Timeout  => 3,
);

So far so good, but how do we deal with read or write timeouts? What if the server accepts the connection, but then at some point stops communicating? The client socket needs to realize this quickly. We need timeouts for this.

Read/Write timeouts via setsockopt

It is relatively easy to change the option of a socket to supply these timeouts. This is an example that works on GNU/Linux, given $timeout in (optionally fractional) seconds:

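use Socket qw(SOL_SOCKET SO_RCVTIMEO);

my $seconds  = int($timeout);
my $useconds = int( 1_000_000 * ( $timeout - $seconds ) );
# pack a struct timeval; this layout is the one that works on GNU/Linux
my $timeval  = pack( 'l!l!', $seconds, $useconds );
$socket->setsockopt( SOL_SOCKET, SO_RCVTIMEO, $timeval )
    or die "setsockopt failed: $!";
# then use $socket as usual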

The only problem is that it only works on some architectures and operating systems. A generic solution is better. Let's look at the options available on systems where setsockopt cannot be used this way.

Read/Write timeouts via select

Another, more portable (albeit slower) way to simulate a timeout on a socket is to check, with a timeout and without blocking indefinitely, whether the socket is readable or writable. select(2) can do this, and the Perl select() function provides access to it.

Here is a simplified version of a function that returns true if we can read the socket with the given timeout:

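sub _can_read {
    my ( $file_desc, $timeout ) = @_;
    # build a bit vector with only our file descriptor set
    vec( my $fdset = '', $file_desc, 1 ) = 1;
    # wait until the descriptor is readable or the timeout expires
    my $nfound = select( $fdset, undef, undef, $timeout );
    # select returns 0 on timeout and -1 on error
    return $nfound > 0;
}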
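To see the check in action without a real server, here is a small sketch that runs it against a connected pair of local sockets. It assumes a Unix-like system where socketpair is available, and the 0.1 second timeout is an arbitrary value chosen for illustration.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);
use IO::Socket;

# Returns true if $file_desc becomes readable within $timeout seconds.
sub _can_read {
    my ( $file_desc, $timeout ) = @_;
    vec( my $fdset = '', $file_desc, 1 ) = 1;
    my $nfound = select( $fdset, undef, undef, $timeout );
    return $nfound > 0 ? 1 : 0;
}

# A connected pair of local sockets stands in for a client and a server.
my ( $client, $server ) =
    IO::Socket->socketpair( AF_UNIX, SOCK_STREAM, PF_UNSPEC )
    or die "socketpair failed: $!";

# The "server" has sent nothing yet, so the check times out.
my $before = _can_read( fileno($client), 0.1 );

# Once the server writes, the client side becomes readable.
$server->autoflush(1);
print {$server} "PONG\n";
my $after = _can_read( fileno($client), 0.1 );

print "readable before write: $before, after write: $after\n";
```

The first call waits for the whole timeout and reports that nothing is readable; the second returns as soon as data is available.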

Using an external library

Yet another way is to use external modules or system calls, like poll (via IO::Poll), epoll, libevent, or libev. To simplify things, it's common to use higher-level event-based modules like AnyEvent and POE. They make it easy to specify a timeout on any IO (Input/Output) operation.

This is an example using AnyEvent, which sets a connection timeout of 0.5 seconds and a read or write timeout of 0.01 seconds:

my $handle = AnyEvent::Handle->new (
    connect    => [ $host, $port ],
    on_prepare => sub { 0.5 },
    # ...
);

$handle->on_timeout( sub { say 'timeout occurred' } );
$handle->timeout(0.01);

While completely valid, this applies only to programs that use these event-based modules. It is of no use to standard imperative programs. We need a way to provide timeout features through the standard socket API, without rewriting the program around an event loop.

Provide a nice API

Let's step back for a moment. We have two ways to set up a timeout on a socket:

  • A one-time setting on the socket with setsockopt.
  • A change to the way we interact with the socket with select.

We need to abstract these two ways of setting timeouts behind a simple and easy-to-use API. Let's consider this example:

my $socket = IO::Socket::INET->new( ... );
print {$socket} 'something';

(Please note that we don't use object-oriented notations on the socket.)

What we want is an easy way to set a timeout on the $socket. For example:

my $socket = IO::Socket::INET->new( ... );

# set timeouts
$socket->read_timeout(0.5);

# use the socket as before
print {$socket} 'something';

# later, get the timeout value
my $timeout = $socket->read_timeout();

When using setsockopt

If we can use setsockopt, setting the timeout using ->read_timeout(0.5) is easy. It can be implemented as a method that we add to the IO::Socket::INET class, possibly by using a Role.

This method would just fire setsockopt with the right parameters, and save the timeout value into $socket for later retrieval. Then we can carry on using $socket as before.

One subtlety is that, because the $socket is not a classic hash reference instance, but an anonymous typeglob on a hash reference, instead of doing $socket->{ReadTimeout} = 0.5 we need to do ${*$socket}{ReadTimeout} = 0.5 ... but that's just an implementation detail.
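As a rough sketch of what such a method could look like, the function below sets the socket option and remembers the value in the hash slot of the socket's typeglob. It is written as a plain function rather than a method to keep it short, it is not the module's actual code, and the 'l!l!' layout of the timeout structure is an assumption that holds on GNU/Linux.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC SOL_SOCKET SO_RCVTIMEO);
use IO::Socket;

# Hypothetical accessor: with a value it sets the read timeout; it always
# returns the value remembered on the socket.
sub read_timeout {
    my ( $socket, $timeout ) = @_;
    if ( defined $timeout ) {
        my $seconds  = int($timeout);
        my $useconds = int( 1_000_000 * ( $timeout - $seconds ) );

        # struct timeval as two native longs (assumption: GNU/Linux layout)
        my $timeval = pack( 'l!l!', $seconds, $useconds );
        $socket->setsockopt( SOL_SOCKET, SO_RCVTIMEO, $timeval )
            or die "setsockopt failed: $!";

        # the socket is a blessed typeglob, so attributes live in its hash slot
        ${*$socket}{ReadTimeout} = $timeout;
    }
    return ${*$socket}{ReadTimeout};
}

# A local socket pair is enough to try the accessor out.
my ( $left, $right ) =
    IO::Socket->socketpair( AF_UNIX, SOCK_STREAM, PF_UNSPEC )
    or die "socketpair failed: $!";

read_timeout( $left, 0.5 );
my $stored = read_timeout($left);
print "read timeout is $stored second(s)\n";
```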

When using select

If, however, the program is running in a situation where setsockopt can't be used, we have to resort to select. That poses a problem. Because we're not using object-oriented notation, the operation on the socket is not done via a method we could easily override, but directly with the built-in function print.

Overriding a core function is not good practice, for various reasons. Luckily, Perl provides a clean way to implement custom behavior in the IO layer.

PerlIO layers

Perl's Input/Output mechanism is based on a system of layers. It is documented in the perliol(1) man page.

What's the PerlIO API? It's a stack of layers that live between the system and Perl's generic file-handle API. Perl provides core layers (such as :unix, :perlio, :stdio, and :crlf). It also provides extension layers (such as :encoding and :via).

These layers can be stacked and removed in order to provide more features (when layers are added) or more performance (when layers are removed).

The huge benefit is that no matter which layers are set up on a file handle or socket, the API doesn't change and the read/write operations are the same. Calls to them go through the layers attached to the handle until they potentially reach the system calls.

Here is an example:

open my $fh, '<', 'filename' or die "cannot open: $!";
# for raw binary access, with no translation layers
binmode $fh, ':raw';
# specify that the file is in utf8, and enforce validation
binmode $fh, ':encoding(UTF-8)';
my $line = <$fh>;
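To make the stack of layers visible, the sketch below lists the layers on a handle before and after adding an encoding layer, using the built-in PerlIO::get_layers function. It works on a temporary file created with File::Temp; the exact layer names printed depend on the platform and the Perl build, so none are shown here.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ( $fh, $path ) = tempfile( UNLINK => 1 );

# Layers that this perl puts on a freshly opened file by default.
my @default_layers = PerlIO::get_layers($fh);

# Stack an encoding layer on top of them.
binmode $fh, ':encoding(UTF-8)' or die "binmode failed: $!";
my @with_encoding = PerlIO::get_layers($fh);

print "default:       @default_layers\n";
print "with encoding: @with_encoding\n";

# The same print call works no matter which layers are on the handle.
print {$fh} "caf\x{e9}\n";
close $fh or die "close failed: $!";

# The accented character went through the encoding layer on its way out,
# so it takes two bytes on disk.
my $size = -s $path;
print "bytes on disk: $size\n";
```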

The :via layer is a special layer that allows anyone to implement a PerlIO layer in pure Perl. Contrary to implementing a PerlIO layer in C, using the :via layer is rather easy: it is just a Perl class, with some specific methods. The name of the class is given when setting the layer:

binmode $fh, ':via(MyOwnLayer)';

Many :via layers already exist. They all start with PerlIO::via:: and are available on CPAN. For instance, PerlIO::via::json will automatically and transparently decode and encode the content of a file or a socket from and to JSON.
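To give a feel for how little code a :via layer needs, here is a minimal sketch of a made-up layer that upper-cases everything written through it. PerlIO::via::Upper is a hypothetical name used only for this illustration, and the layer implements just the PUSHED and WRITE methods.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical layer, for illustration only: upper-cases all written data.
package PerlIO::via::Upper;

# Called when the layer is pushed onto a handle; returns the layer object.
sub PUSHED {
    my ( $class, $mode, $fh ) = @_;
    return bless {}, $class;
}

# Called for each write; $fh is the handle of the layer below this one.
sub WRITE {
    my ( $self, $buf, $fh ) = @_;
    print {$fh} uc $buf or return -1;
    return length $buf;
}

package main;

my ( $tmp, $path ) = tempfile( UNLINK => 1 );
close $tmp;

# Write through the layer...
open my $out, '>:via(Upper)', $path or die "open failed: $!";
print {$out} "hello layers\n";
close $out or die "close failed: $!";

# ...and read the file back without it.
open my $in, '<', $path or die "open failed: $!";
my $line = <$in>;
close $in;
print $line;
```

The calling code only ever uses print and close; the transformation happens entirely inside the layer.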

Back to the problem. We could implement a :via layer that makes sure that read and write operations on the underlying handle are performed within the given timeout.

Implementing a timeout PerlIO layer

A :via layer is a class whose name should start with PerlIO::via:: and that implements a set of methods, like READ, WRITE, PUSHED, and POPPED (see the PerlIO::via manual for more details).

Let's take the READ method as an illustration. This is a very simplified version. The real version handles EINTR and other corner cases.

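package PerlIO::via::Timeout;
use Errno qw(ETIMEDOUT);

sub READ {
    # $_[1] is the caller's buffer; it must be filled in place
    my ( $self, undef, $len, $fh ) = @_;
    my $fd = fileno($fh);

    # we use the same _can_read as previously;
    # where $timeout comes from is left out of this simplified version
    if ( !_can_read( $fd, $timeout ) ) {
        $! = ETIMEDOUT;
        return 0;
    }

    return sysread( $fh, $_[1], $len, 0 );
}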

The idea is to use select to check whether the handle becomes readable within the given timeout. If not, return 0. If it does, call the normal sysread operation. It's simple and it works great.

We've just implemented a new PerlIO layer using the :via mechanism! A PerlIO layer works on any handle, including files and sockets. Let's try it on a file handle:

use Errno qw(ETIMEDOUT);
use PerlIO::via::Timeout;

open my $fh, '<:via(Timeout)', 'foo.html' or die "cannot open: $!";
my $line = <$fh>;
if ( !defined $line && 0+$! == ETIMEDOUT ) {
  # timed out reading
  ...
} else {
  # we read one line fast enough, success!
  ...
}

I'm sure you can see that there is an issue in the code above. At no point do we set the read timeout value. The :via pseudo-layer doesn't allow us to easily pass a parameter when the layer is created. Even though it is technically possible, we would not be able to change the parameter afterwards. If we want to be able to set, change, or remove the timeout on the handle at any time, we need to somehow attach this information to the handle, and we need to be able to change it.

Add properties to a handle using Inside-Out OO

A handle is not an object. We can't just add a new timeout attribute to a handle and then set or get it.

Luckily, the moment a handle is opened it receives an ID: its file descriptor. A file descriptor is not unique forever, because descriptors are recycled and reused. Yet, if we know when a handle is opened and closed, we can be sure that between these two events its file descriptor uniquely identifies it.

The :via PerlIO layer allows us to implement PUSHED, POPPED, and CLOSE. These functions are called when the layer is added to the handle, when it's removed, and when the handle is closed. We can use these functions to detect if and when to consider the file descriptor as a unique ID for the given handle.

We can create a hash table as a class attribute of our new layer. Here the keys are file descriptors and the values are a set of properties on the associated handle. This is essentially a basic implementation of Inside-Out OO, where the object is not its own data structure but only an ID. Using this hash table, we can associate a set of properties with a file descriptor and set the timeout value when the PerlIO layer is added. Like this:

my %fd_properties;

sub PUSHED {
    my ( $class, $mode, $fh ) = @_;
    $fd_properties{ fileno($fh) } = { read_timeout => 0.5 };
    # ...
}
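The same bookkeeping can be tried outside of a PerlIO layer. The sketch below uses three hypothetical helper functions (attach_properties, property, and detach_properties are names invented for this example) in place of what PUSHED, POPPED, and CLOSE would do inside a real layer.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Properties live outside the handle, keyed by its file descriptor.
my %fd_properties;

# What PUSHED would do: start tracking the handle.
sub attach_properties {
    my ( $fh, %props ) = @_;
    $fd_properties{ fileno($fh) } = {%props};
    return;
}

# Get a property, or set it when a value is given.
sub property {
    my ( $fh, $name, @value ) = @_;
    my $props = $fd_properties{ fileno($fh) } or return;
    $props->{$name} = $value[0] if @value;
    return $props->{$name};
}

# What POPPED and CLOSE would do: forget the descriptor before it is reused.
sub detach_properties {
    my ($fh) = @_;
    delete $fd_properties{ fileno($fh) };
    return;
}

my ($fh) = tempfile( UNLINK => 1 );
attach_properties( $fh, read_timeout => 0.5 );
property( $fh, read_timeout => 2 );    # change the timeout later on
my $current = property( $fh, 'read_timeout' );

detach_properties($fh);                # must run while fileno is still valid
my $remaining = keys %fd_properties;
close $fh;

print "timeout: $current, tracked descriptors left: $remaining\n";
```

Removing the entry before the handle is closed matters: once the descriptor is released, the operating system may hand the same number to a different handle.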

By also cleaning up this entry when the layer is removed or the handle is closed, we get a reliable way to associate timeout values with the file handle.

Wrapping up all the bits of code and features, the full package that implements this timeout layer, PerlIO::via::Timeout, is available on GitHub and CPAN.

Implement the API

We now have all the ingredients we need to implement the desired behavior. A class method, enable_timeouts_on, will receive the socket and modify its class (which should be IO::Socket::INET or inherit from it) to implement these methods:

  • read_timeout: get/set the read timeout
  • write_timeout: get/set the write timeout
  • disable_timeout: switch off timeouts (while remembering their values)
  • enable_timeout: switch back on the timeouts
  • timeout_enabled: returns whether the timeouts are enabled

In order to modify the IO::Socket::INET class in a clean way, let's create a role and apply it to the class. In fact, let's create two roles: one that implements the various methods using setsockopt and another role that uses select (with PerlIO::via::Timeout).

A Role (sometimes known as Trait) provides additional behavior to a class in the form of composition. Roles provide introspection, mutual exclusion capabilities, and horizontal composition instead of the more widely used inheritance model. A class simply consumes a Role, receiving any and all behavior the Role provides, whether these are attributes, methods, method modifiers, or even constraints on the consuming class.

Detailing the implementation of the role mechanism is out of scope here, but it's still interesting to note that, to keep IO::Socket::Timeout lightweight, we don't use Moose::Role or Moo::Role. Instead we apply a stripped-down variant of Role::Tiny, which uses single inheritance from a special class crafted at runtime specifically for the targeted class. The code is short and can be seen here.
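The general idea of a class crafted at runtime can be sketched in a few lines. This is not the module's actual code: apply_role and My::Role::ReadTimeout are hypothetical names, and the sketch only copies the role's subroutines into a new class that inherits from the object's original class before reblessing the object into it.

```perl
use strict;
use warnings;
use IO::Handle;

# Hypothetical role: a plain package whose subs will be copied.
package My::Role::ReadTimeout;

sub read_timeout {
    my ( $self, @value ) = @_;
    ${*$self}{ReadTimeout} = $value[0] if @value;
    return ${*$self}{ReadTimeout};
}

package main;

# Build (once) a class that inherits only from the object's class, copy
# the role's subs into it, and rebless the object into that new class.
sub apply_role {
    my ( $object, $role ) = @_;
    my $target   = ref $object;
    my $composed = "${target}__WITH__${role}";

    no strict 'refs';
    if ( !@{"${composed}::ISA"} ) {
        @{"${composed}::ISA"} = ($target);
        for my $name ( keys %{"${role}::"} ) {
            next unless defined &{"${role}::${name}"};
            *{"${composed}::${name}"} = \&{"${role}::${name}"};
        }
    }
    return bless $object, $composed;
}

# IO::Handle objects are blessed typeglobs, just like sockets.
my $handle = apply_role( IO::Handle->new, 'My::Role::ReadTimeout' );
$handle->read_timeout(0.5);
my $timeout = $handle->read_timeout;
my $class   = ref $handle;

print "class: $class, read timeout: $timeout\n";
```

Because the new class has the original class as its only parent, every existing method keeps working, and other objects of the original class are left untouched.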

Wrap it up

Use IO::Socket::Timeout to add read/write timeouts to any network socket created with IO::Socket::INET, on any platform:

# 1. Create a socket as usual
my $socket = IO::Socket::INET->new( ... );

# 2. Enable read and write timeouts on the socket
IO::Socket::Timeout->enable_timeouts_on($socket);

# 3. Set up the timeouts
$socket->read_timeout(0.5);
$socket->write_timeout(0.5);

# 4. Use the socket as usual
my $data = <$socket>;

# 5. Profit!

Conclusion

IO::Socket::Timeout provides a lightweight, generic, and portable way of applying timeouts on sockets, and it plays an important role in the stability of the interaction between our subsystems at Booking.com. This wouldn't be possible without Perl's extreme flexibility.

Please be aware that there is a performance penalty associated with implementing IO layers in pure Perl. If you are worried about this, we recommend benchmarking when making the decision on whether to use it.

IO::Socket::Timeout is available on GitHub and CPAN.
