B.

Native Extensions for Perl without Smoke and Mirrors

The standard mechanism by which one can hook native extensions into Perl, dubbed XS for "eXternal Subroutine", is often criticized for its arcane syntax and for obscuring how the interaction with the Perl VM works while still requiring some familiarity from the programmer user. This article aims to provide a bottom-up approach to understanding how native extensions interact with Perl while eschewing XS altogether, hopefully providing complementary learnings.

This is not to discourage anybody from learning XS to extend Perl. Quite the opposite, XS can be a fantastically useful tool that automatically generates most of the code we will analyze below. It also provides an important portability and compatibility layer: The code in this article is not guaranteed to work on a future version of Perl and likely won't work on all platforms that Perl supports. In short, please do use XS for your real-world problems.

Off we go. Note that all code snippets in this article can be found as a complete CPAN distribution on github. What we'll do first is come up with a C function that works the same as an XS-generated XSUB[1]. The function signature is:

void my_sum_xs_is_evil(pTHX_ CV *cv)

The strange pTHX_ token is a Perl API macro and stands for "parameter for threading context". It will either be compiled out altogether if your Perl does not support ithreads[2], or it will be expanded to PerlInterpreter *my_perl,. Note the underscore. Including it results in the trailing comma. The cv parameter is basically the Perl subroutine reference that represents our XSUB from the point of view of Perl.

Before we dive into the actual function implementation, it pays off to briefly consider how Perl passes arguments to functions. Basically, it uses about 666 stacks flying in close formation. But we can get away with only considering two of them: The argument stack (also called the Perl stack) and the mark stack. The argument stack holds pointers to the actual function arguments, first to last. Below our data, it may also hold pointers to the arguments of our caller function, so we need the mark stack to tell us where our function's arguments begin: The mark stack holds integer offsets into the argument stack. When our function is invoked, Perl will remember the current position in the argument stack by pushing it as an integer onto the mark stack. Then it pushes the function arguments onto the argument stack, first to last. This means to get our data, we have to take the top element off the mark stack. From this offset into the argument stack all the way to the top of the argument stack, we can then access our parameters:

void my_sum_xs_is_evil(pTHX_ CV *cv) {
  /* Get the top "mark" offset from the stack of marks. */
  I32 ax = *PL_markstack_ptr--;
  /* PL_stack_base is the pointer to the bottom of the
   * argument stack. */
  SV **mark = PL_stack_base + ax;

  /* Local copy of the global Perl argument stack pointer.
   * This is the top of the stack, not the base! */
  SV **sp = PL_stack_sp;

  /* And finally, the number of parameters for this function. */
  I32 items = (I32)(sp - mark);

  int i;
  double sum = 0.;

Followed that? We now have a pointer to the first argument (in the mark variable) as well as the number of parameters. The additional declarations will be used for our actual function below. Now on to actually using the function parameters and calculating their sum. (Remember? That was the point of the entire exercise.)

  /* Move stack pointer back by number of arguments.
   * Basically, this means argument access by increasing index
   * in "first to last" order instead of access in
   * "last to first" order by using negative offsets. */
  sp -= items;

  /* Go through arguments (as SVs) and add their *N*umeric *V*alue to
   * the output sum. */
  for (i = 0; i < items; ++i)
    sum += SvNV( *(sp + i+1) ); /* sp+i+1 is the i-th arg on the stack */

If we move the stack pointer back by the number of arguments, we can conveniently access the i-th argument by simple pointer arithmetic. Now, all we have to do is return one floating point number as a Perl SV. For this, it's helpful to know that Perl expects us to put return values in the same stack elements that we received the arguments in. If necessary, we can put additional return values in the stack slots after (or above) the ones that held the arguments.

  const IV num_return_values = 1;
  /* Make sure we have space on the stack (in case the function was
   * called without arguments) */
  if (PL_stack_max - sp < (ssize_t)num_return_values) {
    /* Oops, not enough space, extend. Needs to reset the
     * sp variable since it might have caused a proper realloc. */
    sp = Perl_stack_grow(aTHX_ sp, sp, (ssize_t)num_return_values);
  }

  /* Push return value on the Perl stack, convert number to Perl SV. */
  /* Also makes the value mortal, that is avoiding a memory leak. */
  *++sp = sv_2mortal( newSVnv(sum) );

  /* Commit the changes we've done to the stack by setting the global
   * top-of-stack pointer to our modified copy. */
  PL_stack_sp = sp;

  return;
}

First we make sure that the argument stack has enough space for our return value. Then we push a new, mortalized Perl SV (holding our sum) onto the argument stack. Umm, mortalized? Well, I lied earlier. The example requires three of Perl's stacks. Perl uses reference counting to manage its memory[3]. That doesn't come for free. As a very significant optimization (and an equally significant source of bugs) the Perl argument stack does not own a reference count of the values that are referenced from the stack. This poses a problem when returning a new value: The only reference to it will be on the argument stack, which refuses to own it! To work around this problem, Perl has the mortal stack (or tmps stack). The mortal stack owns a reference to each item that's pushed onto the stack. sv_2mortal above takes the newly created SV (which is created with a reference count of 1) and pushes a pointer to the SV onto the mortal stack. At the end of each executed statement, the mortal stack is cleaned up and the reference count of each item on the stack is decremented, possibly causing the item to be freed[4]. Thus the name: items on the stack are mortal, soon to pass away.

This is really all it takes to write an XSUB from scratch, without using the actual XS glue language. You lose the benefits of having automatic type-mapping for more complex types than just numbers, as well as many other useful bits, of course. The rest of the article will be concerned with actually hooking our XSUB into Perl. Again, no smoke and mirrors, without resorting to XS arcana where possible.

Our freshly devised C function needs to be compiled and linked into a shared library. For this purpose, we use the normal Perl module toolchain that otherwise builds XS modules. After writing the function to a .c file, we can get away with a minimal Makefile.PL configure-script such as this:

use 5.008005;
use ExtUtils::MakeMaker;
WriteMakefile(
  NAME   => 'XS::WithoutXS',
  OBJECT => '$(O_FILES)', # link all the object files
);

Note that I've chosen to call the CPAN distribution XS-WithoutXS and the accompanying module/namespace will be XS::WithoutXS. The Perl toolchain takes care of compiling the .c files in the distribution directory and linking them (normally together with the output of the XS compilation) into a shared library blib/arch/auto/XS-WithoutXS/WithoutXS.so. Great. All that's left to do is write some Perl code to actually invoke the function. Normally, the standard XSLoader module will do all of this for us, but let's do it by hand. First, the usual Perl module boilerplate:

package XS::WithoutXS;
use strict;
use warnings;
use DynaLoader ();

my $SharedLibrary;

The most noteworthy bit is that we're loading the DynaLoader module. It is a low-level module provided by the Perl core that provides a portable way of locating symbols in shared libraries. Now, we need to locate the shared library that holds our manually-crafted XSUB.

sub setup_so_access {
  my $pkg = shift;
  return if defined $SharedLibrary;

  my @pkg_components = split /::/, $pkg;

  my $pkg_path = join "/", @pkg_components;
  my @dirs = (map "-L$_/auto/$pkg_path", @INC);
  my (@mod_files) = DynaLoader::dl_findfile(@dirs, $pkg_components[-1]);
  die "Failed to locate shared library for '$pkg'"
    if not @mod_files;

  $SharedLibrary = DynaLoader::dl_load_file($mod_files[0]);
}

Much of this is really just to construct the path to the shared library that the toolchain generated for us. Then we add it to Perl's module search path and use the dl_load_file to let DynaLoader find it. We then link the first one that's found into our process. What's left is the logic to locate a symbol in the shared library and expose it to Perl as an XSUB.

sub newXS {
  my ($perl_func_name, $full_symbol_name, $filename) = @_;
  my $sym = DynaLoader::dl_find_symbol($SharedLibrary, $full_symbol_name);
  die "Failed to locate $full_symbol_name" if not defined $sym;
  DynaLoader::dl_install_xsub($perl_func_name, $sym, $filename);
}

All the magic is happening in DynaLoader again: We locate the symbol by name in the shared library and then call dl_install_xsub, which will create a new CV (remember: a CV is a code/sub reference), and install it as the Perl function $perl_func_name. The filename is used for error reporting only. Thus we can wrap things up by a call to the two functions each:

setup_so_access(__PACKAGE__); # assuming package==dist name, see also below
newXS("my_sum", "my_sum_xs_is_evil", "this_is_not_xs.c");

And thus we can call it from Perl:

$ perl Makefile.PL
$ make
$ perl -Mblib -le 'use XS::WithoutXS; print XS::WithoutXS::my_sum(5,3,2,1);'
11

Voila. Writing extensions in C isn't that hard, is it? But tedious, you say? Indeed it is. That is why we have XS, the glue language, and its many tools. To make your life easier. Now that you understand how the basic interaction between your Perl and C code works, you can safely reach for the more convenient tools without worrying too much about the amount of magic under the hood.


[1] Apologies for hand-waving here. With "works", I mean that it has the same C function signature and when it's invoked, it will play nice with Perl and do all the things that Perl expects of a C function invoked via the extension mechanism. In XS jargon, such a function is called an "XSUB". I'll use that convention in the rest of the text.

[2] Technically, it's compiled out if your perl was not compiled with `MULTIPLICITY` support (a superset of ithread support), that is, if it does not support having multiple Perl interpreters running in the same process. Also, for completeness, let me point out that `pTHX` and `pTHX_` are accompanied by `aTHX` and `aTHX_`, the symmetrical incantations you need on the callsite of the function that has a `pTHX` in its signature.

[3] A previous article on this blog discussed debugging memory access problems and also gives a brief introduction into how Perl does reference counting.

[4] Technically, I'm oversimplifying here. But to be entirely honest, I'd have to introduce the "save" and "scope" stacks as well. But that doesn't actually have much practical relevance for what we're trying to do. So please excuse the minor inaccuracy.

comments powered by Disqus