7.4 The byte compiler

The compiler package, written by R Core member Luke Tierney, has been part of R since version 2.13.0. Since R 2.14.0, all of the standard functions and packages in base R are pre-compiled into byte-code. This is illustrated by the base function mean:

mean
## function (x, ...) 
## UseMethod("mean")
## <bytecode: 0x70ef3d0>
## <environment: namespace:base>

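The third line of the output, beginning <bytecode:, shows that the function has been byte-compiled; the hexadecimal value is the memory address of the compiled code, not the bytecode itself. In other words, the compiler package has translated the R function into another language that can be interpreted by a very fast interpreter.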

The compiler package allows R functions to be compiled, resulting in a byte code version that may run faster. The compilation process eliminates a number of costly operations the interpreter has to perform, such as variable lookup. Amazingly, the compiler package is almost entirely pure R, with just a few C support routines.
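One way to see whether a given function carries byte code is to look for the <bytecode: line in its printed form. The sketch below does this with a small helper; has_bytecode and add_one are hypothetical names introduced for illustration and are not part of the compiler package.

```r
library("compiler")

# Hypothetical helper: TRUE if printing f shows a "<bytecode: ...>" line
has_bytecode = function(f) {
  any(grepl("^<bytecode", capture.output(print(f))))
}

# A freshly defined function that has not yet been called or compiled
add_one = function(x) x + 1

has_bytecode(mean)             # base functions are pre-compiled
has_bytecode(add_one)          # not compiled yet
has_bytecode(cmpfun(add_one))  # compiled explicitly with cmpfun
```

The helper only inspects printed output, so it is a quick check rather than a robust test of how a function will be evaluated.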

7.4.1 Example: the mean function

The compiler package comes with R, so we just need to load the package in the usual way:

library("compiler")

Next we create an inefficient function for calculating the mean. This function takes in a vector, calculates the length and then updates the total variable.

my_mean = function(x) {
  total = 0
  n = length(x)
  for(i in 1:n)
    total = total + x[i]/n
  total
}

This is clearly a bad function and we should just use the mean function, but it’s a useful comparison. Compiling the function is straightforward:

cmp_mean = cmpfun(my_mean)
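Before timing anything, it is worth confirming that compilation has not changed what the function returns. The following minimal sketch repeats the definition of my_mean so that it runs on its own; the test vector is arbitrary.

```r
library("compiler")

# Same deliberately inefficient mean as in the text
my_mean = function(x) {
  total = 0
  n = length(x)
  for(i in 1:n)
    total = total + x[i]/n
  total
}
cmp_mean = cmpfun(my_mean)

x = rnorm(100)
# The compiled and uncompiled versions should agree
all.equal(my_mean(x), cmp_mean(x))

# Printing the compiled function shows a <bytecode: ...> line
cmp_mean
```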

Then we use the benchmark function, from the rbenchmark package, to compare the three variants:

## Generate some data
x = rnorm(100)
benchmark(my_mean(x), cmp_mean(x), mean(x), 
          columns=c("test", "elapsed", "relative"),
          order="relative", replications=5000)

In this benchmark, the compiled function is around seven times faster than the uncompiled function. Of course, the native mean function is faster still, but compilation does make a significant difference (figure 7.4).

Figure 7.4: Comparison of mean functions.
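The comparison can also be sketched using only base R and the compiler package. Note that from R 3.4.0 onwards JIT compilation is enabled by default, so an ordinary function such as my_mean is compiled automatically after a few calls; the sketch therefore switches JIT off for the duration of the timing. The helper time_reps is a hypothetical name introduced here, and the timings will differ between machines and R versions.

```r
library("compiler")

my_mean = function(x) {
  total = 0
  n = length(x)
  for(i in 1:n)
    total = total + x[i]/n
  total
}

# Switch JIT off so that my_mean stays uncompiled; remember the old level
old_level = enableJIT(0)
cmp_mean = cmpfun(my_mean)

# Hypothetical helper: elapsed time in seconds for reps calls of f(x)
time_reps = function(f, x, reps = 5000) {
  system.time(for (r in seq_len(reps)) f(x))[["elapsed"]]
}

x = rnorm(100)
timings = c(my_mean  = time_reps(my_mean, x),
            cmp_mean = time_reps(cmp_mean, x),
            mean     = time_reps(mean, x))
timings

# Restore the previous JIT setting
enableJIT(old_level)
```

system.time is coarser than a dedicated benchmarking package, so small differences between runs should not be over-interpreted.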

7.4.2 Compiling code

There are a number of ways to compile code. The easiest is to compile individual functions using cmpfun, but this obviously doesn’t scale. If you create a package, you can have it compiled automatically on installation by adding

ByteCompile: true

to the DESCRIPTION file. Before R 3.5.0, most R packages installed using install.packages were not compiled; from R 3.5.0 onwards, packages are byte-compiled on installation by default. In older versions, we can enable (or force) packages to be compiled by starting R with the environment variable R_COMPILE_PKGS set to a positive integer value.

A final option is to use just-in-time (JIT) compilation. The enableJIT function disables JIT compilation if the argument is 0. Arguments 1, 2, or 3 implement different levels of optimisation. JIT can also be enabled by setting the environment variable R_ENABLE_JIT to one of these values.
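As a minimal sketch of how enableJIT might be used in a session: the function returns the JIT level that was in force before the call, which makes it easy to change the setting temporarily and then put it back. The variable names below are illustrative only.

```r
library("compiler")

# Switch JIT compilation off, keeping the previous level
previous_level = enableJIT(0)

# ... code run here is evaluated without JIT compilation ...

# Switch to the highest of the three levels
enableJIT(3)

# Restore the original setting; the return value is the level just replaced
replaced_level = enableJIT(previous_level)
replaced_level
```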