Speeding up Code

Vectorization
apply() and friends

Dealing with algorithmic issues is the most fundamental way to speed up your code, and will work in any language. However R is an esoteric language, and there are many more (often counterintuitive) methods you can use to reduce running time.

Vectorization

For the purposes of speed, the most important things to avoid (where possible) in R are loops: for() or while(). This is because in R calling a function has an overhead computational cost which, though small in absolute terms, becomes significant if repeated many times.

mean1 = function(x) {
  out = 0
  for (i in seq_along(x)) out = out + x[i]

  return(out/length(x))
}

mean2 = function(x) {
  out = sum(x)/length(x)
  
  return(out)
}

x = rnorm(1e4)
microbenchmark(mean1(x), mean2(x))

## Unit: microseconds
##      expr      min       lq  median      uq     max neval
##  mean1(x) 2637.371 2788.990 2885.62 3331.95 3923.97   100
##  mean2(x)    8.557    8.972   10.04   11.08   24.39   100

There is no clever algorithmic trick here: the mathematical computations being performed are the same; it is simply very inefficient to call the function + so many times in R. mean2() calls the built-in function sum() once, which dispatches the problem to much more efficient lower-level code.

Example

Suppose we want to multiply a diagonal matrix \(D\) with non-zero entries \(d_{ii}\) by a general matrix \(A = (a_{ij})\). The outcome will be a matrix \(C\) with entries \(c_{ij} = d_{ii} a_{ij}\); clearly this should be quicker to compute than the general case \(\sum_k d_{ik} a_{kj}\), so we could write a special function to perform this operation:

multdiag = function(d, A) {
  C = matrix(0, nrow(A), ncol(A))
  for (i in seq_len(nrow(A))) {
    C[i,] = d[i]*A[i,]
  }
  C
}

Now, how does it compare to naïve matrix multiplication?

d = rnorm(100)
D = diag(d)
A = matrix(rnorm(1e4), 100, 100)

microbenchmark(D %*% A, multdiag(d,A), d*A)

## Unit: microseconds
##            expr   min     lq median     uq    max neval
##         D %*% A 660.4 665.14 680.16 703.21 1487.5   100
##  multdiag(d, A) 391.0 396.84 412.14 450.36 1195.0   100
##           d * A  12.6  13.25  13.68  14.52  602.2   100

The function multdiag() is faster than the naïve approach, but because it has to call the multiplication and subset operation a lot of times in the loop, it isn’t as fast as we might hope (after all, it reduces the number of mathematical computations by a factor of around 100).

In the third option, each entry of \(A\) is multiplied by the corresponding entry in \(\boldsymbol d\); this vector is recycled, so we get the answer we want. We only have to call the function * once, and the result is obtained around 40 times more quickly than with the naïve approach.

The lesson here is that it is not sufficient to consider only algorithmic speed in order to get fast code in R.

apply() and friends

Functions such as apply() and lapply() are still loops, and although they may make your code tidier and easier to follow, they are unlikely to speed it up very much.

getMeans = function(x) {
  out = numeric(length(x))
  for (i in seq_along(x)) {
    out[i] = mean(x[[i]])
  }
  out
}

# get a list of 100 vectors of 10 normals
x = replicate(100, rnorm(10), simplify=FALSE)
x[[1]]

##  [1] -0.2932 -0.4165  1.2835 -0.4433  0.2263 -0.9597 -0.7344  0.2321  0.6717  1.3768

microbenchmark(getMeans(x), sapply(x, mean))

## Unit: microseconds
##             expr   min    lq median    uq    max neval
##      getMeans(x) 643.7 679.2  691.7 724.8 1573.5   100
##  sapply(x, mean) 491.9 513.4  527.8 552.1  872.8   100

Sometimes it’s better to find a clever way to vectorize a function instead.

## version assuming each entry in list is of same length
getMeans2 = function(x) {
  M = matrix(unlist(x), ncol=length(x))
  return(colMeans(M))
}

microbenchmark(getMeans(x), sapply(x, mean), getMeans2(x))

## Unit: microseconds
##             expr    min     lq median     uq    max neval
##      getMeans(x) 631.64 644.29 652.44 667.15 1492.5   100
##  sapply(x, mean) 485.53 493.46 502.20 515.28 1333.7   100
##     getMeans2(x)  21.23  23.85  24.81  25.68  130.9   100