Here we present a few points about functions that are easy to miss when learning R.

... (ellipsis)

To write a function that takes an arbitrary number of arguments, you need the ... argument, or ellipsis. Inside the function you can pass ... on to other functions; the most useful of these is list(...), which collects the arguments into a list you can work with.

##' get minimum of maxima of a set of vectors
##'
##' @param ... numeric vectors
minimax = function(...) {
  args = list(...)
  maxs = sapply(args, max)
  return(min(maxs))
}

minimax(c(1,2,3), c(-1,2,4))
## [1] 3
minimax(c(1,2,3), c(-1,2,4), c(-5,2))
## [1] 2
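
The ellipsis can also be forwarded unchanged to another function, and any argument names travel with it. Here is a minimal sketch of both ideas; the helpers describe_args() and shout() are hypothetical names introduced only for illustration.

```r
## Hypothetical helper (not from the text above): reports how many
## arguments arrived through ... and what they were named.
describe_args = function(...) {
  args = list(...)           # capture the arguments as a list
  arg_names = names(args)    # NULL if no argument was named
  if (is.null(arg_names)) arg_names = rep("", length(args))
  list(n = length(args), names = arg_names)
}

## Hypothetical helper: forwards ... unchanged to paste().
shout = function(...) {
  toupper(paste(...))
}

describe_args(1:3, b = "x")$names
## [1] ""  "b"
shout("a", "b", sep = "-")
## [1] "A-B"
```

Because shout() never inspects its arguments, it automatically supports every argument that paste() does, including sep.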

do.call()

The converse problem may also arise. How do I call a function which uses the ellipsis when the arguments, of which there may be arbitrarily many, are stored in a list? This is where do.call() comes in handy: it takes a function and a list of arguments, and calls the function with each element of the list as a separate argument.

##' entrywise ranges across a list of vectors
##'
##' @param x list of numeric vectors of equal length, across which to
##' calculate entrywise ranges
##'
##' @return numeric vector of same length as each entry of \code{x}
ranges = function(x) {
  mins = do.call(pmin, x)
  maxs = do.call(pmax, x)
  return(maxs-mins)
}
x = list(c(1,2,3), c(-1,2,4), c(9,-5,2))
ranges(x)
## [1] 10  7  2

Note that pmin() and pmax() expect each vector as a separate argument and do not unpack a list for you, so pmin(x) is not the same as do.call(pmin, x).
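
To make the equivalence concrete, the sketch below compares do.call() with writing the call out by hand, and shows that named list elements are passed as named arguments (here na.rm). The vector y is made up for illustration.

```r
x = list(c(1, 2, 3), c(-1, 2, 4), c(9, -5, 2))

## do.call() passes each list element as a separate argument ...
by_list = do.call(pmin, x)
## ... so it matches writing the call out by hand.
by_hand = pmin(x[[1]], x[[2]], x[[3]])
identical(by_list, by_hand)
## [1] TRUE

## Named list elements become named arguments.
y = list(c(1, NA, 3), c(-1, 2, 4))
do.call(pmin, c(y, list(na.rm = TRUE)))
## [1] -1  2  3
```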

For Speed

The do.call() function can be useful for speeding things up. How can we find the row-wise minima of a matrix? The obvious method is to use apply(), but this calls min() once per row in an R-level loop and can be slow. The pmin() function is fast, but does not treat matrices in the way we want: we need to give every column as a separate argument.

rowMins = function(x) {
  do.call(pmin, as.data.frame(x))
}

library(microbenchmark)  # provides microbenchmark()

x = matrix(rnorm(1000),100,10)
microbenchmark(apply(x, 1, min), rowMins(x))
## Unit: microseconds
##              expr    min     lq median     uq   max neval
##  apply(x, 1, min) 184.22 192.62 201.60 225.39 970.8   100
##        rowMins(x)  76.34  80.82  86.19  91.47 169.1   100

Exercise

Try varying the dimensions of the matrix x. Under what circumstances is rowMins() quicker than using apply(), and why?
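
One way to start on the exercise is to check that the two approaches agree, and then time them on matrices of different shapes. The sketch below uses base R's system.time() rather than microbenchmark so that it needs no extra packages. The helper name time_both() and the chosen dimensions are assumptions made for illustration, and timings will vary by machine.

```r
rowMins = function(x) {
  do.call(pmin, as.data.frame(x))
}

## Hypothetical helper: elapsed seconds for both approaches on an
## nrow-by-ncol matrix, each repeated `reps` times.
time_both = function(nrow, ncol, reps = 20) {
  x = matrix(rnorm(nrow * ncol), nrow, ncol)
  ## check that both approaches give the same answer before timing them
  stopifnot(isTRUE(all.equal(rowMins(x), apply(x, 1, min))))
  t_apply = system.time(for (i in seq_len(reps)) apply(x, 1, min))
  t_rowMins = system.time(for (i in seq_len(reps)) rowMins(x))
  c(apply = t_apply[["elapsed"]], rowMins = t_rowMins[["elapsed"]])
}

time_both(1000, 10)   # many rows, few columns
time_both(10, 1000)   # few rows, many columns
```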

Functional Programming

The ellipsis allows you to write functions which take another function as an argument, together with whatever arguments that function needs, and then evaluate it. This is how apply() and its friends work:

##' repeatedly apply a function under different random seeds
rand_apply = function(FUN, ..., seeds) {
  out = list()
  for (i in seq_along(seeds)) {
    set.seed(seeds[i])
    out[[i]] = FUN(...)  # works for any function and argument set
  }
  out
}

rand_apply(function(n) mean(rnorm(n)), n=10, seeds = 1:5)
## [[1]]
## [1] 0.1322
## 
## [[2]]
## [1] 0.2112
## 
## [[3]]
## [1] -0.06714
## 
## [[4]]
## [1] 0.5665
## 
## [[5]]
## [1] -0.07885

The call to rand_apply() also illustrates an anonymous function: a function defined inline, without being given a name.
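
An anonymous function behaves exactly like a named one. As a minimal sketch, the code below runs rand_apply() both ways with the same seeds and compares the results; the name mean_rnorm is introduced only for this illustration.

```r
rand_apply = function(FUN, ..., seeds) {
  out = vector("list", length(seeds))
  for (i in seq_along(seeds)) {
    set.seed(seeds[i])
    out[[i]] = FUN(...)
  }
  out
}

## Named version of the anonymous function (the name is hypothetical).
mean_rnorm = function(n) mean(rnorm(n))

named = rand_apply(mean_rnorm, n = 10, seeds = 1:5)
anon = rand_apply(function(n) mean(rnorm(n)), n = 10, seeds = 1:5)
identical(named, anon)
## [1] TRUE
```

Naming the function is worthwhile if it is reused or is long enough to deserve documentation; for a one-off like this, the anonymous form keeps the definition next to its only use.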