Some problems are simply too slow when run directly in R, and one has to resort to lower-level languages.
The inline package, in conjunction with Rcpp, provides support for embedding C++ code, which may make for somewhat easier coding. (The cppFunction() call used below is provided by Rcpp itself.)
library(inline)
library(Rcpp, warn.conflicts=FALSE)
cppFunction('NumericVector rowSumsC(NumericMatrix x) {
  int nrow = x.nrow(), ncol = x.ncol();
  NumericVector out(nrow);
  for (int i = 0; i < nrow; i++) {
    double total = 0;
    for (int j = 0; j < ncol; j++) {
      total += x(i, j);
    }
    out[i] = total;
  }
  return out;
}')
set.seed(1014)
x <- matrix(sample(100), 10)
rowSumsC(x)
## [1] 458 558 488 458 536 537 488 491 508 528
Notice that this time the function rowSumsC() is fully declared in the string (not just its body), and that cppFunction() creates an R function of the same name that we can call.
Such functions are likely to be faster than careful R equivalents, and can be competitive with stripped-down internal versions such as .rowSums().
library(microbenchmark)
microbenchmark(rowSumsC(x),
               rowSums(x),
               .rowSums(x, 10, 10),
               times=100)
## Unit: microseconds
##                 expr   min    lq median    uq    max neval
##          rowSumsC(x) 2.568 3.001  3.641 3.827 10.754   100
##           rowSums(x) 6.343 7.018  7.771 8.210 31.146   100
##  .rowSums(x, 10, 10) 1.191 1.530  1.639 1.845  4.474   100
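For a sense of what the C++ loop replaces, the same algorithm can be written as a plain R double loop. This is only a minimal sketch: the name rowSumsR is hypothetical (it does not appear in the original code), and it uses base R only, so it runs without a compiler.

```r
# Hypothetical pure-R counterpart of rowSumsC(): same double loop,
# but every iteration is interpreted by R rather than compiled.
rowSumsR <- function(x) {
  out <- numeric(nrow(x))
  for (i in seq_len(nrow(x))) {
    total <- 0
    for (j in seq_len(ncol(x))) {
      total <- total + x[i, j]
    }
    out[i] <- total
  }
  out
}

set.seed(1014)
x <- matrix(sample(100), 10)

# The loop version should agree with the built-in rowSums()
all.equal(rowSumsR(x), rowSums(x))
```

Adding rowSumsR(x) to the microbenchmark() call above is one way to see how an interpreted loop compares with the compiled version on your own machine.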
The inline package also allows you to embed C functions directly into your R code, and will compile them for you. The details are complicated.
Here’s a simple example taken from Hadley Wickham’s Advanced R wiki.
add <- cfunction(sig = c(a = "integer", b = "integer"),
  body = "
  SEXP result = PROTECT(allocVector(REALSXP, 1));
  REAL(result)[0] = asReal(a) + asReal(b);
  UNPROTECT(1);
  return result;
")
add(1, 5)
## [1] 6
The main arguments are body, a string which gives the body of the C function, and sig, which gives its argument signature.
Objects of class SEXP are R objects (S expressions). There are various convenience functions for editing them. The most useful types are:
REALSXP: numeric vector
INTSXP: integer vector
LGLSXP: logical vector
STRSXP: character vector
VECSXP: list
R’s automatic garbage collection may attempt to delete R objects unless we prevent it from doing so; this is the purpose of the PROTECT() commands in the C code. Calling PROTECT() adds the object onto a protected stack, and UNPROTECT(1) removes one (i.e. the most recent) protected item from the stack. You should unprotect all objects before returning.
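The balancing rule is easier to see with more than one allocation. The following is a minimal sketch, not taken from the original source: the function name sum_diff is hypothetical, and language = "C" is set explicitly here on the assumption that the body should be compiled as C, like the add() example. It allocates a list (VECSXP) and two numeric vectors (REALSXP), protects each one, and releases all three with a single UNPROTECT(3).

```r
library(inline)

# Hypothetical helper: returns list(a + b, a - b).
sum_diff <- cfunction(sig = c(a = "numeric", b = "numeric"),
  language = "C",
  body = "
  SEXP out = PROTECT(allocVector(VECSXP, 2));   /* 1st protected object */
  SEXP s = PROTECT(allocVector(REALSXP, 1));    /* 2nd protected object */
  SEXP d = PROTECT(allocVector(REALSXP, 1));    /* 3rd protected object */
  REAL(s)[0] = asReal(a) + asReal(b);
  REAL(d)[0] = asReal(a) - asReal(b);
  SET_VECTOR_ELT(out, 0, s);                    /* store sum in the list */
  SET_VECTOR_ELT(out, 1, d);                    /* store difference */
  UNPROTECT(3);                                 /* balance all three PROTECT() calls */
  return out;
")

sum_diff(1, 5)
```

The count passed to UNPROTECT() should match the number of PROTECT() calls made in the function; a mismatch is a common source of hard-to-trace errors in C code called from R.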