```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(tidy = TRUE)
library(ggplot2)
theme_set(theme_bw())
# x coordinates for plotting.
xs = seq(-0.2, 1.2, length=500)
```
## Importance sampling
Note: the function $h$ was to be defined as the \emph{square} of the function given in the exercise. As it stands, the question would give a very small result since it has approximately equal positive and negative parts. See below for a solution to the square.
First we define the function
\[h(x) = \left[ \cos(50x) + \sin(20x) \right]^2.\]
```{r defh}
h <- function(x){
z <- (x > 0) * (x < 1) * (cos(50 * x) + sin(20 * x))
z^2
}
qplot(x = xs, y = h(xs), geom = "line")
```
$h$ can be integrated analytically. Expanding
\[h(x) = \cos(50x)^2 + 2 \cos(50x) \sin(20x) + \sin(20x)^2,\] the antiderivative in $[0, 1]$ is
\[H(x) = \frac{x}{2} + \frac{\sin(100x)}{200} + 2 \left[ \frac{\cos(30x)}{60} - \frac{\cos(70x)}{140} \right] + \frac{x}{2} - \frac{\sin(40x)}{80} \]
and integrated over $[0, 1]$ gives
```{r integrate, dependson = "defh"}
H <- function(x) {
x/2 + sin(100 * x)/200 + 2 * (cos(30 * x)/60 - cos(70 * x)/140) + x/2 - sin(40 * x)/80
}
trueValue <- H(1) - H(0)
cat("The true value of I is", trueValue)
```
### Uniform Proposal
First, rejection sampling with uniform proposal
amounts here to crude Monte Carlo here, as the integral
is with respect to the uniform measure on $[0,1]$.
```{r rejection, dependson = "defh"}
numRepeats <- 1000
numPoints <- 1000
mcErrors <- rep(NA, numRepeats)
for(i in 1:numRepeats){
u <- runif(n=numPoints)
weights <- h(u)
mcErrors[i] <- abs(mean(weights) - trueValue)/trueValue
}
qplot(x = mcErrors, y = ..density.., geom = "histogram", binwidth = 0.01) + xlab("Relative Error")
```
### Gaussian Mixture Proposal
From our plot of $h$ we realize it is multimodal. We expect that we should be able to
reduce the variance of the Monte Carlo estimator by using importance sampling (IS)
with a Gaussian mixture proposal. We roughly place some mixture components
by eye here, but feel free to try to automate this procedure using an
optimizer, for example.
```{r mixtureestimation}
numComponents <- 13
omega <- c(1, 2, 2, 1, 4, 0.2, 0.2, 4, 1, 2, 2, 1, 2)
omega <- omega/sum(omega)
mu <- c(.01, .12, .195, .305, .38, 0.45, 0.495, .565, .635, .745, .825, .935, 1.0)
sigma <- c(.02, .02, .02, .02, .02, 0.01, 0.01, .02, .02, .02, .02, .02, .02)
# Use a builtin library for sampling from normal mixtures.
library(nor1mix)
mixture <- norMix(m = mu, sigma = sigma, w = omega)
```
Let's visually compare the target function with our importance sampling density.
```{r plotting}
normalized_hs <- abs(h(xs)) / sum(abs(h(xs))) * 500 / 1.4
df <- data.frame(xs = xs, hs = abs(h(xs)), ahs = normalized_hs, qs = dnorMix(xs, mixture))
ggplot(df) +
geom_line(aes(x = xs, y = ahs), colour = "black") +
geom_line(aes(x = xs, y = qs), colour = "red")
```
```{r importance_weights}
qplot(x = xs, y = h(xs)/dnorMix(xs, mixture), col="blue")
```
```{r gaussian_experiment, dependson = "rejection"}
isErrors <- rep(NA, numRepeats)
for(i in 1:numRepeats){
x <- rnorMix(numPoints, mixture)
weights <- h(x)/dnorMix(x, mixture)
isErrors[i] <- abs(mean(weights) - trueValue)/trueValue
}
qplot(x = isErrors, y = ..density.., geom = "histogram", binwidth = 0.01) + xlab("Relative Error")
```
As expected, IS leads to smaller relative errors
```{r boxplot, dependson = "gaussian_experiment"}
boxplot(mcErrors, isErrors, names = c("Rejection Sampling", "Importance Sampling"))
```