################################################################## # BEGINNER'S R for Part A SSP # Solutions ################################################################## #Getting started - variables, functions, graphs print('hello world') 1+1 x=3/7 x exp(-x^2) pi cos(pi) #exercises #1. Use R to calculate 1.1 to the power of 100 1.1^100 #2. calculate the sine of pi/2 (using R!) sin(pi/2) #3. pi is defined, but "e" the base of the natural logarithm, is not. Calculate it. e=exp(1) e #4. in R, log(x) is the logarithm function. Is it base 10 or base e? log(e) #equals 1 so this is base e log10(10) #equals 1 so this is base 10 #here is a simple R function to plot functions curve(exp(-x^2),from=-3,to=3,col=2) #add horizontal and vertical lines for axes abline(h=0) abline(v=0) #add a legend explaining which line is which legend('topright','exp(-x^2)',lty=1,col=2) #exercises #5. plot log10(x) from 0.1 to 10 curve(log10(x), from=0.1, to=10) #6. plot log(x) from 1/e to e curve(log(x), from=1/e, to=e) #7. plot cos(x^2) from 0 to sqrt(4*pi) plot(cos(x^2), from=0, to=sqrt(4*pi)) #################################################################### #vectors #we can collect things together in a vector #and take functions of all the numbers at once theta=c(0,pi/2,pi) theta sin(theta) #functions to make special vectors n=-4:4 n #to access elements of a vector use [] n[1] n[1:3] m=rep(1,9) m m[1:3]=0 m m[7:9]=m[7:9]-1 m #element by element multiplication, and addition n*2 n*n m+n #some functions on vectors length(n) sum(n) max(n) min(n) #regular sequences (useful for plotting) x=seq(from=-4,to=4,by=2) x x=seq(from=0,to=1,length.out=11) x #exercises #8. use the methods above to make a vector x=c(0,pi/2,pi,3*pi/2,...,8*pi) #dont just type in x=c(0,pi/2,pi,3*pi/2, etc etc )! Use seq(). x=pi*seq(from=0, to=8, by=0.5) x #9. calculate cos(a*pi) for a=0,0.5*pi,pi,...,8*pi cos(x) #10. create a vector x of 101 equally spaced values from 0 to 2 x=seq(from=0, to=8, length.out=101) #11. create a vector y=x^2 for x=0...1 and y=x^2+1 for x=1.02 to 2 (hint x[51] should equal 1 so x[52:101]...) y=x^2 y[52:101]=y[52:101]+1 #12. enter the command "plot(x,y)" (without the quotes) plot(x,y) #13. create a vector 1,1/4,1/9,1/16...,1/10000 and sum the values (hint start with 1:100, square, invert and sum) #(you should find the answer is about pi^2/6) sum(1/(1:100)^2) pi^2/6 #can try sum(1/(1:10000)^2) etc and check converges ################################################################ #matrices #When we make a matrix R fulls the columns from left to right with the vector of values M=matrix(1:9,3,3) M #to access an entry M[3,3] #or a group of entries M[1:2,1:2] #to access the first row M[1,] #or the third column M[,3] #matrices have dimensions d=dim(M) d #number of rows and columns d[1] d[2] #we can apply functions to matrices, just like vectors L=matrix(c(-1,0,1,1,-1,0,0,1,-1),3,3) L L+M M+1 2*L+M cos(L*pi/2) round(cos(L*pi/2),2) #rounded to 2 decimal places - easier to read #exercises #14. make this matrix # [,1] [,2] [,3] #[1,] -4 -1 2 #[2,] -3 0 3 #[3,] -2 1 4 V=matrix(-4:4,3,3) V #15. This matrix is a transition probability matrix for a markov chain # [,1] [,2] [,3] #[1,] 0.1 0.4 A #[2,] 0.2 0.5 B #[3,] 0.3 0.6 C #Work out A,B and C and create the matrix. Call it P #A=0.5, B=0.3, C=0.1 so rows sum to 1. P=matrix(c(0.1,0.2,0.3,0.4,0.5,0.6,0.5,0.3,0.1),3,3) #"sum" is a function that sums all the elements in a matrix sum(M) #We often want to just sum along the rows or down the columns of a matrix #apply(Matrix,row/col,function) applies a function to the rows/cols of Matrix M #direction 2 is the column direction, this sums down the columns apply(M,2,sum) #direction 1 is the row direction, this sums along the rows apply(M,1,sum) #exercise #16. check that M and M+L have the same row and colum sums #row sums apply(M,1,sum) apply(M+L,1,sum) #column sums apply(M,2,sum) apply(M,2,sum) #16b. check row sums of P are 1. apply(P,1,sum) #we can multiply matrices L=matrix(c(-1,0,1,1,-1,0,0,1,-1),3,3) L L%*%M #notice that L*M is not the same thing! L*M #same with taking powers L*L L^2 #are both just squaring all the elements whereas L%*%L #is the matrix multiplication square. #diag() is a useful function. We can use it to make an identity matrix, #and also to extract and change diagonal elements all at once diag(1,c(3,3)) M d=diag(M) d diag(M)=0 M diag(M)=d M #exercises #17. Does matrix multiplication commute? Evaluate L%*%M and M%*%L - are they equal? #nope L%*%M M%*%L #18. Work out P%*%P, P%*%P%*%P etc out to P^n with n=8 (that's 8 matrix multiplications) #where P=matrix(c(0.1,0.2,0.3,0.4,0.5,0.6,0.5,0.3,0.1),3,3) is the transition matrix from Question 15. #When n is large the resulting matrix should have 3 near-identical rows. Why is that? P=matrix(c(0.1,0.2,0.3,0.4,0.5,0.6,0.5,0.3,0.1),3,3) #here is P^8 P%*%P%*%P%*%P%*%P%*%P%*%P%*%P #Each row gives the probabilities to be in states 1,2 and 3 after many steps. #The chain reaches an equilibrium distribution, so the probability to be in #any particular state after many steps doesnt depend (much) on where the chain started. #Hence, the rows should be similar. If we compute P^n at very large n the rows #will be equal to the precision of the computer. ################################################################## #for-loops #handy when we want to carry out the same operation many times on a list of things. for (k in 1:10) { print('hello world') } tot=0 for (j in 1:10) { tot=tot+j } tot #That last one starts with tot=0 and accumulates 1+2+3+...+10 #same as sum(1:10) (!) #Exercise #19a Write a for-loop that calculates the product of the first 7 primes #x=c(2,3,5,7,11,13,17) x=c(2,3,5,7,11,13,17) tot=1 for (j in x) { tot=tot*j } tot #OR tot=1 for (j in 1:7) { tot=tot*x[j] } tot #OR - best - just prod(x) #we can use the loop to calculate P%*%P...P%*%P without alot of typing Q=P n=8 for (i in 2:n) { Q=Q%*%P } Q #exercise #19b. Let P be the transition matrix for the Sunny/Rainy chain from yesterday # [,1] [,2] #[1,] 0.9 0.1 #[2,] 0.5 0.5 #State one is sunny state 2 is rainy. #Calculate P^100 (matrix multiplication 100 times). What are the approximate #long-term chances of it being sunny? P=matrix(c(0.9,0.5,0.1,0.5),2,2) Q=P n=100 for (i in 2:n) { Q=Q%*%P } Q # long-term chances of sunshine are 0.83(3) or 5/6, the entries in the first column, #which give the probabilities to have sunshine on the 100th day - these are near-equal #as the weather on day 100 isnt much affected by the weather on day 1. ############################################################################ #functions #if we have a calculation we need to do often (for example, taking matrix powers) #we can make our own function to save typing it all out each time. #first a very simple example to give the idea f=function(x) { d=x^2 return(d) } f(3) f(-3) #this function will act on vectors y=-3:3 y f(y) #and matrices M=matrix(c(1,2,3,4,5,6,7,8,9),3,3) f(M) #but notice that this function just does element-by-element squaring. #our function behaves like an R function curve(f(x),from=-3,to=3,main='y=x^2') #exercises #20. Change the function f() so that it calculates cos(x). Give your function a new name. my.cos=function(x) { d=cos(x) return(d) } my.cos(pi) #21. suppose x is a vector, say (x_1,x_2,x_3). The euclidean norm of x is sqrt(x_1^2+x_2^2+x_3^2). #For example, if x=c(1,0,2) the norm is sqrt(5) and in R this would be sqrt(sum(x^2))). #Write an R function that takes as input a vector x and calculates and returns the norm. my.norm=function(x) { return(sqrt(sum(x^2))) } my.norm(c(1,0,2)) sqrt(5) #functions can have more than one argument #this function lets us input the power - and defaults to 2 if we #omit the 2nd argument g=function(x,p=2) { d=x^p return(d) } g(3,3) #same thing but easier to read g(x=3,p=3) #R uses the default if we omit the 2nd argument g(3) #exercises #22. change the function g() so that it calculates cos(p*x) defaulting to p=1. #Give your function a new name. my.cosp=function(x,p=1) { d=cos(p*x) return(d) } my.cosp(pi) my.cosp(pi,p=2) #23. suppose x is a vector, say (x_1,x_2,x_3). The p-norm of x is (|x_1|^p+|x_2|^p+|x_3|^p))^(1/p) #where |x| is the absolute value of x. For example, if x=c(1,0,2) the 3-norm is (1+0+8)^(1/3) ~ 2.08. #In R this would be (sum(abs(x)^p))^(1/p). Write an R function that takes as input a vector x #and a value for p and calculates and returns the p-norm of x. p should default to 2. my.pnorm=function(x,p=2) { pnorm=( sum( abs(x)^p ) )^(1/p) return(pnorm) } my.pnorm(c(1,0,2)) my.pnorm(c(1,0,2),p=3) #Now for something a bit more complicated: a function to calculate positive #integer matrix powers my.matrix.power=function(M,p) { #calculate M^p (matrix-power) Q=diag(1,dim(M)) for (i in 1:p) { Q=Q%*%M } return(Q) } my.matrix.power(P,1) my.matrix.power(P,2) #exercises #24. Use the function to calculate P^100 (matrix multiplication) #for P the transition matrix of the sunny rainy Markov chain. P=matrix(c(0.9,0.5,0.1,0.5),2,2) my.matrix.power(P,100) ################################################################ #comparisons and boolean variables #TRUE (corresponding to 1) and FALSE (corresponding to 0) are #boolean variables recognised by R. We can make comparisons #with TRUE/FALSE outcomes x=1; y=4; z=2^2; x==y y==z y=y)} x=z if (x=y")} #here is another example. %% is the remainder function #so 10%%2 is 0 and 10%%3 is 1 (since 10=3*3+1 remainder). #sort numbers into odd and even for (k in 1:10) { if (k%%2==0) print(paste(k,'is even')) } #25. Write a function is.even() that takes as input a number and #returns TRUE if the number is even and FALSE if it is odd. Here is a start #is.even=function(x) { # if ??? # return(???) #} is.even=function(x) { if (x%%2==0) {result=TRUE} else {result=FALSE} return(result) } #OR is.even=function(x) { return(x%%2==0) } #try your function out - here is mine #> is.even(10) #[1] TRUE #> is.even(9) #[1] FALSE #These operations work on vectors x=-4:4 x>0 #going back to exercise 10 x=seq(from=0,to=2,length.out=101) y=x^2+(x>1) plot(x,y,type='l') #this works because x>1 is FALSE (ie 0) for x<=1 and TRUE (ie 1) for x>=1 #26. f(x) is sin(x) for x<=pi/2 and sin^4(x) for x>pi/2 - plot x from 0 to pi - here is a start #x=seq(from=0,to=pi,length.out=101) #y=WHAT.FUNCTION.GOES.HERE?*(x<=pi/2)+sin(x)^4*(WHAT.TEST.GOES.HERE) #plot(x,y,type='l') x=seq(from=0,to=pi,length.out=101) y=sin(x)*(x<=pi/2)+sin(x)^4*(x>pi/2) plot(x,y,type='l') ## #Advanced topic - rounding errors #A warning! Computers have 'rounding errors'. They dont always represent #fractions and large numbers exactly. Here is an example... #TRUE or FALSE? 0.3==0.1+0.1+0.1 sprintf('%1.60f',0.3) sprintf('%1.60f',0.1+0.1+0.1) ## #Advanced topic - recursion #Functions can call themselves! #For example the factorial function - if f(n)=n! then f(1)=1 and f(n)=n*f(n-1). my.factorial=function(n) { if (n==1) return(1) return(n*my.factorial(n-1)) } my.factorial(3) ################################################################################# #random numbers #R has lots of functions to generate 'random' or 'pseudo-random' numbers #for example sample() samples numbers at random from a list x=-4:4 #sampling without replacement sample(x,6) #sampling with replacement sample(x,6,replace=TRUE) #we can also sample with non-equal probabilities. For example the long-run #probability for a sunny day is 5/6 and 1/6 for rainy. #here are 20 independent random days worth of weather sample(c(1,2),size=20,prob=c(5/6,1/6),replace=TRUE) #(you should see about three 2's and about 17 ones - but it is random after all) #27. Use sample to simulate a die throw sample(1:6,1) #28. Use sample to simulate the weather tomorrow, if today is sunny - the transition matrix was # [,1] [,2] #[1,] 0.9 0.1 #[2,] 0.5 0.5 #sample just a single day sample(1:2,size=1,prob=c(0.9,0.1)) #we use uniform random numbers alot in simulation work #one random number from 0-1 runif(1) #one random number from 9-11 runif(1,9,11) #eight random numbers from 0-1 runif(10) #normal (or gaussian) random numbers are often useful #here are eight normal random variables, each mean 0 and standard deviation 1 rnorm(8) #we can change the mean and variance - for example, mean 10 standard deviation 3 rnorm(8,10,3) #how can we check they are normal? Try generating 1000 of them x=rnorm(1000) x #we can see the numbers on a plot plot(1:1000,x) #plotting a histogram we can see the bell-curve shape hist(x) #the sample mean should be about 0 and the sample std. dev should be about 1 mean(x) sd(x) #what is the probability that x>1 or less than -1 - about 30% right? #we can estimate that using the proportion of samples that fall outside +/- 1 x>1 sum(x>1) tot=sum(x>1)+sum(x<(-1)) av=tot/1000 av #or more simply mean(abs(x)>1) #remark - need more simple exercises here #exercises #29. the ratio of two normal random numbers rnorm(1)/rnorm(1) #has a special distribution called a t(1)-distribution (with one #degree of freedom). Simulate 1000 t-distributed numbers #using rnorm(1000)/rnorm(1000). Make a histogram of your numbers. x=rnorm(1000)/rnorm(1000) hist(x,100) #hard to interpret as the largest are very large indeed #30. Estimate the probability that a t(1) random number is bigger #than one or less than -1. mean(abs(x)>1) #about 0.5 - more 'spread out' than normal dbn which has about 1/3 outside +/-1 #################################################################### #Random walks #Finally, we will plot some random walks in different colors #a random walk is a sequnce of numbers, at each step we add a random jump. #We will use normal random jumps, mean zero, std dev 1. At each step the #next X-value is equal the last X-value plus a normal random number. n=1000 X=rep(0,n) for (i in 2:n) { X[i]=X[i-1]+rnorm(1) } plot(1:n,X,type='l',ylim=c(-3*sqrt(n),3*sqrt(n))) for (k in 1:10) { for (i in 2:n) { X[i]=X[i-1]+rnorm(1) } lines(1:n,X,type='l',col=k) } #exercise #31. Replace X[i]=X[i-1]+rnorm(1) with X[i]=X[i-1]+rnorm(1)/rnorm(1) (the t-dbn) #and see what happens! You may need to replace #ylim=c(-3*sqrt(n),3*sqrt(n)) with ylim=c(-3*n,3*n) or just remove ylim completely n=1000 X=rep(0,n) for (i in 2:n) { X[i]=X[i-1]+rnorm(1)/rnorm(1) } plot(1:n,X,type='l',ylim=c(-3*n,3*n)) for (k in 1:10) { for (i in 2:n) { X[i]=X[i-1]+rnorm(1)/rnorm(1) } lines(1:n,X,type='l',col=k) }