### Simulation and Statistical Programming Hilary Term 2020

Page last updated: 2:20pm April 27, 2020

### COVID-19 UPDATE

The information given in this section supersedes that given in the rest of the website. The exam for this course is postponed indefinitely. Please follow the Departmental webpages for more details about the exam. Ordinarily for this course, in Trinity Term (TT), there would be one problem class in TT week 1, one revision class in each of TT weeks 2 and 3, and one consultation class in each of TT weeks 4 and 5. These will now be handled as follows.
• Final problem set: Due Monday noon (Oxford time) in TT week 1, by email to your class TA. They will return marks to you by email by Wednesday noon (Oxford time) in TT week 1. Solutions are here for the first two questions and here for the last three questions. Please do not share these with subsequent years; they are being released, unusually, because of the exceptional circumstances of the SARS-CoV-2 pandemic.
• Final problem class (intercollegiate class): Pre-recorded and available on Canvas (link, under the Recordings folder). If you have questions about the material, you are encouraged to email your class tutors.
• Revision classes: Pre-recorded and available on Canvas (link, under the Recordings folder), covering the collections material as well as the 2019 exam.
• Consultation classes: The form and date of these are to be determined. More details will follow nearer to the time of the exam.

### Synopsis

• Simulation: Transformation methods. Rejection sampling, including proof for a scalar random variable. Importance sampling (IS). Unbiased and consistent IS estimators. MCMC, including the Metropolis-Hastings algorithm.
• Statistical Programming: Numbers, strings, vectors, matrices, data frames and lists, and Boolean variables in R. Calling functions. Input and Output. Writing functions and flow control. Scope. Recursion. Runtime as a function of input size. Solving systems of linear equations, Cholesky decomposition. Numerical stability. Regression and least squares, QR factorisation. Implementation of Monte Carlo methods for elementary Bayesian inference.
• See the full synopsis on the departmental website here and the handbook supplement PDF here.
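
As an informal illustration of the rejection sampling topic listed above, here is a minimal R sketch. The Beta(2, 2) target, the Uniform(0, 1) proposal, the bound `M = 1.5` and the function names are illustrative assumptions and are not taken from the course materials.

```r
# Rejection sampling sketch: target Beta(2, 2), proposal Uniform(0, 1).
# All names and numerical choices here are hypothetical, for illustration only.
set.seed(1)

# Beta(2, 2) density on [0, 1]; its maximum is 1.5, attained at x = 0.5.
target_density <- function(x) 6 * x * (1 - x)

# Bound M with target_density(x) <= M * proposal_density(x); proposal density is 1.
M <- 1.5

rejection_sample <- function(n) {
  samples <- numeric(0)
  while (length(samples) < n) {
    x <- runif(n)                               # proposals from Uniform(0, 1)
    u <- runif(n)                               # uniforms for the accept/reject step
    accepted <- x[u <= target_density(x) / M]   # accept x with probability f(x) / (M g(x))
    samples <- c(samples, accepted)
  }
  samples[seq_len(n)]                           # return exactly n accepted draws
}

draws <- rejection_sample(10000)
mean(draws)   # should be close to the true mean, 0.5
var(draws)    # should be close to the true variance, 0.05
```

With these choices the expected acceptance rate is 1 / M, i.e. about two thirds of proposals, which is why the loop usually needs two passes to collect `n` draws.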

### Timetable

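| Week | Lecture (Tuesday 2-3pm, LG.02, the IT suite, 24-29 Saint-Giles') | Computer Lab (Friday 9-11am, LG.02, the IT suite, 24-29 Saint-Giles') | Problem class (Various times, 24-29 Saint-Giles') |
|---|---|---|---|
| 1 | X | | |
| 2 | X | | |
| 3 | X | X | X |
| 4 | X | X | |
| 5 | X | X | X |
| 6 | X | X | |
| 7 | X | X | X |
| 8 | X | X | |
| TT1 | | | X |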

### Problem class details and sheets

| Class | Tutor | TA | Time | Location weeks HT3, 5, 7 | Location week TT1 |
|---|---|---|---|---|---|
| 1 | Robbie Davies | Sheheryar Zaidi | Wednesday 9AM | LG.02 | LG.02 |
| 2 | Anthony Caterini | Anthony Caterini | Thursday 11AM-12PM | LG.02 | LG.03 |
| 3 | Anthony Caterini | Sheheryar Zaidi | Thursday 10-11AM | LG.02 | LG.03 |
| 4 | Bobby He | Bobby He | Wednesday 4-5PM | LG.04 | LG.04 |

Please hand in your solutions to the problem sheets by Monday noon of weeks 3, 5 and 7, and email your R code to the class TA as a single, well-commented R script.

### Simulation lectures

Lecture slides with the previous year's contents and format are available here, here, and here. I will modify these slides only minimally for this year, and will post the slides I use below a few days before each class.

### Statistical programming lectures, problem sheets and solutions

| Week | Slides | Practical | Solutions | Code | Extra |
|---|---|---|---|---|---|
| 3 | Slides 1 | Practical 1 | Solutions 1 | | |
| 4 | Slides 2 | Practical 2 | Solutions 2 | Code 2 | |
| 5 | Slides 3 | Practical 3 | Solutions 3 | Code 3 | |
| 6 | Slides 4 | Practical 4 | Solutions 4 | Code 4 | |
| 7 | Slides 5 | Practical 5 | Solutions 5 | Code 5 | |
| 8 | Slides 6 | Practical 6 | Solutions 6 | Code 6 | MHcode.R |

Handouts of the slides (fewer pages, same information) are available here, and a 4-per-page version is available here.

Datasets: Cystic Fibrosis (cystfibr.txt), Tetrahymena Data (hellung.txt), Japanese beetle larvae data (beetlelarva.txt), Speed Data (speed.txt), Air Pollution Data (airpol.txt), Image Data (image_noisy.txt, image_true.txt).

### Resources

We will be using the statistical software package R, which you can get here. Please install it on your own computer and practice using it as early as possible. You may find it helpful to bring a laptop to use during lectures, but it is not necessary.

The software RStudio is also useful.

You may find it useful to use R through a text editor. If you can stomach the learning curve of memorizing a few dozen key combinations, I highly recommend Emacs Speaks Statistics (ESS). If you use a Mac laptop, you can download an all-in-one installer here. The benefit of ESS is that the development environment is the same whether you work on a laptop or on a server. This facilitates development in the real world, where the data is almost always too sensitive to download locally, or your local workstation or laptop has insufficient storage or power to perform the analysis.

### Credit

This course material was almost entirely prepared by others who lectured this course earlier, including at least Julien Berestycki, Geoff Nicholls and Robin Evans.