Part C Synopses 2019–2020
Recommended Prerequisites: A8 Probability; SB3.1 Applied Probability would be helpful.
Course Term: Michaelmas
Number of Lectures: 16
Level: M-level
Weight: One Unit
Method of Assessment: Written examination
Aims & Objectives
The aim of the lectures is to introduce modern stochastic models in mathematical population genetics and give examples of real-world applications of these models. Stochastic and graph-theoretic properties of coalescent and genealogical trees are studied in the first eight lectures. Diffusion processes and extensions to model additional key biological phenomena are studied in the second eight lectures.
Synopsis
Evolutionary models in Mathematical Genetics: The Wright–Fisher model. The genealogical Markov chain describing the number of ancestors back in time of a collection of DNA sequences.
The Coalescent process describing the stochastic behaviour of the ancestral tree of a collection of DNA sequences. Mutations on ancestral lineages in a coalescent tree. Models with a variable population size.
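For illustration (not part of the syllabus; the sample size and number of replicates are arbitrary choices), a minimal sketch of the coalescent's ancestral process, using the standard fact that with k lineages the time to the next coalescence is exponential with rate k(k-1)/2:

```python
# Minimal sketch: height (time to most recent common ancestor) of a
# Kingman coalescent tree for a sample of n sequences, time measured
# in units of N generations.
import numpy as np

rng = np.random.default_rng(0)

def tmrca(n):
    t = 0.0
    for k in range(n, 1, -1):
        # with k lineages, the coalescence rate is k(k-1)/2, so mean 2/(k(k-1))
        t += rng.exponential(scale=2.0 / (k * (k - 1)))
    return t

heights = [tmrca(10) for _ in range(10_000)]
print(np.mean(heights))  # close to the theoretical mean 2(1 - 1/10) = 1.8
```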
The frequency spectrum and age of a mutation. Ewens' sampling formula for the probability distribution of the allele configuration of DNA sequences in a sample in the infinitely-many-alleles model. Hoppe's urn model for the infinitely-many-alleles model.
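A minimal sketch of Hoppe's urn (theta and the sample size are illustrative choices): the urn starts with a single black ball of weight theta, each draw either founds a new allele (black ball drawn) or copies an existing one, and after n draws the allele configuration follows Ewens' sampling formula:

```python
import numpy as np

rng = np.random.default_rng(1)

def hoppe_urn(n, theta):
    counts = []  # counts[i] = number of balls of colour (allele) i
    for m in range(n):
        # black ball has weight theta; each coloured ball has weight 1
        if rng.random() < theta / (theta + m):
            counts.append(1)                      # mutation: new allele
        else:
            i = rng.choice(len(counts), p=np.array(counts) / m)
            counts[i] += 1                        # copy an existing allele
    return counts

print(hoppe_urn(20, theta=1.0))  # allele configuration of a sample of 20
```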
The infinitely-many-sites model of mutations on DNA sequences. Gene trees as perfect phylogenies describing the mutation history of a sample of DNA sequences. Graph-theoretic constructions and characterizations of gene trees from DNA sequence variation. Gusfield's algorithm for constructing a tree from DNA sequences. Examples of gene trees from data.
Modelling biological forces in Population Genetics: Recombination. The effect of recombination on genealogies. Detecting recombination events under the infinitely-many-sites model. Hudson's algorithm. Haplotype bounds on recombination events. Modelling recombination in the Wright–Fisher model. The coalescent process with recombination: the ancestral recombination graph. Properties of the ancestral recombination graph.
Introduction to diffusion theory. Tracking mutations forward in time in the Wright–Fisher model. Modelling the frequency of a neutral mutation in the population via a diffusion process limit. The generator of a diffusion process with two allelic types. The probability of fixation of a mutation. Genic selection. Extension of results from the neutral case to the case with selection. Behaviour of selected mutations.
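As a worked example of the last two items (standard results, stated here under one common scaling convention, which may differ from the normalisation used in lectures):

```latex
% Generator of the Wright--Fisher diffusion for the frequency x of an
% allele under genic selection of (rescaled) strength \sigma:
\mathcal{L}f(x) = \tfrac{1}{2}\,x(1-x)\,f''(x) + \sigma\,x(1-x)\,f'(x),
\qquad 0 \le x \le 1 .
% Solving \mathcal{L}u = 0 with u(0) = 0 and u(1) = 1 gives the
% probability that a mutation at initial frequency p eventually fixes:
u(p) = \frac{1 - e^{-2\sigma p}}{1 - e^{-2\sigma}} ,
% which recovers the neutral answer u(p) = p in the limit \sigma \to 0.
```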
Reading
- R. Durrett, Probability Models for DNA Sequence Evolution, Springer, 2008
- A. Etheridge, Some Mathematical Models from Population Genetics, École d'Été de Probabilités de Saint-Flour XXXIX–2009, Lecture Notes in Mathematics, Springer, 2012
- W. J. Ewens, Mathematical Population Genetics, 2nd edition, Springer, 2004
- J. R. Norris, Markov Chains, Cambridge University Press, 1999
- M. Slatkin and M. Veuille, Modern Developments in Theoretical Population Genetics, Oxford Biology, 2002
- S. Tavaré and O. Zeitouni, Lectures on Probability Theory and Statistics, École d'Été de Probabilités de Saint-Flour XXXI–2001, Lecture Notes in Mathematics 1837, Springer, 2004
Recommended Prerequisites: A8 Probability and A9 Statistics
Course Term: Michaelmas
Number of Lectures: 14. For this course, 2 lectures and 2 intercollegiate classes are replaced by 2 practical classes. (The total time for this course is the same as for other Part C courses.)
Level: M-level
Weight: One Unit
Method of Assessment: Written examination
Aims and Objectives
Many datasets come in the form of networks, for example friendship data and protein–protein interaction data. As such data usually cannot be modelled using simple independence assumptions, their statistical analysis provides many challenges. The course will give an introduction to the main problems and the main statistical techniques used in this field. The techniques are applicable to a wide range of complex problems. The statistical analysis benefits from insights which stem from probabilistic modelling, and the course will combine both aspects.
Synopsis
Exploratory analysis of networks. The need for network summaries. Degree distribution, clustering coefficient, shortest path length. Motifs.
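For illustration, a minimal sketch of these summaries (assuming the networkx package; the graph model and parameters are arbitrary):

```python
import networkx as nx

G = nx.erdos_renyi_graph(n=200, p=0.05, seed=0)   # Bernoulli random graph

degrees = [d for _, d in G.degree()]              # degree distribution
print("mean degree:", sum(degrees) / len(degrees))
print("clustering coefficient:", nx.average_clustering(G))

# shortest path lengths are only defined within a connected component
H = G.subgraph(max(nx.connected_components(G), key=len))
print("mean shortest path:", nx.average_shortest_path_length(H))
```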
Probabilistic models: Bernoulli random graphs, geometric random graphs, preferential attachment models, small-world networks, inhomogeneous random graphs, exponential random graphs.
Small subgraphs: Stein’s method for normal and Poisson approximation. Branching process approximations, threshold behaviour, shortest path between two vertices.
Statistical analysis of networks: Sampling from networks. Parameter estimation for models. Inferring edges in networks. Network comparison. A brief look at community detection.
Reading
- R. Durrett, Random Graph Dynamics, Cambridge University Press, 2007
- E. D. Kolaczyk and G. Csárdi, Statistical Analysis of Network Data with R, Springer, 2014
- M. Newman, Networks: An Introduction, Oxford University Press, 2010
Recommended Prerequisites: The course requires a good level of mathematical maturity. Students are expected to be familiar with core concepts in statistics (regression models, the bias-variance tradeoff, Bayesian inference), probability (multivariate distributions, conditioning) and linear algebra (matrix-vector operations, eigenvalues and eigenvectors). Previous exposure to machine learning (empirical risk minimisation, dimensionality reduction, overfitting, regularisation) is highly recommended.
Students would also benefit from being familiar with the material covered in the following courses offered in the Statistics department: SB2.1 (formerly SB2a) Foundations of Statistical Inference and in SB2.2 (formerly SB2b) Statistical Machine Learning.
Course Term: Hilary
Number of Lectures: 16
Level: M-level
Weight: One Unit
Methods of Assessment: Written examination.
Aims and Objectives
Machine learning is widely used across the sciences, engineering and society to construct methods for identifying interesting patterns in, and making accurate predictions from, large datasets.
This course introduces several widely used machine learning techniques and describes their underpinning statistical principles and properties. The course studies both unsupervised and supervised learning, and covers several advanced, state-of-the-art topics in detail. It also covers computational considerations of machine learning algorithms and how they can scale to large datasets.
Synopsis
Empirical risk minimisation. Loss functions. Generalization. Over- and underfitting. Bias and variance. Regularisation.
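For illustration, a minimal sketch of regularised empirical risk minimisation (ridge regression on synthetic data; all sizes and penalty values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 10
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.5 * rng.normal(size=n)

def ridge(X, y, lam):
    # minimiser of ||Xw - y||^2 + lam*||w||^2: w = (X^T X + lam I)^{-1} X^T y
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

for lam in [0.0, 1.0, 100.0]:   # more regularisation: less variance, more bias
    w = ridge(X, y, lam)
    print(lam, np.mean((X @ w - y) ** 2))   # training risk grows with lam
```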
Support vector machines. Kernel methods and reproducing kernel Hilbert spaces. Representer theorem. Representation of probabilities in RKHS.
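A minimal sketch of a kernel method (kernel ridge regression with a Gaussian kernel; the data and hyperparameters are illustrative). By the representer theorem the RKHS minimiser is f(x) = sum_i alpha_i k(x_i, x):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=40)
y = np.sin(x) + 0.1 * rng.normal(size=40)

# Gaussian (RBF) kernel matrix between two sets of 1-d points
k = lambda a, b: np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2)

alpha = np.linalg.solve(k(x, x) + 0.1 * np.eye(len(x)), y)  # ridge in the RKHS

x_test = np.linspace(-3, 3, 5)
print(k(x_test, x) @ alpha)   # predictions, close to sin(x_test)
print(np.sin(x_test))
```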
Deep learning: Neural networks. Computation graphs. Automatic differentiation. Stochastic gradient descent.
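A minimal sketch of stochastic gradient descent, the workhorse of neural-network training, shown here on a linear least-squares model for brevity (synthetic data; the step size is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

w, lr = np.zeros(d), 0.01
for epoch in range(20):
    for i in rng.permutation(n):                # one random pass over the data
        grad = 2 * (X[i] @ w - y[i]) * X[i]     # gradient of (x_i . w - y_i)^2
        w -= lr * grad
print(np.max(np.abs(w - w_true)))               # small: close to the truth
```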
Probabilistic and Bayesian machine learning: Information-theoretic fundamentals. EM algorithm. Variational inference. Mixture models. Probabilistic PCA. Latent Dirichlet allocation.
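A minimal sketch of the EM algorithm for a two-component Gaussian mixture in one dimension (synthetic data; variances are fixed to 1 to keep the M-step short):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 1, 700)])

pi, mu = 0.5, np.array([-1.0, 1.0])        # initial weight and means
for _ in range(100):
    # E-step: responsibility of component 0 for each observation
    p0 = pi * norm.pdf(x, mu[0], 1)
    p1 = (1 - pi) * norm.pdf(x, mu[1], 1)
    r = p0 / (p0 + p1)
    # M-step: update weight and means using the responsibilities
    pi = r.mean()
    mu = np.array([(r * x).sum() / r.sum(),
                   ((1 - r) * x).sum() / (1 - r).sum()])
print(pi, mu)   # roughly 0.3 and means near (-2, 3)
```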
Collaborative filtering. Probabilistic matrix factorisation.
Deep generative models. Variational autoencoders.
Gaussian processes for regression and classification. Bayesian optimisation.
Software
Knowledge of Python is not required for this course, but some examples may be done in Python. Students interested in learning Python are referred to the following free University IT online courses, which should ideally be taken before the beginning of this course: https://help.it.ox.ac.uk/courses/index
Reading
- C. Bishop, Pattern Recognition and Machine Learning, Springer, 2007
- K. Murphy, Machine Learning: A Probabilistic Perspective, MIT Press, 2012
Further Reading
- T. Hastie, R. Tibshirani and J. Friedman, The Elements of Statistical Learning, Springer, 2009
- F. Pedregosa et al., Scikit-learn: Machine Learning in Python, JMLR 12, pp. 2825–2830, 2011, http://scikit-learn.org/stable/tutorial/
Recommended Prerequisites: The course requires a good level of mathematical maturity as well as some statistical intuition and background knowledge to motivate the course. Students are expected to be familiar with core concepts from probability (conditional probability, conditional densities, properties of conditional expectations, basic inequalities such as Markov's, Chebyshev's and Cauchy–Schwarz's, and modes of convergence); with basic limit theorems, in particular the strong law of large numbers and the central limit theorem; and with Markov chains (aperiodicity, irreducibility, stationary distributions, reversibility and convergence). Most of these concepts are covered in courses offered in the Statistics department, in particular Prelims Probability, A8 Probability and SB3.1 (formerly SB3a) Applied Probability.
Familiarity with basic Monte Carlo methods will be helpful, as for example covered in A12 Simulation and Statistical Programming.
Some familiarity with concepts from Bayesian inference such as posterior distributions will be useful in order to understand the motivation behind the material of the course.
Course Term: Hilary
Number of Lectures: 16
Level: M-level
Weight: One Unit
Methods of Assessment: Written examination
Aims and Objectives
The aim of the lectures is to introduce modern simulation methods.
This course concentrates on Markov chain Monte Carlo (MCMC) methods and Sequential Monte Carlo (SMC) methods. Examples of applications of these methods to complex inference problems will be given.
Synopsis
Classical methods: inversion, rejection, composition.
Importance sampling.
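A minimal sketch of importance sampling for a rare-event probability (the target, test function and proposal are illustrative choices):

```python
# Estimate P(X > 4) for X ~ N(0,1) using a proposal shifted into the tail.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
m = 100_000
y = rng.normal(loc=4.0, size=m)            # draws from the proposal N(4,1)
w = norm.pdf(y) / norm.pdf(y, loc=4.0)     # importance weights p(y)/q(y)
print(np.mean((y > 4) * w))                # importance sampling estimate
print(1 - norm.cdf(4))                     # exact value, about 3.2e-5
```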
MCMC methods: elements of the theory of discrete-time, general state-space Markov chains; the Metropolis–Hastings algorithm.
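A minimal sketch of a random-walk Metropolis–Hastings sampler (the target and proposal scale are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
log_target = lambda x: -0.5 * x**2         # unnormalised log N(0,1) density

x, chain = 0.0, []
for _ in range(50_000):
    prop = x + rng.normal(scale=1.0)       # symmetric random-walk proposal
    # accept with probability min(1, target(prop)/target(x))
    if np.log(rng.random()) < log_target(prop) - log_target(x):
        x = prop
    chain.append(x)
print(np.mean(chain), np.var(chain))       # roughly 0 and 1
```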
Advanced MCMC methods: Gibbs sampling, slice sampling, tempering/annealing, Hamiltonian (or Hybrid) Monte Carlo, pseudo-marginal MCMC.
Sequential importance sampling.
SMC methods: nonlinear filtering.
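A minimal sketch of a bootstrap particle filter for filtering in a state-space model (the simple linear-Gaussian model and all settings are illustrative):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
T, N = 100, 1000

# simulate a state path and noisy observations from the model
# X_t = 0.9 X_{t-1} + V_t,  Y_t = X_t + W_t,  V_t, W_t ~ N(0,1)
x_true = np.zeros(T)
for t in range(1, T):
    x_true[t] = 0.9 * x_true[t - 1] + rng.normal()
y = x_true + rng.normal(size=T)

particles = rng.normal(size=N)             # initial particle cloud
means = []
for t in range(T):
    particles = 0.9 * particles + rng.normal(size=N)   # propagate
    w = norm.pdf(y[t], loc=particles)                  # weight by likelihood
    w /= w.sum()
    idx = rng.choice(N, size=N, p=w)                   # multinomial resampling
    particles = particles[idx]
    means.append(particles.mean())         # approximates E[X_t | y_{1:t}]
print(np.mean((np.asarray(means) - x_true) ** 2))      # filtering error
```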
Reading
- C. P. Robert and G. Casella, Monte Carlo Statistical Methods, 2nd edition, Springer-Verlag, 2004
Further reading
- J. S. Liu, Monte Carlo Strategies in Scientific Computing, Springer-Verlag, 2001
Recommended Prerequisites: SB2.1 (formerly SB2a) Foundations of Statistical Inference is desirable; its six lectures on Bayesian inference, decision theory and hypothesis testing with loss functions are assumed knowledge. A12 Simulation and Statistical Programming is also desirable.
Course Term: Hilary
Number of Lectures: 16
Level: M-level
Weight: One Unit
Method of Assessment: Written examination
Synopsis
Theory: Decision-theoretic foundations, Savage axioms. Prior elicitation, exchangeability. Bayesian Non-Parametric (BNP) methods, the Dirichlet process and the Chinese Restaurant Process. Asymptotics, information criteria and the Bernstein–von Mises approximation.
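A minimal sketch of the Chinese Restaurant Process (the concentration parameter and sample size are illustrative): customer n+1 sits at an occupied table of size k with probability k/(n+alpha) and at a new table with probability alpha/(n+alpha); the resulting random partition underlies Dirichlet-process models:

```python
import numpy as np

rng = np.random.default_rng(0)

def crp(n, alpha):
    tables = []                          # tables[j] = customers at table j
    for m in range(n):                   # the (m+1)-th customer arrives
        probs = np.array(tables + [alpha]) / (m + alpha)
        j = rng.choice(len(probs), p=probs)
        if j == len(tables):
            tables.append(1)             # start a new table
        else:
            tables[j] += 1               # join table j
    return tables

print(crp(50, alpha=2.0))                # table sizes: a random partition of 50
```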
Computational methods: Bayesian inference via MCMC; Estimation of marginal likelihood; Approximate Bayesian Computation and intractable likelihoods; reversible jump MCMC.
Case Studies: a small number of substantial examples that extend the understanding of prior elicitation, BNP methods and asymptotics, and that further illustrate the building of statistical models, model choice, model averaging and model assessment, and the use of Monte Carlo methods for inference.
Reading
- C. P. Robert, The Bayesian Choice: From Decision-Theoretic Foundations to Computational Implementation, 2nd edition, Springer, 2001
Further Reading
- A. Gelman et al., Bayesian Data Analysis, 3rd edition, CRC Press, Boca Raton, Florida, 2014
- P. Hoff, A First Course in Bayesian Statistical Methods, Springer, 2010
- M. H. DeGroot, Optimal Statistical Decisions, Wiley Classics Library, 2004
Recommended Prerequisites: The lectures attempt to be self-contained, but knowledge of algorithms, combinatorics and probability theory – as covered, for example, in A8 Probability and SB3.1 (formerly SB3a) Applied Probability – would be a help. The course requires a good level of mathematical maturity.
Course Term: Hilary
Number of Lectures: 16
Level: M-level
Weight: One Unit
Method of Assessment: Mini-project
Aims & Objectives
Modern molecular biology generates large amounts of data, such as sequences, structures and expression data, which need different forms of statistical analysis and modelling to be properly interpreted. This course focuses on four topics within this vast area: Molecular Dynamics, Molecule Combinatorics and Chemo-Informatics, Computational Neuroscience, and Machine Learning in the Biosciences.
Synopsis
Molecular Dynamics – MD is another huge application area that is bound to continue to grow for decades, and stochastic methods are central to exploring its huge state space. The lectures cover Hamiltonian dynamics, the canonical distribution and stochastic differential equations, the Langevin model for Brownian motion, and comparison of MD trajectories.
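For illustration, a minimal sketch of the Langevin model via an Euler–Maruyama discretisation (the quadratic potential and all parameters are arbitrary choices):

```python
# Overdamped Langevin dynamics dX = -U'(X) dt + sqrt(2/beta) dB with
# U(x) = x^2/2; the stationary law is the canonical (Gibbs) measure
# proportional to exp(-beta * U(x)).
import numpy as np

rng = np.random.default_rng(0)
beta, dt, steps = 1.0, 1e-3, 200_000
grad_U = lambda x: x                 # gradient of U(x) = x^2/2

x, traj = 0.0, []
for _ in range(steps):
    x += -grad_U(x) * dt + np.sqrt(2 * dt / beta) * rng.normal()
    traj.append(x)
print(np.var(traj))                  # approaches the stationary variance 1/beta
```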
Molecule Combinatorics and Chemo-Informatics – This area is central in drug design and has many statistical problems embedded in it. Exhaustive enumeration is at present limited to molecules with 18 CNPOS atoms, and including one more atom expands the numbers by a factor of about 10–100 at this point. But there are many other possible avenues, such as sampling, or exploring a subspace generated by an initial set of molecules and a set of reactions. There are many advanced issues in counting molecules, such as Pólya counting and imposing constraints that make molecular graphs embeddable in 3D.
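As a toy illustration of Pólya-style counting (necklaces rather than molecules; the colours and lengths are arbitrary), Burnside's lemma averages the number of colourings fixed by each symmetry:

```python
from math import gcd

def necklaces(n, k=2):
    # number of k-coloured necklaces of length n up to rotation:
    # (1/n) * sum over rotations r of k^gcd(r, n)   (Burnside's lemma)
    return sum(k ** gcd(r, n) for r in range(n)) // n

print([necklaces(n) for n in range(1, 9)])   # 2, 3, 4, 6, 8, 14, 20, 36
```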
Computational Neuroscience – Computational Neuroscience is growing massively, but has a history going back to the 1940s, with publications like McCulloch and Pitts' (1943) paper on neural networks. The present progress is driven by progress on three fronts: (i) experimental data on brains, nerve systems and individual neurons; (ii) increased success in designing artificial neural networks, with an increasing variety of architectures, with applications in Deep Learning/AI; and (iii) the ability to simulate very complex models of biological neural networks.
Machine Learning in the Biosciences – Machine learning has proved very powerful in the biosciences, and Deep Learning methods are presently getting a lot of attention, the latest success being the highly accurate prediction of protein structure. Other areas of success are Chemo-Informatics and Genomics.
The exact plan for lectures is:
W1 Physics of Molecules: QM and MM
W1 Integrators and Approximations
W2 Protein Folding
W2 History of Molecular Dynamics and its applications
W3 Mathematical Models of Origins of Life
W3 Polya Enumeration
W4 Chemical Space
W4 Chemical Reactions
W5 Biological & Artificial Neurons
W5 Modelling Small Networks
W6 Large Scale Modelling
W6 Statistics of the Brain
W7 Basics of Machine and Deep Learning
W7 Applications in Structural Biology
W8 Applications in ChemoInformatics
W8 Other Applications in the Biosciences
Reading
The teaching material from 2019 would be useful to browse, although this year's course will have some changes to the syllabus and other improvements:
Topics in Computational Biology Course 18
Further Reading
- M. Steel, Phylogeny: Discrete and Random Processes in Evolution, chapters 1–2, SIAM Press, 2003.
- T. Schlick, Molecular Modeling and Simulation, chapters 13–14, Springer, 2010.
- M. Meringer, "Structure Enumeration and Sampling", chapter 8 in Handbook of Chemoinformatics Algorithms (eds. J.-L. Faulon and A. Bender), Chapman and Hall, 2010.
- B. C. O'Meara, Evolutionary Inferences from Phylogenies: A Review of Methods, Annu. Rev. Ecol. Evol. Syst. 43:267–285, 2012.
Recommended Prerequisites: Discrete and continuous time Markov processes on a countable state space, as covered for example in A8 Probability and SB3.1 (formerly SB3a) Applied Probability.
Course Term: Michaelmas
Number of Lectures: 16
Level: M-level
Weight: One Unit
Method of Assessment: Written examination
Aims and Objectives
The aim is to introduce fundamental probabilistic and combinatorial tools, as well as key models, in the theory of discrete disordered systems. We will examine the large-scale behaviour of systems containing many interacting components, subject to some random noise. Models of this type have a wealth of applications in statistical physics, biology and beyond, and we will see several key examples in the course. Many of the tools we will discuss are also of independent theoretical interest and have far-reaching applications. For example, we will study the amount of time it takes such a process to reach its stationary distribution (the mixing time). This concept is also important in many statistical applications, such as studying the run time of MCMC methods.
Synopsis
- Percolation and phase transitions (a simulation sketch follows this list).
- Uniform spanning trees, loop-erased random walks, and Wilson's algorithm.
- Important models: Stochastic Ising model (Glauber dynamics), Random-cluster model, Contact process, Exclusion process, Hard-core model.
- Important tools: Monotonicity, coupling, duality, FKG inequality.
- Gibbs measures and a discussion of phase transitions in this context.
- Mixing times and the spectral gap.
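As noted in the percolation item above, a minimal sketch of bond percolation on a square grid (assuming the networkx package; the grid size and replicate counts are arbitrary), estimating the probability of a left-right crossing by open bonds:

```python
import networkx as nx
import numpy as np

rng = np.random.default_rng(0)

def crosses(n, p):
    """Does an open path join the left and right sides of an n x n grid?"""
    G = nx.grid_2d_graph(n, n)                    # nodes are (row, col) pairs
    H = nx.Graph()
    H.add_nodes_from(G.nodes)
    H.add_edges_from(e for e in G.edges if rng.random() < p)  # open bonds
    for comp in nx.connected_components(H):
        cols = {c for _, c in comp}
        if 0 in cols and n - 1 in cols:           # cluster spans the grid
            return True
    return False

# the crossing probability jumps near the critical point p_c = 1/2 on Z^2
for p in [0.4, 0.5, 0.6]:
    print(p, np.mean([crosses(40, p) for _ in range(20)]))
```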
Reading
- G. Grimmett, Probability on Graphs: Random Processes on Graphs and Lattices, Cambridge University Press, 2010; 2nd edition, 2017.
- B. Bollobás and O. Riordan, Percolation, Cambridge University Press, 2006.
- T. Liggett, Continuous Time Markov Processes: An Introduction, American Mathematical Society, 2010.
- D. A. Levin, Y. Peres and E. L. Wilmer, Markov Chains and Mixing Times, American Mathematical Society, 2009.
- H. Duminil-Copin, Introduction to Bernoulli Percolation. A set of notes from 2017, easily available online.
Recommended Prerequisites: The course requires a good level of mathematical maturity. Students are expected to be familiar with core concepts in probability (basic properties of probabilities, such as union bounds, and of conditional expectations, such as the tower property; basic inequalities, such as Markov's and Jensen's), statistics (confidence intervals, hypothesis testing), and linear algebra (matrix-vector operations, eigenvalues and eigenvectors; basic inequalities, such as Cauchy–Schwarz's and Hölder's). Previous exposure to machine learning (empirical risk minimisation, overfitting, regularisation) is recommended.
Students would benefit from being familiar with the material covered in SB2.1 (formerly SB2a) Foundations of Statistical Inference (in particular, Decision Theory) and in SB2.2 (formerly SB2b) Statistical Machine Learning.
Course Term: Michaelmas
Number of Lectures: 16
Level: M-level
Weight: One Unit
Method of Assessment: Written examination
Aims and Objectives
This course is meant to provide a rigorous theoretical account of the main ideas underlying machine learning, and to offer a principled framework to understand the algorithmic paradigms being used, involving tools from probability, statistics, and optimisation.
Synopsis
- Statistical learning frameworks for prediction, estimation and online learning.
- Probability
  - Maximal inequalities.
  - Rademacher and Gaussian complexities.
  - Elements of VC theory.
  - Covering and packing numbers.
  - Chaining.
  - Concentration inequalities.
- Statistics
  - Bayes decision rules.
  - Empirical risk minimisation. Error decomposition: generalisation, optimisation, and approximation.
  - Learning via uniform convergence, margin bounds, and algorithmic stability.
  - Regularisation: explicit (constraints and penalisation) and implicit (algorithmic).
  - Convex loss surrogates.
  - Slow and fast rates.
  - Minimax lower bounds and hypothesis testing.
- Optimisation
  - Elements of convex theory.
  - Oracle model. Gradient descent. Mirror descent. (A gradient-descent sketch follows this list.)
  - Stochastic oracle model. Stochastic gradient descent. Stochastic mirror descent. Variance reduction techniques.
  - Approximate Message Passing.
  - Online optimisation.
- Examples
  - Linear predictors, including Boosting.
  - Non-linear predictors, including Support Vector Machines and Neural Networks.
  - High-dimensional estimators for sparse and low-rank problems, including the Lasso.
  - Online learning, including multi-armed bandit problems, reinforcement learning, and associated algorithms.
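As noted in the Optimisation item above, a minimal sketch of (projected) gradient descent on a smooth convex quadratic over the Euclidean unit ball (the problem data are arbitrary; the step size is chosen below 1/L for the curvature L of this objective):

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])   # positive definite: convex objective
b = np.array([1.0, -1.0])
f = lambda x: 0.5 * x @ A @ x - b @ x
grad = lambda x: A @ x - b

x, eta = np.zeros(2), 0.2                # eta < 1/L, L = largest eigenvalue of A
for _ in range(200):
    x = x - eta * grad(x)                # first-order oracle step
    if np.linalg.norm(x) > 1.0:
        x /= np.linalg.norm(x)           # project back onto the unit ball
print(x, f(x))                           # near the constrained minimiser
```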
Reading
- Shai Shalev-Shwartz and Shai Ben-David, Understanding Machine Learning: From Theory to Algorithms, Cambridge University Press, 2014
- Sébastien Bubeck, Convex Optimization: Algorithms and Complexity, Foundations and Trends in Machine Learning, 2015
- Ramon van Handel, Probability in High Dimension, lecture notes available online (http://www.princeton.edu/~rvan/APC550.pdf), 2016
- Tor Lattimore and Csaba Szepesvári, Bandit Algorithms, book available online (http://downloads.tor-lattimore.com/book.pdf), 2018
- Martin J. Wainwright, High-Dimensional Statistics: A Non-Asymptotic Viewpoint, Cambridge University Press, 2019