Preliminaries

These notes will be updated as the course goes on. If you find any mistakes or omissions, I’d be very grateful to be informed.

Administration

The course webpage and the Canvas page contain problem sheets, slides and links to any other materials (including videos of the lectures).

Problem Sheets and Classes

There will be four problem sheets, and four associated classes.

Part C and OMMS students should sign up for classes via the online system.

Resources

Books are useful, though not required. Here are the main ones this course is based on.

  1. S.L. Lauritzen, Graphical Models, Oxford University Press, 1996.

    The ‘bible’ of graphical models; much of the first half of this course is based on it. One complication is that the book distinguishes between two different types of vertex, which can make some ideas look more complicated than they actually are.

  2. M.J. Wainwright and M.I. Jordan, Graphical Models, Exponential Families, and Variational Inference, Foundations and Trends in Machine Learning, 2008.

    Relevant for the later part of the course, and for understanding many of the computational advantages of graphical models. Available for free here.

  3. J. Pearl, Causality, third edition, Cambridge University Press, 2013.

    Book dealing with the causal interpretation of directed models, which we will see in Chapter 8.

  4. D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques, MIT Press, 2009.

    A complementary book, written from a machine learning perspective.

  5. A. Agresti, Categorical Data Analysis, third edition, John Wiley & Sons, 2013.

    As the name suggests, covers most of the material we will use for discussing contingency tables and log-linear models, as well as some data examples. Available for free here.

Aims and Objectives

This course will give an overview of the use of graphical models as a tool for statistical inference. Graphical models relate the structure of a graph to the structure of a multivariate probability distribution, usually via conditional independence constraints. This has two broad uses: first, conditional independence can provide vast savings in computational effort, both in terms of the representation of large multivariate models and in performing inference with them; this makes graphical models very popular for dealing with big data problems. Second, conditional independence can be used as a tool to discover hidden structure in data, such as that relating to the direction of causality or to unobserved processes. As such, graphical models are widely used in genetics, medicine, epidemiology, statistical physics, economics, the social sciences and elsewhere.
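
As a small illustration of the computational point (a sketch only, anticipating definitions from later in the course): suppose $X_1, \dots, X_d$ are binary variables whose joint distribution factorizes according to a directed acyclic graph, so that
\[
p(x_1, \dots, x_d) = \prod_{v=1}^{d} p(x_v \mid x_{\mathrm{pa}(v)}),
\]
where $\mathrm{pa}(v)$ denotes the set of parents of vertex $v$ in the graph. An unrestricted joint distribution on $d$ binary variables requires $2^d - 1$ parameters, but if every vertex has at most $k$ parents then the factorization needs at most $d \cdot 2^k$, which grows only linearly in $d$ for fixed $k$.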

Students will develop an understanding of the use of conditional independence and graphical structures for dealing with multivariate statistical models. They will appreciate how this is applied to causal modelling, and to computation in large-scale statistical problems.