Statistical Machine Learning

Instructor: Prof. Pier Francesco Palamara, University of Oxford, Hilary term 2018

Course offered to Part B students (SB2b) and MSc students (SM4)

New: Revisions (Trinity Term)

You can access past exam questions via OXAM, which is on WebLearn [solutions here]. Note that the syllabus of this course is different from previous editions (e.g. it used to be a Part C course). Past exam questions for revision include:
  • Part B 2017 Q1, Q2 (excluding c), Q3
  • Part C 2016 Q1 (a-b), Q2 (a-b)
  • Part C 2015 Q1, Q2 (a-c), Q3 (b)
  • MSc 2017 Q7, Q8 (excluding b-c)
  • MSc 2016 Q6 (a-b), Q7
  • MSc 2014 Q7
  • MSc 2012 Q6, Q7

Revision Class 1 Week 2, Wed 10:30am-11:30am, LG.01 (with Kaspar Märtens).
Tentative plan: 2015 PartC Q2.a-c; 2016 PartC Q1.a; 2017 MSc
Consultation Session 1
(Part B students only)
Week 2, Wed 10:30am-11:30am, LG.01 (with Pier Palamara).
To make the consultation sessions more efficient, please send your questions by email beforehand.
Pier's email: <lastname>
Revision Class 2 Week 4, Thu 09:00am-10:00am, LG.01 (with Pier Palamara).
Tentative plan: a subset of { 2016 PartC Q1.b or 2014 MSc Q7.c; 2015 PartC Q1.c; 2016 MSc Q7.b-c; 2017 MSc Q8.d }.
Consultation Session 2
(Part B students only)
Week 5, Wed 09:00am-10:30am, LG.04 (with Kaspar Märtens).
To make the consultation sessions more efficient, please send your questions by email beforehand.
Kaspar's email: <firstname>

General Information

Course Team: Pier Palamara (course instructor and tutor), Kaspar Märtens (tutor), Juba Nait Saada (TA), Brian Zhang (TA), and Fergus Boyles (TA)
Lectures: HT Weeks 1-8, Statistics room LG.01, Wed 12:00-13:00, Fri 10:00-11:00.
Tutorials: HT Weeks 3, 5, 7, TT Week 1 (details below)
Practicals: MSc only, HT Week 4, Friday 11am-1pm (unassessed) and HT Week 8, Friday 11am-1pm (assessed, groups of 4)


Group | Time | Location | Class Tutor / Teaching Assistant
Part B group 1 | Mon 9:00am-10:30am, HT weeks 3, 5, 7, TT Week 1 | LG.04 | Pier Palamara / Fergus Boyles
Part B group 2 | Mon 11:00am-12:30pm, HT weeks 3, 5, 7, TT Week 1 | LG.04 | Pier Palamara / Juba Nait Saada
Part B group 3 | Mon 1:00pm-3:30pm, HT weeks 3, 5, 7, TT Week 1 | LG.04 | Pier Palamara / Juba Nait Saada
Part B group 4 | Tue 10:15am-11:45am, HT weeks 3, 5, 7, TT Week 1 | LG.05 | Kaspar Märtens / Fergus Boyles
Part B group 5 | Tue 1:15pm-2:45pm, HT weeks 3, 5, 7, TT Week 1 | LG.04 | Kaspar Märtens / Brian Zhang
MSc (single group) | Mon 4:00pm-5:00pm, HT weeks 3, 5, 7, TT Week 1 | LG.01 | Pier Palamara


Recommended prerequisites:
Part A A9 Statistics and A8 Probability. SB2a Foundations of Statistical Inference is useful but not essential.

Aims and Objectives:
Machine learning studies methods that can automatically detect patterns in data, and then use these patterns to predict future data or other outcomes of interest. It is widely used across many scientific and engineering disciplines. This course covers statistical fundamentals of machine learning, with a focus on supervised learning and empirical risk minimisation. Both generative and discriminative learning frameworks are discussed, and a variety of widely used classification algorithms are surveyed.

Visualisation and dimensionality reduction: principal components analysis, biplots and singular value decomposition. Multidimensional scaling. K-means clustering. Introduction to supervised learning. Evaluating learning methods with training/test sets. Bias/variance trade-off, generalisation and overfitting. Cross-validation. Regularisation. Performance measures, ROC curves. K-nearest neighbours as an example classifier. Linear models for classification. Discriminant analysis. Logistic regression. Generative vs Discriminative learning. Naive Bayes models. Decision trees, bagging, random forests, boosting. Neural networks and deep learning.
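As a rough illustration of two ideas named above (empirical risk minimisation and evaluation on training/test sets), here is a minimal sketch in plain Python. It is not taken from the course materials: the synthetic data, the function names `empirical_risk` and `fit_least_squares`, and the 150/50 split are all assumptions made for the example.

```python
import random


def empirical_risk(w, b, data):
    """Average squared loss of the predictor x -> w*x + b over (x, y) pairs."""
    return sum((y - (w * x + b)) ** 2 for x, y in data) / len(data)


def fit_least_squares(data):
    """Return the (w, b) minimising the empirical squared-loss risk over lines.

    Uses the closed-form solution for simple linear regression.
    """
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    w = sxy / sxx
    b = mean_y - w * mean_x
    return w, b


# Hypothetical synthetic data: y = 2x + 1 plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(-3.0, 3.0) for _ in range(200)]
points = [(x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.5)) for x in xs]

# Hold out the last 50 points as a test set (an arbitrary split for the sketch).
train, test = points[:150], points[150:]

w, b = fit_least_squares(train)
train_risk = empirical_risk(w, b, train)
test_risk = empirical_risk(w, b, test)

print(f"fitted w={w:.3f}, b={b:.3f}")
print(f"training risk={train_risk:.3f}, test risk={test_risk:.3f}")
```

The training risk is what the fit minimises; the test risk estimates how well the fitted line generalises to data it has not seen, which is the quantity of interest when discussing overfitting.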

C. Bishop, Pattern Recognition and Machine Learning, Springer, 2007.
T. Hastie, R. Tibshirani, J. Friedman, Elements of Statistical Learning, Springer, 2009.
K. Murphy, Machine Learning: a Probabilistic Perspective, MIT Press, 2012.

Further Reading
B. D. Ripley, Pattern Recognition and Neural Networks, Cambridge University Press, 1996.
G. James, D. Witten, T. Hastie, R. Tibshirani, An Introduction to Statistical Learning, Springer, 2013.


1 Introduction, unsupervised learning: exploratory data analysis Slides »
2 Principal component analysis, SVD Slides »
3 Biplots, multidimensional scaling, Isomap Slides »
4 Clustering (K-means, hierarchical) Slides »
5 Supervised learning: empirical risk minimization, regression (linear, polynomial), overfitting Slides »
6 Generalization, cross-validation, bias-variance trade-off Slides »
7 Regularization (demo, source, data), gradient descent, classification: the Bayes classifier Slides »
8 Classification: linear and quadratic discriminant analysis (LDA, QDA), FDA Slides »
9 QDA (cont.), logistic regression Slides »
10 Logistic regression (cont.), generative vs. discriminative learning, naïve Bayes Slides »
11 Naïve Bayes (cont.), K-nearest neighbors, evaluating performance, ROC curves. Slides »
12 Decision trees Slides »
13 Neural networks Slides »
14 Deep learning (guest lecture by Yee Whye Teh, Oxford/DeepMind) Slides »
15 Bagging, random forests Slides »
16 Boosting Slides »

Problem Sheets

For undergraduate students: hand in solutions in the pigeon hole labeled with your group number by noon on the Thursday of the week before the class.

Class allocation details are on Minerva (accessible from Oxford network).

1 due at noon, January 25th Sheet » Solutions »
2 due at noon, February 8th Sheet » Solutions »
3 due at noon, February 22nd Sheet » Solutions »
4 due at noon, April 19th Sheet » Solutions »
The MSc unassessed and assessed practicals (and solutions) have been posted on WebLearn.

Resources for Python

Resources for R

Background Review Aids