Statistical Machine Learning
Instructor: Prof. Pier Francesco Palamara, University of Oxford, Hilary term 2019
Course offered to Part B students (SB2b) and MSc students (SM4)
New: Revisions (Trinity Term)
You can access past exam questions via OXAM, which is on WebLearn [solutions here]. Note that the syllabus of this course is different from previous editions (e.g. it used to be a Part C course). Past exam questions for revision include:
- Part B 2017 Q1, Q2 (excluding c), Q3
- Part C 2016 Q1 (a-b), Q2 (a-b)
- Part C 2015 Q1, Q2 (a-c), Q3 (b)
- MSc 2017 Q7, Q8 (excluding b-c)
- MSc 2016 Q6 (a-b), Q7
- MSc 2014 Q7
- MSc 2012 Q6, Q7
- MSc and Part B 2018
Revision Class 1 | Week 2, Wed 9:00am-10:00am, LG.01 (with Pier Palamara). Tentative plan: 2018 Part B exam; more problems from the list above if time allows. |
---|---|
Consultation Session 1 (Part B students only) | Week 3, Tue 3:00pm-4:00pm, LG.04 (with Anthony Caterini). To make the consultation sessions more efficient, please send your questions by email beforehand (see email in General Info). |
Revision Class 2 | Week 4, Tue 10:00am-11:00am, LG.01 (with Brian Zhang). Tentative plan: 2015 Part C Q1, Q2 (a-c), Q3 (b); selected parts of 2017 MSc Q7. |
Consultation Session 2 (Part B students only) | Week 5, Tue 10:00am-11:00am, LG.04 (with Brian Zhang). To make the consultation sessions more efficient, please send your questions by email beforehand (see email in General Info). |
General Information
Course Team | Pier Palamara (course instructor and tutor), Brian Zhang (tutor), Anthony Caterini (tutor), Juba Nait Saada (TA), Robert Hu (TA), Emilien Dupont (TA), and Fergus Boyles (TA) |
---|---|
Lectures | HT Weeks 1-8, Statistics room LG.01, Wed 10am-11am, Fri 9am-10am. |
Tutorials | HT weeks 3, 5, 7/8, TT Week 1/2 (details below) |
Practicals | MSc only: HT Week 4, Friday 11am-1pm (unassessed) and HT Week 8, Friday 11am-1pm (assessed, groups of 4) |
Tutorials
Please get in touch to change group or for any other matter related to class allocation.

Group | Time (location) | Class Tutor / TA |
---|---|---|
SB2B group 1 | Mon 9am-10:30am, HT weeks 3 (LG.04), 5 (LG.04), 7 (LG.04), TT Week 1 (LG.04) | Pier Palamara / Emilien Dupont |
SB2B group 2 | Mon 11am-12:30pm, HT weeks 3 (LG.04), 5 (LG.04), 7 (LG.04), TT Week 1 (LG.04) | Pier Palamara / Emilien Dupont |
SB2B group 3 | Mon 1pm-2:30pm, HT weeks 3 (LG.04), 5 (LG.04), 7 (LG.04), TT Week 1 (LG.04) | Pier Palamara / Juba Nait Saada |
SB2B group 4 | Tue 1pm-2:30pm, HT weeks 3 (LG.04), 5 (LG.05), 8 (LG.05), TT Week 2 (LG.04) | Brian Zhang / Robert Hu |
SB2B group 5 | Tue 3pm-4:30pm, HT weeks 3 (LG.04), 5 (LG.05), 8 (LG.05), TT Week 2 (LG.04) | Brian Zhang / Robert Hu |
SB2B group 6 | Tue 1:30pm-3pm, HT weeks 3 (LG.05), 7 (LG.05); HT week 5 only: Fri 1:30pm-3pm (LG.05); TT week 1: Fri 10:30am-12pm (LG.05) | Anthony Caterini / Fergus Boyles |
SM4 (single group) | Mon 3:00pm-4:00pm, HT weeks 3 (LG.01), 5 (LG.01), 7 (LG.01), TT Week 1 (LG.01) | Pier Palamara |
Syllabus
Recommended prerequisites:
Part A A9 Statistics and A8 Probability. SB2a Foundations of Statistical Inference is useful but not essential.
Aims and Objectives:
Machine learning studies methods that can automatically detect patterns in data and then use these patterns to predict future data or other outcomes of interest. It is widely used across many scientific and engineering disciplines. This course covers the statistical fundamentals of machine learning, with a focus on supervised learning and empirical risk minimisation. Both generative and discriminative learning frameworks are discussed, and a variety of widely used classification algorithms are surveyed.
Synopsis
Visualisation and dimensionality reduction: principal components analysis, biplots and singular value decomposition. Multidimensional scaling. K-means clustering. Introduction to supervised learning. Evaluating learning methods with training/test sets. Bias/variance trade-off, generalisation and overfitting. Cross-validation. Regularisation. Performance measures, ROC curves. K-nearest neighbours as an example classifier. Linear models for classification. Discriminant analysis. Logistic regression. Generative vs Discriminative learning. Naive Bayes models. Decision trees, bagging, random forests, boosting. Neural networks and deep learning.
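As a small taster of the supervised-learning topics above (K-nearest neighbours and evaluation on a held-out test set), here is a minimal sketch in Python; the toy data points and labels are invented for illustration, and the course itself works in R.

```python
# 1-nearest-neighbour classification with a held-out test set:
# predict each test point's label from its closest training point.

def euclidean(a, b):
    # Euclidean distance between two points given as tuples.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def nn_predict(train_X, train_y, x):
    # Return the label of the single nearest training point (K = 1).
    distances = [(euclidean(p, x), label) for p, label in zip(train_X, train_y)]
    return min(distances)[1]

# Toy 2-D data: class 0 clusters near the origin, class 1 near (5, 5).
train_X = [(0, 0), (1, 0), (0, 1), (5, 5), (4, 5), (5, 4)]
train_y = [0, 0, 0, 1, 1, 1]
test_X = [(0.5, 0.5), (4.5, 4.5)]
test_y = [0, 1]

preds = [nn_predict(train_X, train_y, x) for x in test_X]
accuracy = sum(p == t for p, t in zip(preds, test_y)) / len(test_y)
print(preds, accuracy)
```

Evaluating on points the classifier never saw during training is the train/test idea that the bias/variance and cross-validation lectures build on.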
Reading
C. Bishop, Pattern Recognition and Machine Learning, Springer, 2007.
T. Hastie, R. Tibshirani, J. Friedman, The Elements of Statistical Learning, Springer, 2009.
K. Murphy, Machine Learning: a Probabilistic Perspective, MIT Press, 2012.
Further Reading
B. D. Ripley, Pattern Recognition and Neural Networks, Cambridge University Press, 1996.
G. James, D. Witten, T. Hastie, R. Tibshirani, An Introduction to Statistical Learning, Springer, 2013.
Slides
Slides will be made available as the course progresses and periodically updated. Please check back for updates.
You can use the material from last year (here) to get a sense of future lectures.
Please email me to report any typos or errors.
1 | Introduction, unsupervised learning: exploratory data analysis | Slides » |
---|---|---|
2 | Principal component analysis | Slides » |
3 | SVD, Biplots, multidimensional scaling, Isomap | Slides » |
4 | Clustering (K-means, hierarchical) | Slides » |
5 | Supervised learning: empirical risk minimization, regression (linear, polynomial), overfitting | Slides » |
6 | Generalization, cross-validation, bias-variance trade-off | Slides » |
7 | Regularization (demo, source, data), gradient descent, classification: the Bayes classifier | Slides » |
8 | Classification: linear and quadratic discriminant analysis (LDA, FDA) | Slides » |
9 | QDA, logistic regression | Slides » |
10 | Logistic regression (cont.), generative vs. discriminative learning, naïve Bayes | Slides » |
11 | Naïve Bayes (cont.), K-nearest neighbors, evaluating performance, ROC curves | Slides » |
12 | Decision trees | Slides » |
13 | Bagging, random forests | Slides » |
14 | Neural networks | Slides » |
15 | Deep learning (automatic differentiation, convnets, RNNs, attention, tools). Guest lecture by Yee Whye Teh, Oxford/DeepMind | Slides » |
16 | Boosting | Slides » |
Problem Sheets
For undergraduate students: hand in solutions in the pigeonholes labelled with your group number.
Class allocation details are on Minerva (accessible from Oxford network).
1 | due at noon, January 24th | Sheet » | Solutions » |
---|---|---|---|
2 | due at noon, February 7th | Sheet » | Solutions » |
3 | due at noon, February 21st | Sheet » | Solutions » |
4 | due at noon, April 25th | Sheet » | Solutions » |
Resources for R
- Installation
- RStudio
- Part A Statistical Programming at Oxford
- DataCamp tutorial
- Coursera R programming course
- Intro to tidyverse (advanced)
Background Review Aids
- Matrix and Gaussian identities - short useful reference for machine learning.
- Linear Algebra Review and Reference - useful selection for machine learning.
- Video reviews on Linear Algebra by Zico Kolter.
- Video reviews on Multivariate Calculus and SVD by Aaditya Ramdas.
- The Matrix Cookbook - extensive reference.