Term: | Hilary Term 2018, Jan 14 - Mar 10 |

Lectures: | Statistics LG.01, Wed 12-13, Fri 10-11 |

MSc Classes: | HT weeks 3,5,7 and TT week 2. Monday 4-5pm in LG.01. |

MSc Practicals: | Week 4 (unassessed) and week 8 (group-assessed, teams of 4) |

Part B Class tutors and TAs: | Tutors: Pier Francesco Palamara and Kaspar Märtens, TAs: Juba Nait Saada, Brian Zhang, and Fergus Boyles. |

Part B Classes: | HT weeks 3,5,7 and TT week 1. Monday 9:00am-10:30am (Pier+Fergus), 11:00am-12:30pm (Pier+Juba), and 1:30pm-3:00pm (Pier+Juba) in LG.04. Tuesday 10:15am-11:45am (Kaspar+Fergus) in LG.05, and 1:15pm-2:45pm (Kaspar+Brian) in LG.04. |

Past exam questions for revision:

- Part B 2017 Q1, Q2 (excluding c), Q3
- Part C 2016 Q1 (a-b), Q2 (a-b)
- Part C 2015 Q1, Q2 (a-c), Q3 (b)
- MSc 2017 Q7, Q8 (excluding b-c)
- MSc 2016 Q6 (a-b), Q7
- MSc 2014 Q7
- MSc 2012 Q6, Q7

(To make the consultation sessions more efficient,

- Week 2 Wed 10:30am-11:30am LG.01 - revision class (Kaspar)
  - Tentative plan: 2015 Part C Q2 (a-c); 2016 Part C Q1 (a); 2017 MSc Q7 (c-d)
- Week 3 Fri 09:00am-10:30am LG.04 - consultation (Pier)
- Week 4 Thu 09:00am-10:00am LG.01 - revision class (Pier)
  - Tentative plan: a subset of { 2016 Part C Q1 (b) or 2014 MSc Q7 (c); 2015 Part C Q1 (c); 2016 MSc Q7 (b-c); 2017 MSc Q8 (d) }
- Week 5 Wed 09:00am-10:30am LG.04 - consultation (Kaspar)

- Problem Sheet 1 and solutions.
- Problem Sheet 2 and solutions.
- MSc unassessed practical (and solutions) has been posted on WebLearn.
- Problem Sheet 3 and solutions.
- MSc assessed practical has been posted on WebLearn.
- Problem Sheet 4 and solutions.

Slides will be made available as the course progresses and periodically updated. Please check back for updates.

- Lecture 1 (17/01): **Introduction, unsupervised learning: exploratory data analysis** (slides)
- Lecture 2 (19/01): **Principal component analysis, SVD** (slides)
- Lecture 3 (24/01): **Biplots, multidimensional scaling, Isomap** (slides)
- Lecture 4 (26/01): **Clustering (K-means, hierarchical)** (slides)
- Lecture 5 (31/01): **Supervised learning: empirical risk minimization, regression (linear, polynomial), overfitting** (slides)
- Lecture 6 (02/02): **Generalization, cross-validation, bias-variance trade-off** (slides)
- Lecture 7 (07/02): **Regularization, gradient descent, classification: the Bayes classifier** (slides, regularization demo, source, data)
- Lecture 8 (09/02): **Classification: linear and quadratic discriminant analysis (LDA, QDA), FDA** (slides)
- Lecture 9 (14/02): **QDA (cont.), logistic regression** (slides)
- Lecture 10 (16/02): **Logistic regression (cont.), generative vs. discriminative learning, naïve Bayes** (slides)
- Lecture 11 (21/02): **Naïve Bayes (cont.), K-nearest neighbors, evaluating performance, ROC curves** (slides)
- Lecture 12 (23/02): **Decision trees** (slides)
- Lecture 13 (28/02): **Neural networks** (slides)
- Lecture 14 (02/03): **Deep learning (guest lecture by Yee Whye Teh, Oxford/DeepMind)** (slides)
- Lecture 15 (07/03): **Bagging, random forests** (slides)
- Lecture 16 (09/03): **Boosting** (slides)

- Installation
- RStudio
- Part A Statistical Programming at Oxford
- DataCamp tutorial
- Coursera R programming course
- Intro to tidyverse (advanced)

- Matrix and Gaussian identities - short useful reference for machine learning.
- Linear Algebra Review and Reference - useful selection for machine learning.
- Video reviews on Linear Algebra by Zico Kolter.
- Video reviews on Multivariate Calculus and SVD by Aaditya Ramdas.
- The Matrix Cookbook - extensive reference.