# Algorithmic Foundations of Learning

#### Prof. Patrick Rebeschini, University of Oxford, Michaelmas Term 2020

Course offered to: Part C, MSc in Mathematical Sciences (OMMS), and MSc in Statistical Science

## General Information

Instructor | Patrick Rebeschini (patrick.rebeschini AT stats.ox.ac.uk)
---|---
Class Tutors and Teaching Assistants | Tomas Vaškevičius (tomas.vaskevicius AT spc.ox.ac.uk), Fan Wu (fan.wu AT spc.ox.ac.uk)
Lectures | Available online in Canvas
Tutorials | Weeks 2, 4, 6, and 8 of Michaelmas Term

## Tutorials

Group | Time | Location | Class Tutor / Teaching Assistant
---|---|---|---
Part C and OMMS, set 1 | Mondays 11:30-13:00, Weeks 2, 4, 6, 8 | Social Area | Patrick Rebeschini / Fan Wu
Part C and OMMS, set 2 | Mondays 14:00-15:30, Weeks 2, 4, 6, 8 | Social Area | Patrick Rebeschini / Fan Wu
Part C and OMMS, set 3 | Tuesdays 9:30-11:00, Weeks 2, 4, 6, 8 | Social Area | Tomas Vaškevičius
MSc in Statistical Science | Thursdays 9:00-10:00, Weeks 2, 4, 6, 8 | IT, LLT, SLT, S | Patrick Rebeschini

## Syllabus

The course is meant to provide a rigorous theoretical account of the main ideas underlying machine learning, and to offer a principled framework to understand the algorithmic paradigms being used, along with non-asymptotic methods for the study of random structures in high-dimensional probability, statistics, and optimisation.

- Statistical learning frameworks for prediction, estimation, and online learning.
- Probability:
  - Maximal inequalities.
  - Rademacher and Gaussian complexities.
  - Elements of VC theory.
  - Covering and packing numbers.
  - Chaining.
  - Concentration inequalities.
- Statistics:
  - Bayes decision rules.
  - Empirical risk minimisation. Error decomposition: generalisation, optimisation, and approximation.
  - Learning via uniform convergence, margin bounds, and algorithmic stability.
  - Regularisation: explicit (constraints and penalisation) and implicit (algorithmic).
  - Convex loss surrogates.
  - Slow and fast rates.
  - Minimax lower bounds and hypothesis testing.
- Optimisation:
  - Elements of convex theory.
  - Oracle model. Gradient descent. Mirror descent.
  - Stochastic oracle model. Stochastic gradient descent. Stochastic mirror descent. Variance reduction techniques.
  - Approximate Message Passing.
  - Online optimisation.
- Examples:
  - Linear predictors, including Boosting.
  - Non-linear predictors, including Support Vector Machines and Neural Networks.
  - High-dimensional estimators for sparse and low-rank problems, including the Lasso.
  - Online learning, including multi-armed bandit problems and reinforcement learning.
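As a small taste of the optimisation strand, the following is a minimal sketch of gradient descent in the smooth convex setting, applied to a least-squares objective. This is illustrative only; the problem instance, step size, and iteration count are our own choices, not code from the lecture notes.

```python
import numpy as np

# Gradient descent on the convex quadratic f(x) = 0.5 * ||A x - b||^2.
# The gradient is A^T (A x - b); the smoothness constant L is the
# largest eigenvalue of the Hessian A^T A, and step size 1/L guarantees
# convergence to the least-squares solution.

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)

def grad(x):
    """Gradient of f at x."""
    return A.T @ (A @ x - b)

L = np.linalg.eigvalsh(A.T @ A).max()  # smoothness constant

x = np.zeros(5)
for _ in range(500):
    x = x - grad(x) / L

# Compare with the closed-form least-squares solution.
x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x, x_star, atol=1e-8))  # True
```

With the strongly convex objective above, the iterates contract towards the minimiser at a linear rate, which is why a few hundred iterations suffice here.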

## Resources

- Shai Shalev-Shwartz and Shai Ben-David. *Understanding Machine Learning: From Theory to Algorithms.* Cambridge University Press, 2014.
- Sébastien Bubeck. *Convex Optimization: Algorithms and Complexity.* Foundations and Trends in Machine Learning, 2015.
- Ramon van Handel. *Probability in High Dimension.* Lecture notes available online, 2016.
- Tor Lattimore and Csaba Szepesvári. *Bandit Algorithms.* Book available online (draft).
- Martin J. Wainwright. *High-Dimensional Statistics: A Non-Asymptotic Viewpoint.* Cambridge University Press, 2019.

## Prerequisites

- The course requires a good level of mathematical maturity. Students are expected to have a working knowledge of core concepts in probability theory (basic properties of probabilities, such as union bounds, and of conditional expectations, such as the tower property; basic inequalities, such as Markov's and Jensen's), statistics (confidence intervals, hypothesis testing), and linear algebra (matrix-vector operations, eigenvalues and eigenvectors; basic inequalities, such as Cauchy-Schwarz's and Hölder's). Previous exposure to machine learning (empirical risk minimisation, overfitting, regularisation) is recommended.
- Students would benefit from being familiar with the material covered in SB2.1 (formerly SB2a) Foundations of Statistical Inference (in particular, Decision Theory) and in SB2.2 (formerly SB2b) Statistical Machine Learning.
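As a self-check on the prerequisite level, here is an empirical sanity check of Markov's inequality, P(X ≥ t) ≤ E[X]/t for non-negative X, one of the basic tools assumed above. The distribution and thresholds are our own illustrative choices.

```python
import numpy as np

# Markov's inequality: for X >= 0 and t > 0, P(X >= t) <= E[X] / t.
# We check it empirically on exponential samples (E[X] = 1, so the
# bound is 1/t, while the true tail probability is exp(-t)).

rng = np.random.default_rng(1)
samples = rng.exponential(scale=1.0, size=100_000)

for t in [1.0, 2.0, 5.0]:
    empirical_tail = (samples >= t).mean()
    markov_bound = samples.mean() / t
    print(f"t={t}: P(X >= t) ~ {empirical_tail:.4f} <= bound {markov_bound:.4f}")
```

The bound is loose here (exp(-t) versus 1/t), which is exactly why the course develops sharper concentration inequalities such as sub-Gaussian and Bernstein bounds.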

## Lecture Material

Lectures have been pre-recorded and are available online in Canvas.

The lecture notes are self-contained. They give full details of the results and proofs covered in this course, and include all the material needed to complete the problem sheets and the exams.

The slides provide a high-level narrative and highlight the main ideas of the results covered in the lecture notes.

Proof ideas and graphical illustrations are given during lectures.

**Students are expected to study all the proofs presented in the lecture notes, including those not included in the slides.**

0 | Preliminaries | Notes »
---|---|---
1 | Introduction | Notes » Slides » Video »
2 | Maximal Inequalities and Rademacher Complexity | Notes » Slides » Video »
3 | Rademacher Complexity. Examples | Notes » Slides » Video »
4 | VC Dimension. Covering and Packing Numbers | Notes » Slides » Video »
5 | Covering Numbers Bounds for Rademacher Complexity. Chaining | Notes » Slides » Video »
6 | Sub-Gaussian Concentration Inequalities. Bounds in Probability | Notes » Slides » Video »
7 | Bernstein's Concentration Inequalities. Fast Rates | Notes » Slides » Video »
8 | Convex Loss Surrogates. Elements of Convex Theory | Notes » Slides » Video »
9 | Oracle Model. Gradient Descent Methods | Notes » Slides » Video »
10 | Non-Euclidean Settings. Mirror Descent Methods | Notes » Slides » Video »
11 | Stochastic Oracle Model. Algorithmic Stability | Notes » Slides » Video »
12 | High-Dimensional Statistics. Gaussian Complexity | Notes » Slides » Video »
13 | The Lasso Estimator. Proximal Gradient Methods | Notes » Slides » Video »
14 | Least Squares Regression. Implicit Bias and Implicit Regularization | Notes » Slides » Video »
15 | Stochastic Multi-Armed Bandit Problem and Algorithms | Notes » Slides » Video »
16 | Minimax Lower Bounds and Hypothesis Testing | Notes » Slides » Video »

## Problem Sheets

Problem Sheet 1 | Problems on material covered up to Lecture 2 | Sheet » due on Sunday, October 18th 2020, at 23:00
---|---|---
Problem Sheet 2 | Problems on material covered up to Lecture 7 | Sheet » due on Friday, October 30th 2020, at 23:00
Problem Sheet 3 | Problems on material covered up to Lecture 11 | Sheet » due on Friday, November 13th 2020, at 23:00
Problem Sheet 4 | Problems on material covered up to Lecture 16 | Sheet » due on Friday, November 27th 2020, at 23:00

## Office Hours

A 30-minute office hour is offered every two weeks via Microsoft Teams. Details are available in Canvas.

## Exams and Revision Sessions (Trinity Term)

The exams will be held online in Trinity Term 2020-21 in an open-book format; further information can be found here. Part C/OMMS exam papers and solutions can be found on WebLearn, following this link (search for "Algorithmic Foundations of Learning") and this link (Part C -> SC10), respectively. MSc exam papers and solutions can be found on WebLearn, following this link (search for "Principles of Statistical Analysis") and this link, respectively.

A selection of previous exam questions will be covered during revision classes in Trinity Term.

## Feedback

Leave your feedback for the instructors anonymously: click here