Algorithmic Foundations of Learning

Prof. Patrick Rebeschini, University of Oxford, Hilary Term 2022

Course offered to: Part C, MSc in Mathematical Sciences (OMMS), and MSc in Statistical Science



General Information

Instructor: Patrick Rebeschini (patrick.rebeschini AT stats.ox.ac.uk)
Class Tutors and Teaching Assistants: Tyler Farghly (tyler.farghly AT keble.ox.ac.uk), Carlo Alfano (carlo.alfano AT linacre.ox.ac.uk)
Lectures: Mondays 9:00-10:00 & Wednesdays 9:00-10:00, Weeks 1-8, LG.01. Recordings are available online on Canvas.
Classes (Tutorials): Part C and OMMS: Weeks 3, 5, 7 of Hilary Term and Week 1 of Trinity Term; MSc: Weeks 2, 4, 6, 8 of Hilary Term.


Classes (Tutorials)

Group | Time | Location | Class Tutor / Teaching Assistant
Part C and OMMS, set 1 | Thursdays 14:00-15:30, Weeks 3, 5, 7 (Hilary Term) and Week 1 (Trinity Term) | LG.01 | Patrick Rebeschini / Carlo Alfano
Part C and OMMS, set 2 | Thursdays 15:30-17:00, Weeks 3, 5, 7 (Hilary Term) and Week 1 (Trinity Term) | LG.01 | Patrick Rebeschini / Carlo Alfano
Part C and OMMS, set 3 | Fridays 9:00-10:30, Weeks 3, 5, 7 (Hilary Term) and Week 1 (Trinity Term) | LG.03 | Tyler Farghly
MSc in Statistical Science | Tuesdays 10:30-12:00, Weeks 2, 4, 6, 8 (Hilary Term) | LG.01 | Patrick Rebeschini

Syllabus

The course aims to provide a rigorous theoretical account of the main ideas underlying machine learning and a principled framework for understanding the algorithmic paradigms in use, along with non-asymptotic methods for the study of random structures in high-dimensional probability, statistics, and optimisation. (A short illustrative code sketch follows the topic list below.)
  • Statistical learning frameworks for prediction, estimation, and online learning.
  • Probability:
    • Maximal inequalities.
    • Rademacher and Gaussian complexities.
    • Elements of VC theory.
    • Covering and packing numbers.
    • Chaining.
    • Concentration inequalities.
  • Statistics:
    • Bayes decision rules.
    • Empirical risk minimisation. Error decomposition: generalisation, optimisation, and approximation.
    • Learning via uniform convergence, margin bounds, and algorithmic stability.
    • Regularisation: explicit (constraints and penalisation) and implicit (algorithmic).
    • Convex loss surrogates.
    • Slow and fast rates.
    • Minimax lower bounds and hypothesis testing.
  • Optimisation:
    • Elements of convex theory.
    • Oracle model. Gradient descent. Mirror descent.
    • Stochastic oracle model. Stochastic gradient descent. Stochastic mirror descent. Variance reduction techniques.
    • Online optimisation.
  • Examples:
    • Linear predictors, including Boosting.
    • Non-linear predictors, including Support Vector Machines and Neural Networks.
    • High-dimensional estimators for sparse and low-rank problems, including Lasso.
    • Online learning, including multi-armed bandit problems and algorithms, and reinforcement learning.
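
A minimal, self-contained sketch (not taken from the course materials) of how several of these items fit together: empirical risk minimisation over linear predictors with a convex loss surrogate (the logistic loss), explicit ridge penalisation, and plain gradient descent. All specifics below (the synthetic data, step size, and penalty level) are hypothetical choices made purely for illustration.

```python
# Illustrative sketch only: penalised empirical risk minimisation for a linear
# classifier with the logistic surrogate loss, trained by gradient descent.
# The data, step size, and ridge penalty below are hypothetical choices.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: features x_i in R^d, labels y_i in {-1, +1}.
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = np.sign(X @ w_true + 0.1 * rng.normal(size=n))

def penalised_risk_gradient(w, X, y, lam):
    """Gradient of R_n(w) = (1/n) sum_i log(1 + exp(-y_i <w, x_i>)) + (lam/2) ||w||^2."""
    margins = y * (X @ w)
    # coeffs_i = y_i * l'(m_i), where l(m) = log(1 + exp(-m)) and l'(m) = -1 / (1 + exp(m))
    coeffs = -y / (1.0 + np.exp(margins))
    return X.T @ coeffs / len(y) + lam * w

# Gradient descent on the penalised empirical risk (explicit regularisation).
w = np.zeros(d)
step_size, lam = 0.5, 0.01
for _ in range(500):
    w -= step_size * penalised_risk_gradient(w, X, y, lam)

print(f"training 0-1 error: {np.mean(np.sign(X @ w) != y):.3f}")
```

Replacing the full-batch gradient with the gradient at a single uniformly sampled example turns the loop into stochastic gradient descent, which corresponds to the stochastic oracle model listed above.
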
Readings:
  • Shai Shalev-Shwartz and Shai Ben-David. Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, 2014.
  • Sébastien Bubeck. Convex Optimization: Algorithms and Complexity. Foundations and Trends in Machine Learning, 2015.
  • Ramon van Handel. Probability in High Dimension. Lecture notes available online, 2016.
  • Tor Lattimore and Csaba Szepesvári. Bandit Algorithms. Book available online (draft).
  • Martin J. Wainwright. High-Dimensional Statistics. A Non-Asymptotic Viewpoint. Cambridge University Press, 2019.
Prerequisites:
  • The course requires a good level of mathematical maturity. Students are expected to have a working knowledge of core concepts in probability theory (basic properties of probabilities, such as union bounds, and of conditional expectations, such as the tower property; basic inequalities, such as Markov's and Jensen's), statistics (confidence intervals, hypothesis testing), and linear algebra (matrix-vector operations, eigenvalues and eigenvectors; basic inequalities, such as Cauchy-Schwarz's and Hölder's). Previous exposure to machine learning (empirical risk minimisation, overfitting, regularisation) is recommended.
  • Students would benefit from being familiar with the material covered in SB2.1 (formerly SB2a) Foundations of Statistical Inference (in particular, Decision Theory) and in SB2.2 (formerly SB2b) Statistical Machine Learning.


Lecture Material

Lectures have been pre-recorded and are available online on Canvas.

The lecture notes are self-contained. They give full details of the results and proofs covered in this course, and include all the material needed to complete the problem sheets and the exams.
The slides provide a high-level narrative and highlight the main ideas of the results covered in the lecture notes.
Proof ideas and graphical illustrations are given during lectures.
Students are expected to study all the proofs presented in the lecture notes, including those not included in the slides.

0 Preliminaries Notes »
1 Introduction Notes »  Slides »  Video »
2 Maximal Inequalities and Rademacher Complexity Notes »  Slides »  Video »
3 Rademacher Complexity. Examples Notes »  Slides »  Video »
4 VC Dimension. Covering and Packing Numbers Notes »  Slides »  Video »
5 Covering Numbers Bounds for Rademacher Complexity. Chaining Notes »  Slides »  Video »
6 Sub-Gaussian Concentration Inequalities. Bounds in Probability Notes »  Slides »  Video »
7 Bernstein’s Concentration Inequalities. Fast Rates Notes »  Slides »  Video »
8 Convex Loss Surrogates. Elements of Convex Theory Notes »  Slides »  Video »
9 Oracle Model. Gradient Descent Methods Notes »  Slides »  Video »
10 Non-Euclidean Settings. Mirror Descent Methods Notes »  Slides »  Video »
11 Stochastic Oracle Model. Algorithmic Stability Notes »  Slides »  Video »
12 High-Dimensional Statistics. Gaussian Complexity Notes »  Slides »  Video »
13 The Lasso Estimator. Proximal Gradient Methods Notes »  Slides »  Video »
14 Least Squares Regression. Implicit Bias and Implicit Regularization Notes »  Slides »  Video »
15 Stochastic Multi-Armed Bandit Problem and Algorithms Notes »  Slides »  Video »
16 Minimax Lower Bounds and Hypothesis Testing Notes »  Slides »  Video »


Problem Sheets

Problem Sheet 1 | Problems on material covered up to Lecture 2 | Sheet » (due 48 hours before class)
Problem Sheet 2 | Problems on material covered up to Lecture 7 | Sheet » (due 48 hours before class)
Problem Sheet 3 | Problems on material covered up to Lecture 11 | Sheet » (due 48 hours before class)
Problem Sheet 4 | Problems on material covered up to Lecture 16 | Sheet » (due 48 hours before class)
Type A and type B questions cover examinable material. Type C questions are optional and not examinable.

Part C and OMMS students: upload a PDF of your solutions on Canvas by the assignment deadline. Only solutions received before the deadline will be graded by the TAs and returned to you via Canvas. Model solutions will be made available on Canvas prior to tutorials.


Q&A Sessions (only for MSc students)

Two online-only Q&A sessions will be offered to MSc students (Part C/OMMS students should not take part) on Tuesdays of Weeks 3 and 7, from 10:30 to 11:30. Teams invitations and/or online links will be sent to students; details will be available on Canvas.


Exams and Revision Sessions (Trinity Term)

The exam in Trinity Term 2022 will be in person and partially open book: you will sit the exam in person in rooms provided by the University, with invigilators present, and you will be permitted to bring a couple of pages of notes prepared by yourself, with adjustments in cases of disability. Precise guidelines will be specified and confirmed in the Examiners' notices to candidates.

Part C/OMMS exam papers and solutions can be found on WebLearn, following this link (searching for "Algorithmic Foundations of Learning") and this link (section "SC10 Algorithmic Foundations of Learning"), respectively. MSc exam papers and solutions can be found on WebLearn, following this link (searching for "Principles of Statistical Analysis") and this link, respectively.

A selection of previous exam questions will be covered during revision classes in Trinity Term.

Feedback

Leave your feedback for the instructors, anonymously: click here