SDMML HT 2016

Statistical Data Mining and Machine Learning 
Term: 
Hilary Term 2016, Jan 18 - Mar 11
Lectures: 
Statistics LG.01, Mon 14-15, Wed 10-11
MSc Classes: 
Statistics LG.01, Tue 12-13 (weeks 2, 4, 8); LG.02, Tue 12-13 (week 6)
MSc Practicals: 
Statistics LG.02, Fri 14-16 (weeks 5 and 8); the deadline for the Week 8 practical is Tue noon, Week 9 (March 15th)
Part C Class Tutors: 
Jovana Mitrovic and Leonard Hasenclever 
Part C Problem Sheets: 
due on Wednesdays 10am (weeks 3-8)
Part C Classes: 
Statistics LG.04, Group I: Fri 14-15 (weeks 3-8), Group II: Fri 15-16 (weeks 3-8)
TT Part C Revision: 
Statistics LG.03, Wed TT Week 4, 11-12 (2014 paper)

Statistics LG.03, Thu TT Week 5, 15-16 (2015 paper)
Course Materials
 Lecture 1 (18/01): Introduction, Principal Components Analysis
screen version,
print version
 Lecture 2 (20/01): Biplots, Multidimensional Scaling
screen version,
print version
 Lecture 3 (25/01): MDS, Isomap, Clustering
screen version,
print version
 Lectures 4 & 5 (27/01 & 01/02): Probabilistic Unsupervised Learning, EM Algorithm
screen version,
print version
 Lecture 6 (03/02): Introduction to Supervised Learning, Linear Discriminant Analysis
screen version,
print version
 Lecture 7 (08/02): Reduced-Rank LDA, QDA
screen version,
print version
 Lecture 8 (10/02): Naive Bayes, Logistic Regression
screen version,
print version
 Lectures 9 & 10 (15/02 & 17/02): Support Vector Machines, Kernel Trick
screen version,
print version
 Lecture 11 (22/02): Kernel Methods, Smoothing and Nearest Neighbours
screen version,
print version
 Lecture 12 (24/02): Neural Networks
screen version,
print version
 Lecture 13 (29/02): Evaluation, Validation, Model Selection
screen version,
print version
 Lectures 14 & 15 (02/03 & 07/03): Decision Trees, Ensemble Methods
screen version,
print version (section on Boosting updated on 06/03)
 Lecture 16 (09/03): Bayesian Learning
screen version,
print version
Problem Sheets
Part C
MSc
Textbooks and Background Reading
Recommended textbooks on statistical data mining and machine learning:

Hastie, Tibshirani and Friedman, The Elements of Statistical Learning, Springer.
[ebook]

James, Witten, Hastie and Tibshirani, An Introduction to Statistical
Learning with Applications in R, Springer.
[ebook]

Ripley, Pattern Recognition and Neural Networks, Cambridge University Press.

Bishop, Pattern Recognition and Machine Learning, Springer.

Murphy, Machine Learning: A Probabilistic Perspective, MIT Press.
Background Review Aids:
R
You will need to use the R statistical programming language and
environment for this course.