Further Statistical Methods - Lectures 4 and 5

Missing data and the EM algorithm

Missing data

These lectures are concerned with problems associated with missing data. In particular, we analyse different types of missingness and different methods and models for dealing with it, including the important notions of MCAR (missing completely at random) and MAR (missing at random).
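
To make the distinction between MCAR and MAR concrete, here is a minimal simulation sketch. Everything in it is an illustrative assumption rather than material from the lectures: the model (y = x + noise with x always observed), the missingness probabilities, and the function name `simulate_missingness`. Under MCAR the mean of the observed y values stays close to the full-data mean; under MAR, where missingness depends on the observed x, the naive complete-case mean is biased.

```python
import random
from statistics import mean


def simulate_missingness(n=20000, seed=0):
    """Compare complete-case means of y under MCAR and MAR (illustrative only)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)        # x is always observed
        y = x + rng.gauss(0.0, 1.0)    # y depends on x; its true mean is 0
        xs.append(x)
        ys.append(y)

    # MCAR: y is observed with probability 0.5, independently of all data.
    mcar_observed = [y for y in ys if rng.random() < 0.5]

    # MAR: the probability that y is observed depends only on the observed x
    # (0.9 when x < 0, 0.1 otherwise), not on y itself once x is given.
    mar_observed = [
        y for x, y in zip(xs, ys)
        if rng.random() < (0.9 if x < 0 else 0.1)
    ]

    return mean(ys), mean(mcar_observed), mean(mar_observed)


full_mean, mcar_mean, mar_mean = simulate_missingness()
print("full data:", full_mean)
print("MCAR complete cases:", mcar_mean)
print("MAR complete cases:", mar_mean)
```

The bias under MAR does not mean the data are useless: a likelihood-based analysis that uses the observed x values can still recover the mean of y, which is the setting the EM algorithm addresses.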

The EM algorithm is a clever iterative algorithm for maximizing the likelihood function based on the observed data, ignoring the missing data mechanism. This is the correct likelihood to use if the data are MAR and the parameters of the missing data mechanism are distinct from the parameters of interest.
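
The reason the missing data mechanism can be ignored is sketched below, followed by the two steps of the EM iteration. The notation is an assumption introduced here, not taken from the lectures: y = (y_obs, y_mis) denotes the complete data, r the missingness indicators, θ the parameters of interest and ψ the parameters of the missingness mechanism.

```latex
% Notation assumed for illustration:
%   y = (y_obs, y_mis): complete data, r: missingness indicators,
%   theta: parameters of interest, psi: parameters of the missingness mechanism.
\begin{align*}
L_{\text{full}}(\theta,\psi)
  &= \int f(y_{\text{obs}}, y_{\text{mis}} \mid \theta)\,
          f(r \mid y_{\text{obs}}, y_{\text{mis}}, \psi)\, dy_{\text{mis}} \\
  &= f(r \mid y_{\text{obs}}, \psi)
     \int f(y_{\text{obs}}, y_{\text{mis}} \mid \theta)\, dy_{\text{mis}}
     \qquad \text{(MAR)} \\
  &= f(r \mid y_{\text{obs}}, \psi)\, f(y_{\text{obs}} \mid \theta). \\[1ex]
\text{E-step:}\quad Q(\theta \mid \theta^{(t)})
  &= \mathbb{E}\bigl[\log f(y_{\text{obs}}, y_{\text{mis}} \mid \theta)
     \,\big|\, y_{\text{obs}}, \theta^{(t)}\bigr], \\
\text{M-step:}\quad \theta^{(t+1)}
  &= \arg\max_{\theta}\, Q(\theta \mid \theta^{(t)}).
\end{align*}
```

When θ and ψ are distinct, the first factor does not involve θ, so maximizing the full likelihood over θ is the same as maximizing the observed-data likelihood f(y_obs | θ).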

The standard reference on missing data methods is

R. J. A. Little and D. B. Rubin (2002). Statistical Analysis with Missing Data, 2nd ed. Wiley, New York.

These lectures cover material in the Introduction and in Chapter 6.2 (Ch. 5 in the 1st edition) as well as Ch. 8 (Ch. 7 in the 1st edition).

This little script implements the steps of the EM algorithm in the special case of normal mixtures with known variance and one of the mixture components known.
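
The script itself is not reproduced on this page. As a rough illustration of the same special case, the following is a minimal Python sketch under one assumed interpretation: a two-component mixture p·N(μ, σ²) + (1 − p)·N(μ₀, σ²) where σ and the mean μ₀ of one component are known, and the mixing weight p and the mean μ are estimated. The function names, starting values and simulated data are all hypothetical and are not taken from the course script.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def em_two_component(data, mu_known=0.0, sd=1.0, p=0.5, mu=1.0,
                     max_iter=500, tol=1e-8):
    """EM for p*N(mu, sd^2) + (1-p)*N(mu_known, sd^2); estimates p and mu."""
    loglik_trace = []
    for _ in range(max_iter):
        # Weighted component densities at the current parameter values.
        f_unknown = [p * normal_pdf(x, mu, sd) for x in data]
        f_known = [(1.0 - p) * normal_pdf(x, mu_known, sd) for x in data]
        loglik_trace.append(
            sum(math.log(a + b) for a, b in zip(f_unknown, f_known))
        )

        # E-step: posterior probability that each observation comes from
        # the component with unknown mean.
        w = [a / (a + b) for a, b in zip(f_unknown, f_known)]

        # M-step: weighted proportion and weighted mean.
        total = sum(w)
        new_p = total / len(data)
        new_mu = sum(wi * x for wi, x in zip(w, data)) / total

        converged = abs(new_p - p) + abs(new_mu - mu) < tol
        p, mu = new_p, new_mu
        if converged:
            break
    return p, mu, loglik_trace


def simulate_mixture(n, p, mu, mu_known=0.0, sd=1.0, seed=1):
    """Draw n observations from the mixture (for demonstration only)."""
    rng = random.Random(seed)
    return [rng.gauss(mu if rng.random() < p else mu_known, sd)
            for _ in range(n)]


data = simulate_mixture(2000, p=0.3, mu=3.0)
p_hat, mu_hat, trace = em_two_component(data)
print("estimated p:", p_hat, "estimated mu:", mu_hat)
```

Here the component labels play the role of the missing data. The recorded log-likelihood values in `trace` should never decrease from one iteration to the next, which is a useful check on any EM implementation.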

Overheads

screen viewing

printing


Last updated: Monday, 05 February 2007 18:18
Steffen L. Lauritzen