"Corner Cutting" Approaches to the Ethier-Griffiths-Tavaré Recursions

Calculating the likelihood of a set of sequences sampled from a population is important for estimating selection, recombination, mutation and population structure. Under a simplified model of sequences, the infinite sites model (Kimura and Ohta, 1971), this likelihood can be calculated for a basic population model thanks to a series of recursions published by Ethier, Griffiths and Tavaré between 1987 and 1995. In this project we propose to explore a simple idea for extending and accelerating these fundamental recursions. In a recent paper, Lyngsø, Song and Hein (2008) accelerated likelihood calculations for the case with recombination by a very large factor (10^6 to 10^9) by considering only ancestral states reachable in fewer than k recombinations. Unfortunately, an even larger factor is needed to make the algorithm practical. However, very similar ideas can be applied to the model without recombination in which a DNA model of sequences, rather than the infinite sites model, is used.