RESEARCH INTERESTS

Molecular Evolution
Molecular Population Genetics
Bioinformatics

I am a member of the Bioinformatics Group. I am interested in Molecular Evolution, Molecular Population Genetics, Bioinformatics and Computational Biology. At present I am working on the topics sketched below.

Statistical alignment

The approach taken to alignment by Thorne, Kishino and Felsenstein (TKF) is in my view superior to other approaches, but it still needs considerable development before it is practical for actual data analysis. My main goals for the near future are:

A tractable time-reversible model allowing for longer insertions/deletions.

Combining a profile Hidden Markov Model (HMM) with the TKF model. This leads to interesting problems, as the HMM will have segments added and deleted in the course of evolution.

Hein (2001) gives an algorithm that can analyze a set of sequences related by a binary tree and evolving by the TKF model. It is not practical for real data. In collaboration with Jens Ledet Jensen and Kim Mouridsen, we are working on practical methods based on MCMC techniques.

Coalescent Theory

Including realistic molecular models of Gene Conversion and Recombination in the Coalescent Model. An interesting question in this context is how much population data is needed to distinguish different molecular mechanisms. This should also be of relevance for fine-scale gene mapping.

Methods of evolutionary analysis of sequences that experience Recombination and Gene Conversion.

Stochastic Grammars and Molecular Evolution

Stochastic Grammars are very flexible tools to describe structural relationships in biology, such as secondary structures in proteins and RNA (and much more). Goldman, Thorne & Jones were the first to combine Stochastic Grammars with Molecular Evolution. We have since initiated three new applications of this:

Stochastic Context Free Grammars, Molecular Evolution and RNA Secondary Structure, where Bjarne Knudsen is the main contributor.
This started in the summer of 98 and has been a major success.

Gene Finding and Molecular Evolution, where Jakob Skou Pedersen is the main contributor. This started in February 2000 and is very promising. It will be especially useful now that many closely related genomes are available.

I hope our next project will be to devise a Viral Gene Finder that uses alignments of a large number of viral genomes to find the reading frames.

Other Projects

I am in a minor way involved in a series of other projects, three of which are:

The measurement of absolute evolutionary rates in viruses, when the viruses have been sequenced at different time points. The main contributors to this project are Roald Forsberg and Anne-Mette Hein.

Metrics on trees based on recombination events. This posed a major problem in RecPars, where a heuristic had to be used, since an exact algorithm was too slow. An overlooked problem in this context is that real recombinations operate on rooted trees, while people often unroot their trees. Thomas Christensen has found an example where the metric on rooted trees is different from the metric on unrooted trees. It involves 2 rooted trees with 9 leaves that are 3 recombination events apart. If the trees are unrooted, their distance is only 2 recombinations.

Beyond these topics, I supervise students in molecular evolution, viral evolution and sequence algorithms. I normally expand into new areas by including new subjects in the courses offered. At present I wish to put more emphasis on:

Metabolic Pathways. I started teaching this in 98.

Expression Data and Modelling Regulatory Networks. I started teaching this in 99.

Gene Mapping and Coalescent Theory.
Teaching

I have started most of the courses that are being taught at AAU in Bioinformatics, Molecular Evolution and Molecular Population Genetics, but I have continuously expanded my repertoire and others have continued or taken over, so the total number of courses being offered by the group is now very large. The courses that I have taught or initiated can be found on the Bioinformatics Research Center pages at Aarhus University.

Software

The software described is all written in C.

TreeAlign - was written by me in the period 85-89. It aligns and finds the phylogeny at the same time for a set of homologous proteins or DNA/RNA sequences.

GenAl - was written by Jens Støvlbæk (92/93) in collaboration with me. It can align pairs of DNA sequences with an arbitrary set of reading frames (including overlapping ones). GenAl is designed to compare homologous viral genomes.

Mulal - was written by Jens Støvlbæk (93/94) in collaboration with me. It made an evolutionary analysis of aligned pairs of viral genomes in terms of transition/transversion bias, nucleotide composition bias and selective constraints. It could handle overlapping reading frames.

RecPars - by me in 90, but a superior version was programmed by Kim Fisker in 94. It tries to find the most parsimonious history of a set of sequences in terms of substitutions and recombinations. It is designed to analyze aligned viral genomes.

Spatial - by me in 95/96, but substantially improved by Mikkel Nygård in 98. It does exactly the same as Hudson's 83 algorithm, but by scanning along sequences instead of going back in time. It will generate recombination-genealogies for sequences sampled from an idealized population. It can also generate data if a substitutional process is specified.

Ancestors - by me in 95/96. Simulates the fate of one sequence going back in time, tracking the number of ancestors and ancestral segments.
This has been used to estimate the number of genetic ancestors to the present human population (with Carsten Wiuf).

Some of this software is publicly available here.
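
The kind of simulation described for Ancestors can be illustrated with a small sketch. This is not the Ancestors program itself (which is written in C) but a minimal Python illustration under assumptions that are mine, not the source's: the sequence has unit length, time is measured in coalescent units, each pair of lineages coalesces at rate 1, and a lineage recombines at rate rho/2 times the span between its leftmost and rightmost ancestral positions. The names simulate_ancestors, split_segments and merge_segments are hypothetical.

```python
import random


def split_segments(segments, x):
    """Split sorted (start, end) segments at breakpoint x into left and right parts."""
    left, right = [], []
    for a, b in segments:
        if b <= x:
            left.append((a, b))
        elif a >= x:
            right.append((a, b))
        else:
            left.append((a, x))
            right.append((x, b))
    return left, right


def merge_segments(seg_a, seg_b):
    """Union of two segment lists, joining segments that touch or overlap."""
    merged = []
    for a, b in sorted(seg_a + seg_b):
        if merged and a <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(b, merged[-1][1]))
        else:
            merged.append((a, b))
    return merged


def simulate_ancestors(rho, t_max, seed=None):
    """Follow one unit-length sequence back in time for t_max time units.

    Returns one entry per ancestor alive at time t_max; each entry is the
    sorted list of (start, end) ancestral segments that ancestor carries.
    Assumed rates: coalescence 1 per pair, recombination rho/2 per unit span.
    """
    rng = random.Random(seed)
    lineages = [[(0.0, 1.0)]]
    t = 0.0
    while True:
        k = len(lineages)
        # Span = distance between the outermost ancestral positions of a lineage.
        spans = [lin[-1][1] - lin[0][0] for lin in lineages]
        rec_rate = 0.5 * rho * sum(spans)
        coal_rate = k * (k - 1) / 2.0
        total = rec_rate + coal_rate
        if total == 0.0:
            break  # a single lineage and no recombination: nothing can happen
        t += rng.expovariate(total)
        if t > t_max:
            break
        if rng.random() * total < rec_rate:
            # Recombination: one lineage splits into two ancestors.
            i = rng.choices(range(k), weights=spans)[0]
            lin = lineages.pop(i)
            x = rng.uniform(lin[0][0], lin[-1][1])
            left, right = split_segments(lin, x)
            lineages.extend(part for part in (left, right) if part)
        else:
            # Coalescence: two lineages find a common ancestor.
            i, j = rng.sample(range(k), 2)
            merged = merge_segments(lineages[i], lineages[j])
            lineages = [lin for n, lin in enumerate(lineages) if n not in (i, j)]
            lineages.append(merged)
    return lineages
```

For example, len(simulate_ancestors(rho=20.0, t_max=2.0, seed=1)) gives the number of ancestors at time 2 in one run, and the total number of segments across the returned lineages gives the number of ancestral segments. Averaging over many seeds would be needed for any estimate.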

