Computational Promoter Analysis of Metazoan α-Globins

Although coding sequences can now be identified and annotated from primary DNA sequence relatively efficiently, it remains a significant challenge to recognise and categorise, by computational analysis alone, the wide range of cis-acting regulatory elements that control gene expression (e.g. promoters, enhancers, locus control regions, boundary elements and silencers). Recently this has become more tractable by comparing orthologous, non-coding sequences from a variety of species separated by a wide evolutionary time scale (~50-500 MYs) (Prakash and Tompa, 2005). It has emerged that many multi-species conserved regulatory elements (MCS-R) contain highly conserved transcription factor (TF) binding sites. However, it is still not clear why these sequences (often as few as 6-10 bases) bind TFs in their chromosomal context, while other apparently identical, yet non-conserved, binding sites do not. This suggests that there is more to these MCS-R elements than meets the eye, and indeed closer inspection shows that the sequences surrounding the conserved TF binding sites may also be relatively well conserved, although not to the same degree as the TF binding sites themselves.
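The contrast between a fully conserved TF binding site and its less strongly conserved flanks can be illustrated with a minimal sketch. It assumes a gap-free toy alignment of four invented sequences built around a GATA-like hexamer; the sequences, the function names (`column_identity`, `site_vs_flank`) and the 5-base flank width are hypothetical choices made for illustration, not data or methods from this study. Per-column identity is used here as a deliberately simple stand-in for a proper conservation score.

```python
from collections import Counter

def column_identity(alignment):
    """Fraction of sequences sharing the most common base at each column."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        counts = Counter(column)
        counts.pop("-", None)  # gap characters never count as conserved
        top = counts.most_common(1)[0][1] if counts else 0
        scores.append(top / len(column))
    return scores

def mean(values):
    return sum(values) / len(values) if values else 0.0

def site_vs_flank(alignment, start, end, flank=5):
    """Mean identity inside the site [start, end) and in the flanks on either side."""
    scores = column_identity(alignment)
    site = scores[start:end]
    flanks = scores[max(0, start - flank):start] + scores[end:end + flank]
    return mean(site), mean(flanks)

# Hypothetical toy alignment: 5-base flank, 6-base site (AGATAA), 5-base flank.
toy_alignment = [
    "ACGTCAGATAAGGTCA",
    "ACGTTAGATAAGGTCC",
    "ACCTCAGATAAGATCA",
    "TCGTCAGATAAGGTGA",
]

site_mean, flank_mean = site_vs_flank(toy_alignment, start=5, end=11)
print(f"site identity:  {site_mean:.2f}")   # 1.00
print(f"flank identity: {flank_mean:.2f}")  # 0.85
```

In this toy case the binding site is identical in every sequence, while the flanks are well conserved but not perfectly so, which is the pattern described above. A real analysis would start from a genuine multi-species alignment and would need to handle gaps and phylogenetic distance explicitly.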