The complex behaviour of single cells such as amoebae and ciliates occurs in the absence of a nervous system and must therefore be based on computational circuits embedded in the cytoplasm. An even greater capacity for computation underlies the biochemical processes of a cell as it grows, divides and undergoes differentiation. It has been demonstrated that small groups of enzymes and other proteins within the cell perform logical operations such as amplification, integration and coincidence detection. Similarly, many aspects of the biochemistry and physiology of living cells can be analyzed by simulating networks of protein-based reactions on a computer. However, living circuitry differs in fundamental respects from silicon devices and has unique features, such as a highly malleable internal architecture and a multitude of molecular states, that we cannot yet reproduce.
Navigating the very extensive data now to be found in a large number of different genomics databases is hindered by extraordinary semantic confusion in the way objects are named. I will discuss the Open Biology Ontology effort, which is introducing structured controlled vocabularies that will both overcome this problem and greatly increase database functionality.
Proteomics should be the most useful of all the levels of functional genomic analysis. However, it is beset with technical difficulties, which mean that it is currently neither a truly high-throughput nor a comprehensive approach. These two lectures will explore both biochemical and genetic approaches to proteome analysis. Problems of data capture and interpretation will be considered, as will the integration of proteomic data with those from transcriptome and metabolome analyses. Finally, likely future trends in proteomics research will be discussed.
Traditionally, the metabolic pathways that occur in any given organism were laboriously mapped by biochemists and microbial physiologists. Increasingly, however, there are organisms for which we have genome sequences but little knowledge of their metabolic biochemistry. To what extent can we recreate the metabolic phenotype from a genome sequence? Detecting the ORFs and assigning an initial annotation turns out to be only the beginning of the problem. In principle, if the annotation were accurate and complete, we would have a list of enzymatic reactions constituting the maximum metabolic network the organism could express, ready for automated analysis. But current databases are not yet capable of delivering a list that is usable without significant human intervention. Some of the reasons for this will be explored, together with methods for detecting inconsistencies and gaps in the genome annotations.
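One simple kind of gap detection can be illustrated with a short sketch. It is not taken from the lecture: the function name, the data layout and the toy reaction list are all assumptions made for illustration. The idea is that a metabolite which is consumed but never produced (or produced but never consumed) inside the network, and which is not exchanged with the environment, often points to a missing or wrong annotation.

```python
def find_dead_end_metabolites(reactions, external=frozenset()):
    """Flag metabolites that are only produced or only consumed.

    reactions: dict mapping a reaction id to (substrates, products, reversible).
    external:  metabolites assumed to cross the system boundary (hypothetical).
    Returns (never_produced, never_consumed) as sorted lists.
    """
    produced, consumed = set(), set()
    for substrates, products, reversible in reactions.values():
        consumed.update(substrates)
        produced.update(products)
        if reversible:
            # A reversible reaction can run in either direction.
            consumed.update(products)
            produced.update(substrates)
    all_metabolites = produced | consumed
    never_produced = sorted(all_metabolites - produced - set(external))
    never_consumed = sorted(all_metabolites - consumed - set(external))
    return never_produced, never_consumed


# Toy reaction list (illustrative only, not real annotation data).
# The step converting f6p to fbp is deliberately left out.
toy_reactions = {
    "R1": (["glucose"], ["g6p"], False),
    "R2": (["g6p"], ["f6p"], False),
    "R3": (["fbp"], ["dhap", "g3p"], False),
}
toy_external = {"glucose", "dhap", "g3p"}

never_produced, never_consumed = find_dead_end_metabolites(toy_reactions, toy_external)
print("consumed but never produced:", never_produced)
print("produced but never consumed:", never_consumed)
```

In this toy network the two flagged metabolites sit on either side of the omitted step, which is the kind of hint a curator would follow up by searching for an unannotated enzyme. Real reconstructions need much more than this (stoichiometry, compartments, cofactors), so the sketch only shows the basic bookkeeping.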
Knowledge of structures is closer to functional biology and to an understanding of biological systems than knowledge of DNA and protein sequences, and is thus in some sense of greater value. Information on structures used to lag far behind sequence information. In recent years, however, technological advances and the advent of large-scale structure determination projects have radically changed this, and there is now a large and growing set of known structures. This is a major resource for the biosciences, but also an opportunity for answering structural, functional and evolutionary questions. Why are certain folds observed in nature, while others are absent? How is function described and characterized for structures determined by large-scale projects? How are classical issues from sequence analysis, such as homology testing and statistical modelling of evolution, carried over to the much more complex data of structures?
Molecular dynamics simulations provide insights into the conformational dynamics of proteins on a 10 to 50 ns timescale. The uses of such simulations will be illustrated via their application to membrane proteins. As MD simulations become more widespread, there is a need to develop improved tools for storage, interrogation and analysis of simulation data. The BioSimGRID project (www.biosimgrid.org) aims to provide such tools, making simulation results available to a wider biological community.
Since the discovery of the structure of DNA in 1953, it has been obvious that having the complete DNA sequence of an organism provides a foundation for understanding biology. Just fifty years later, we are able to look up the sequences of man (approximately 3 billion bases) and many other species on the internet and use them as the basis for experiments to understand biological processes and evolution. In this lecture I will describe some of the key developments that have brought about the revolution in sequence data generation and distribution, and discuss the prospects for the future.
The measurement of mRNA concentration has over the last decade risen to prominence in molecular biology and medicine. It is used both as a diagnostic tool to classify cells, since different cell types and cancers have characteristic levels of gene expression, and as a means to probe the dynamics of the cell under different circumstances. However, expression data are also hard to analyze in a statistically rigorous way: they are high-dimensional, highly correlated and noisy. The lecture will trace the experimental techniques, the statistical methods and key biological results, with a view toward the future.