The University of Oxford
24-29 St. Giles',
Oxford OX1 3LB
Email: sharp < at > stats.ox.ac.uk
I read physics as an undergraduate at the University of Cambridge and subsequently worked for some years as a professional singer before doing my PhD in Machine Learning with Magnus Rattray at The University of Manchester.
My research focuses on developing and applying statistical and computational methods to address important problems in statistical genetics and computational biology. In particular, I am interested in applying ideas from machine learning and computational statistics to help to identify the heritable factors and environmental exposures which influence complex traits and human diseases. Recently, this work has embraced new methods for multivariate association studies (SBAT), improved methods of phasing in the context of large sample sizes and moderate to high coverage sequencing data, and detection of SNPs associated with phenotypes primarily through interaction with other genetic variants (GPMM).
More broadly, my interests include many aspects of machine learning, computational statistics and computational biology. I am particularly interested in all aspects of Bayesian inference including the development of improved approximate inference algorithms.
Nature volume 562, pages 210‐216 (2018) [bioRxiv]
Nature volume 562, pages 203‐209 (2018) [bioRxiv]
Bioinformatics volume 32, issue 13, pages 1974-1980 (2016)
Nature Genetics volume 48, pages 817-820 (2016)
Nature Genetics volume 48, pages 1279-1283 (2016)
in Y.W. Teh and M. Titterington (Eds.), Proceedings of The Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS) 2010, JMLR: W&CP 9, pp. 725-732, Chia Laguna, Sardinia, Italy, May 13-15, 2010.
Journal of Physics: Conference Series, 197:012002 (10pp), 2009.
PhD Thesis [pdf]
SBAT: A C++ program for performing Sparse Bayesian Association Testing that can fit both full and sparse multi-trait linear mixed models. It also computes both single-phenotype heritability estimates and coheritability matrices for groups of traits. An executable binary and documentation are available here.
Oxford Phasing Server: A service for accurately phasing sequenced samples using large haplotype reference panels, such as the Haplotype Reference Consortium dataset. The backend of this server implements the phasing algorithm described in Sharp et al. 2016. The server may be accessed here.
GPMM: Will be available soon as a compiled CUDA/C executable for running on a Linux platform with an Nvidia CUDA-enabled Graphics Processing Unit.
DMP: Matlab code implementing the Dense Message Passing algorithm described in the AIStats 2010 paper mentioned above is available here.
This download includes everything required to reproduce the results in the paper for the dense message passing algorithm applied to synthetic data. Please see the README.txt file within the download for instructions.
To reproduce the gene expression data results and the results for the other algorithms you will also need to download data and code from the appropriate websites. Please see the README.txt file within the download for instructions about how to obtain these.
If you have any problems, please email me. I will be happy to help.
Poster talk (ASHG 2018).
(Bertinoro Computational Biology 2014) [Slides]
(AIStats 2010). A video of the talk is available at videolectures.net
(SuSTaIn workshop: Sparse structures: statistical theory and practice) [Slides]