Department of Statistics
The University of Oxford
24-29 St. Giles',
Oxford OX1 3LB

Email: sharp < at >

I am a senior postdoc in Jonathan Marchini's group in the Department of Statistics at the University of Oxford.

I read physics as an undergraduate at the University of Cambridge and subsequently worked for some years as a professional singer before doing my PhD in Machine Learning with Magnus Rattray at The University of Manchester.

My research focuses on developing and applying statistical and computational methods to address important problems in statistical genetics and computational biology. In particular, I am interested in applying ideas from machine learning and computational statistics to help identify the heritable factors and environmental exposures which influence complex traits and human diseases. Recently, this work has included new methods for multivariate association studies (SBAT), improved phasing methods for large sample sizes and moderate- to high-coverage sequencing data, and the detection of SNPs associated with phenotypes primarily through interaction with other genetic variants (GPMM).

More broadly, my interests include many aspects of machine learning, computational statistics and computational biology. I am particularly interested in all aspects of Bayesian inference including the development of improved approximate inference algorithms.


Kevin Sharp, Valentina Iotchkova, Jonathan Marchini: Sparse Bayesian Modelling for Multitrait Genetic Association Studies, 2018
in review.
Lloyd Elliott, Kevin Sharp, Fidel Alfaro-Almagro, Sinan Shi, Gwenaelle Douaud, Karla Miller, Jonathan Marchini, Stephen Smith: Genome-wide association studies of brain imaging phenotypes in UK Biobank,
Nature volume 562, pages 210‐216 (2018) [bioRxiv]
Clare Bycroft, Colin Freeman, Desislava Petkova, Gavin Band, Lloyd T Elliott, Kevin Sharp, Allan Motyer, Damjan Vukcevic, Olivier Delaneau, Jared O'Connell, Adrian Cortes, Samantha Welsh, Gil McVean, Stephen Leslie, Peter Donnelly, Jonathan Marchini: The UK Biobank resource with deep phenotyping and genomic data,
Nature volume 562, pages 203‐209 (2018) [bioRxiv]
K. Sharp, O. Delaneau, J. Marchini: Phasing for medical sequencing using rare variants and large haplotype reference panels,
Bioinformatics Volume 32, Issue 13, 1 July 2016, Pages 1974-1980
Jared O'Connell, Kevin Sharp, Nick Shrine, Louise Wain, Ian Hall, Martin Tobin, Jean-Francois Zagury, Olivier Delaneau, Jonathan Marchini: Haplotype estimation for biobank scale datasets,
Nature Genetics volume 48, pages 817-820 (2016)
The Haplotype Reference Consortium: A reference panel of 64,976 haplotypes for genotype imputation,
Nature Genetics volume 48, pages 1279-1283 (2016)
Kevin Sharp, Wim Wiegerinck, Alejandro Arias-Vasquez, Barbara Franke, Jonathan Marchini, Cornelis A. Albers, & Hilbert J. Kappen (2015): Explaining Missing Heritability Using Gaussian Process Regression,
Jonathan Marchini, Jared O'Connell, Olivier Delaneau, Kevin Sharp, Warren Kretzschmar, Gavin Band, Shane McCarthy, Desislava Petkova, Clare Bycroft, Colin Freeman, Peter Donnelly: UKBiobank Phasing and Imputation Document (2015)
Kevin Sharp and Magnus Rattray: Dense Message Passing for Sparse Principal Component Analysis,
in Y.W. Teh and M. Titterington (Eds.), Proceedings of The Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS) 2010, JMLR: W&CP 9, pp 725-732, Chia Laguna, Sardinia, Italy, May 13-15, 2010.
[pdf] [bibtex]
M. Rattray, O. Stegle, K. Sharp, and J. Winn: Inference algorithms and learning theory for Bayesian sparse factor analysis,
Journal of Physics: Conference Series, 197:012002 (10pp), 2009.
[pdf] [bibtex]



A C++ program for performing Sparse Bayesian Association Testing that can fit both full and sparse multi-trait linear mixed models. It also computes both single phenotype heritability estimates and coheritability matrices for groups of traits.
An executable binary and documentation are available here.

Oxford Phasing Server

A service for accurately phasing sequenced samples using large haplotype reference panels, such as the Haplotype Reference Consortium dataset. The backend of this server implements the phasing algorithm described in Sharp et al. 2016.
The server may be accessed here.


will be available soon as a compiled CUDA/C executable for running on a Linux platform with an Nvidia CUDA-enabled Graphics Processing Unit.


Matlab code implementing the Dense Message Passing algorithm described in the AIStats 2010 paper mentioned above is available here.

This download includes everything required to reproduce the results in the paper for the dense message passing algorithm applied to synthetic data. Please see the README.txt file within the download for instructions.

To reproduce the gene expression data results and the results for the other algorithms you will also need to download data and code from the appropriate websites. Please see the README.txt file within the download for instructions about how to obtain these.

If you have any problems, please email me. I will be happy to help.


Genome-wide association studies of brain structure and function in the ~20,000 UK Biobank participants
Poster talk (ASHG 2018)

Explaining Missing Heritability Using Gaussian Process Regression
(Bertinoro Computational Biology 2014) [Slides]

Dense Message Passing for Sparse Principal Component Analysis
(AIStats 2010). A video of the talk is available at
Dense Message Passing for Sparse Principal Component Analysis
(SuSTaIn workshop: Sparse structures: statistical theory and practice) [Slides]