Newly published in Nature Genetics is the paper “A Genomics England haplotype reference panel and the imputation of the UK Biobank”. The paper is authored by Dr. Sinan Shi of the Department of Statistics, with Professor Simon Myers of the Myers Group (also at the Department of Statistics) and Sir Mark Caulfield of Queen Mary University of London as co-authors.

Genomics England

Genomics England (GEL) was set up by the UK Government’s Department of Health and Social Care in 2013 to deliver the 100,000 Genomes Project. The project is believed to have laid a foundation for personalised medicine and to have helped secure the NHS’s position as one of the most advanced healthcare systems in the world. [1] With the 100,000 Genomes Project, the NHS has become the first national healthcare system to offer whole genome sequencing as part of routine care. [2]

In December 2018, the project completed the sequencing of 100,000 whole genomes from around 85,000 NHS patients affected by rare disease or cancer. Genomics England is now working on integrating genomics into healthcare across the NHS.

The resulting high-coverage sequencing dataset from Genomics England is one of the largest genetic variation resources to have been collected in the UK.[3]

The UK Biobank & Data Sample Size 

For this research, Dr. Shi and Prof. Myers built a haplotype reference panel from the GEL sequencing dataset. The resulting panel was based on 78,195 genomes and was ethnically diverse, with European and South Asian samples most strongly represented.[4] The researchers were then able to assess the panel's accuracy by comparing it with the Haplotype Reference Consortium (HRC) reference panel, which includes data from the 1000 Genomes Project.

Despite the accuracy provided by the reference panel, the current UK Biobank sample size may still be insufficient for detecting genetic effects of moderate size in cancers and rare diseases.[5] This remains the case even with whole genome sequencing.

The GEL phased haplotype reference panel is available on the GEL platform to GEL users and registered researchers, and has been widely used. Additionally, the GEL-imputed UKB data has been adopted as an official UKB imputed data resource.

To read the paper in Nature Genetics, please visit: https://www.nature.com/articles/s41588-024-01868-7