I am a fifth-year PhD student in the Oxford Department of Statistics working on statistical genetics. I am fortunate to be a member of Pier Palamara’s group and am currently investigating methods that infer genealogies for a set of genetic samples. Due to genetic recombination during meiosis, the pattern of these genealogies shifts as one scans across the genome. Inference of these local genealogies provides a rich picture of genetic relatedness and can shed light on biological processes from natural selection to the genetic risk factors underlying disease.
At Oxford, I am supported by the Clarendon Scholarship and am a member of St. John’s College. I am co-supervised by Geoff Nicholls and was previously supervised by Jonathan Marchini. Before coming to Oxford, I worked for two years as a research engineer at DeepMind in London.
Interests. Population Genetics, Machine Learning
Biobank-Scale Ancestral Recombination Graph Inference
In population genetics, the ancestral recombination graph (ARG) captures the history of coalescence, recombination, and mutation events that gives rise to observed genetic data. We developed a method, ARG-Needle, that leverages coalescent modeling for ascertained genotyping array data to infer accurate, biobank-scale ARGs from SNP arrays. We also developed a framework for performing mixed-model association of unobserved variation implied by an inferred ARG. Using these methods, we inferred the ARG of 337,464 individuals in the UK Biobank and performed genealogy-based association of 7 complex traits, both recapitulating associations found through reference-based imputation and detecting complementary ones. As these methods only require SNP array data, we anticipate they will be particularly relevant for populations that are currently undersequenced.
Mathematics of Linear Mixed Models
My first PhD project focused on improving linear mixed model association in genetics. Standard inference under the mixed model does not scale to modern genetic datasets, so my PhD supervisors and I sought to build on past methods like BOLT-LMM and LDpred to further improve the scalability of mixed model association. Although I have put the project on pause, my work in this area led me to write a set of expository notes on the mathematics of linear mixed models. (In progress, last modified February 2020.)
Coconuts and Islanders: A Statistics-First Guide to the Boltzmann Distribution
An arXiv writeup presenting the Boltzmann distribution in what I hope is an accessible and intuitive way. I learned this approach from my father and the notes are dedicated to his memory.
Random Graphs and Giant Components
An R Markdown blog post introducing the Erdős-Rényi random graph and giant component. I tried to build intuition through figures and animations, but have also linked to further reading on random graphs. Done in my free time during my PhD.
Biobank-scale inference of ancestral recombination graphs enables genealogy-based mixed model association of complex traits,
Brian C. Zhang, Arjun Biddanda, and Pier Francesco Palamara
Coconuts and Islanders: A Statistics-First Guide to the Boltzmann Distribution,
The Kinetics Human Action Video Dataset,
Will Kay, João Carreira, Karen Simonyan, Brian Zhang, Chloe Hillier, Sudheendra Vijayanarasimhan, Fabio Viola, Tim Green, Trevor Back, Paul Natsev, Mustafa Suleyman, Andrew Zisserman
Vector-based navigation using grid-like representations in artificial agents,
Andrea Banino, Caswell Barry, Benigno Uria, Charles Blundell, Timothy Lillicrap, Piotr Mirowski, Alexander Pritzel, Martin Chadwick, Thomas Degris, Joseph Modayil, Greg Wayne, Hubert Soyer, Fabio Viola, Brian Zhang, Ross Goroshin, Neil Rabinowitz, Razvan Pascanu, Charlie Beattie, Stig Petersen, Amir Sadik, Stephen Gaffney, Helen King, Koray Kavukcuoglu, Demis Hassabis, Raia Hadsell, Dharshan Kumaran