DPhil in Statistics student
I'm a third-year DPhil student. I've long been interested in AI safety problems like specification gaming, obedience, and user manipulation. Increasingly, I've noticed that causal models can be useful for modelling these problems, and are interesting in their own right. I collaborate with the Causal Incentives Working Group and have previously worked at the Future of Humanity Institute and at DeepMind.
- AI safety
- graphical models of agents, influence diagrams
- structural causal models, latent variable models, causal structure learning