I am an Associate Professor in the Department of Statistics at the University of Oxford, and a Tutorial Fellow at University College, Oxford. I work on developing methodologies and theoretical foundations for large-scale learning problems.

Before joining the University of Oxford, I was a Lecturer in the Computer Science department at Yale University and a Postdoctoral Associate at the Yale Institute for Network Science, hosted by Sekhar Tatikonda.
I have a Ph.D. in Operations Research and Financial Engineering from Princeton University, where I worked in probability theory under the supervision of Ramon van Handel.

Here is my Curriculum Vitae.

I am interested in the investigation of fundamental principles in **high-dimensional probability**, **statistics** and **optimization** to design computationally efficient and statistically optimal algorithms for machine learning.

I will serve as an Area Chair for COLT 2023.

In 2022, I served as an Area Chair for COLT 2022 and chaired the session on Advanced Theoretical Statistics at the 2022 IMS Annual Meeting in London.

As a Fellow at the Alan Turing Institute in London, I organized the following events:

From June 28 to July 2, 2021, I taught the summer school Mathematics of Machine Learning.

On January 13-14 2020, I co-organized the two-day workshop Statistics and Computation.

On June 11 2018, I co-organized the one-day workshop The Interplay between Statistics and Optimization in Learning.

I am a Co-Investigator for the Imperial-Oxford StatML Centre for Doctoral Training (CDT). I am a member of the Bernoulli Society, Institute of Mathematical Statistics (IMS), and European Laboratory for Learning and Intelligent Systems (ELLIS). I am an alumnus of the Yale Institute for Network Science and Princeton Statistical Laboratory.

**A novel framework for policy mirror descent with general parametrization and linear convergence** (with C. Alfano and R. Yuan). [arXiv]

**Linear convergence for natural policy gradient with log-linear policy parametrization** (with C. Alfano). [arXiv]

**Exponential tail local Rademacher complexity risk bounds without the Bernstein condition** (with V. Kanade and T. Vaškevičius). [arXiv]

**Comparing classes of estimators: When does gradient descent beat ridge regression in linear models?** (with D. Richards and E. Dobriban). [arXiv]

**Nearly minimax-optimal rates for noisy sparse phase retrieval via early-stopped mirror descent** (with F. Wu), Information and Inference: A Journal of the IMA (to appear). [arXiv]

**Implicit regularization in matrix sensing via mirror descent** (with F. Wu), Conference on Neural Information Processing Systems (NeurIPS), vol. 34, pp. 20558-20570, 2021. [proceedings] [arXiv] [code]

**Distributed machine learning with sparse heterogeneous data** (with D. Richards and S. Negahban), Conference on Neural Information Processing Systems (NeurIPS), vol. 34, pp. 18008-18020, 2021. [proceedings] [arXiv]

**On optimal interpolation in linear regression** (with E. Oravkin), Conference on Neural Information Processing Systems (NeurIPS), vol. 34, pp. 29116-29128, 2021. [proceedings] [arXiv] [code]

**Time-independent generalization bounds for SGLD in non-convex settings** (with T. Farghly), Conference on Neural Information Processing Systems (NeurIPS), vol. 34, pp. 19836-19846, 2021. [proceedings] [arXiv]

**Hadamard Wirtinger flow for sparse phase retrieval** (with F. Wu), International Conference on Artificial Intelligence and Statistics (AISTATS), Proceedings of Machine Learning Research (PMLR), vol. 130, pp. 982-990, 2021. Oral presentation. [proceedings] [arXiv] [code]

**A continuous-time mirror descent approach to sparse phase retrieval** (with F. Wu), Conference on Neural Information Processing Systems (NeurIPS), vol. 33, pp. 20192-20203, 2020. Spotlight presentation. [proceedings] [arXiv] [code]

**The statistical complexity of early stopped mirror descent** (with V. Kanade and T. Vaškevičius), Conference on Neural Information Processing Systems (NeurIPS), vol. 33, pp. 253-264, 2020. Spotlight presentation. [proceedings] [arXiv] [code]

**Decentralised learning with random features and distributed gradient descent** (with D. Richards and L. Rosasco), International Conference on Machine Learning (ICML), Proceedings of Machine Learning Research (PMLR), vol. 119, pp. 8105-8115, 2020. [proceedings] [arXiv] [code]

**Graph-dependent implicit regularisation for distributed stochastic subgradient descent** (with D. Richards), Journal of Machine Learning Research (JMLR), vol. 21, no. 34, pp. 1-44, 2020. [journal] [arXiv] [code]

**Implicit regularization for optimal sparse recovery** (with V. Kanade and T. Vaškevičius), Conference on Neural Information Processing Systems (NeurIPS), vol. 32, pp. 2972-2983, 2019. [proceedings] [arXiv] [code]

**Optimal statistical rates for decentralised non-parametric regression with linear speed-up** (with D. Richards), Conference on Neural Information Processing Systems (NeurIPS), vol. 32, pp. 1216-1227, 2019. [proceedings] [arXiv]

**Decentralized cooperative stochastic bandits** (with D. Martínez-Rubio and V. Kanade), Conference on Neural Information Processing Systems (NeurIPS), vol. 32, pp. 4529-4540, 2019. [proceedings] [arXiv] [code]

**Locality in network optimization** (with S. Tatikonda), IEEE Transactions on Control of Network Systems, vol. 6, no. 2, pp. 487-500, 2019. [journal] [arXiv]

**A new approach to Laplacian solvers and flow problems** (with S. Tatikonda), Journal of Machine Learning Research (JMLR), vol. 20, no. 36, pp. 1-37, 2019. [journal] [arXiv]

**Accelerated consensus via Min-Sum Splitting** (with S. Tatikonda), Conference on Neural Information Processing Systems (NIPS), vol. 30, pp. 1374-1384, 2017. [proceedings] [arXiv] [poster]

**Decay of correlation in network flow problems** (with S. Tatikonda), 50th Conference on Information Sciences and Systems (CISS), pp. 169-174, 2016. [proceedings] [pdf]

**Fast mixing for discrete point processes** (with A. Karbasi), 28th Conference on Learning Theory (COLT), pp. 1480-1500, 2015. [proceedings] [arXiv] [poster]

**Can local particle filters beat the curse of dimensionality?** (with R. van Handel), Annals of Applied Probability, vol. 25, no. 5, pp. 2809-2866, 2015. [journal] [arXiv]

**Phase transitions in nonlinear filtering** (with R. van Handel), Electronic Journal of Probability, vol. 20, no. 7, pp. 1-46, 2015. [journal] [arXiv]

**Comparison theorems for Gibbs measures** (with R. van Handel), Journal of Statistical Physics, vol. 157, pp. 234-281, 2014. [journal] [arXiv]

**Nonlinear filtering in high dimension**, Ph.D. thesis, Princeton University, 2014. [pdf]

**Implicit regularization in statistical learning: An overview and some recent results**, Physics of Machine Learning Workshop, Università degli Studi di Padova, Asiago, September 2022.

**Concentration without Bernstein**, Advanced Theoretical Statistics session, IMS Annual Meeting, London, June 2022.

**Sharp Excess Risk Bounds without the Bernstein Condition: An Algorithmic Viewpoint**, BIDSA Seminar Series, Department of Decision Sciences, Bocconi University, April 2022.

**Sharp Excess Risk Bounds without the Bernstein Condition: An Algorithmic Viewpoint**, CDSML Seminar Series, Department of Mathematics, National University of Singapore, March 2022.

**The Statistical Complexity of Early-Stopped Mirror Descent**, Statistical Methods in Machine Learning, Bernoulli-IMS One World Symposium 2020, August 2020. [video]

**The Statistical Complexity of Early-Stopped Mirror Descent**, Probability Seminar, Division of Applied Mathematics, Brown University, May 2020.

**Statistically and Computationally Optimal Estimators for Sparse Recovery and Decentralized Regression**, Adobe Research, San Jose, December 2019.

**Implicit Regularization for Optimal Sparse Recovery**, Information Systems Lab (ISL) Colloquium, Stanford University, December 2019.

**On the Interplay between Statistics, Computation and Communication in Decentralised Learning**, Decision and Control Systems, KTH, October 2019.

**Implicit Regularization for Optimal Sparse Recovery**, Probability and Mathematical Statistics seminar, Department of Mathematics, KTH, October 2019.

**Implicit Regularization for Optimal Sparse Recovery**, London Machine Learning Meetup, September 2019.

**Implicit Regularization for Optimal Sparse Recovery**, Theory, Algorithms and Computations of Modern Learning Systems workshop, DALI/ELLIS, September 2019.

**On the Interplay between Statistics, Computation and Communication in Decentralised Learning**, Optimization and Statistical Learning workshop (OSL 2019), Les Houches School of Physics. [slides]

**On the Interplay between Statistics, Computation and Communication in Decentralised Learning**, School of Mathematics, University of Bristol, March 2019.

**On the Interplay between Statistics, Computation and Communication in Decentralised Learning**, Algorithms & Computationally Intensive Inference Seminar, University of Warwick, February 2019.

**Multi-Agent Learning: Implicit Regularization and Order-Optimal Gossip**, Theory and Algorithms in Data Science, The Alan Turing Institute, August 2018.

**Multi-Agent Learning: Implicit Regularization and Order-Optimal Gossip**, Statistical Scalability Programme, Isaac Newton Institute, June 2018.

**Multi-Agent Learning: Implicit Regularization and Order-Optimal Gossip**, Statistics Seminar Series, Department of Decision Sciences, Bocconi University, May 2018.

**Distributed and Decentralised Learning: Generalisation and Order-Optimal Gossip**, Amazon Berlin, April 2018.

**Locality and Message Passing in Network Optimization**, Workshop on Optimization vs Sampling, The Alan Turing Institute, February 2018.

**Accelerated Consensus via Min-Sum Splitting**, Statistics Seminar, University of Cambridge, November 2017.

**Accelerating message-passing using global information**, OxWaSP Workshop, University of Warwick, October 2017.

**Accelerating message-passing using global information**, StatMathAppli 2017, Statistics Mathematics and Applications, Fréjus, September 2017.

**Accelerated Min-Sum for consensus**, Large-Scale and Distributed Optimization, LCCC Workshop, Lund University, June 2017.

**Message-passing in convex optimization**, WINRS conference, Brown University, March 2017.

**Min-Sum and network flows**, Workshop on Optimization and Inference for Physical Flows on Networks, Banff International Research Station, March 2017.

**Locality and message-passing in network optimization**, DISMA, Politecnico di Torino, January 2017.

**Locality and message-passing in network optimization**, LIDS Seminar Series, MIT, November 2016.

**Locality and message-passing in network optimization**, Probability Seminar, Division of Applied Mathematics, Brown University, November 2016.

**Message-passing in network optimization**, YINS Seminar Series, Yale University, November 2016.

**Tractable Bayesian computation in high-dimensional graphical models**, Mathematical Sciences Department, IBM Thomas J. Watson Research Center, June 2016.

**From sampling to learning submodular functions**, 2016 New England Statistics Symposium (NESS), Yale University, April 2016.

**Scale-free sequential Monte Carlo**, Seminar on particle methods in Statistics, Statistics Department, Harvard University, April 2016.

**Decay of correlation in network flow problems**, 50th Annual Conference on Information Sciences and Systems (CISS 2016), Princeton University, March 2016.

**Locality in network optimization**, INFORMS, Philadelphia, November 2015.

**Local algorithms in high-dimensional models**, Statistics Department, University of Oxford, September 2015.

**Killed random walks and graph Laplacians: local sensitivity in network flow problems**, Yale Probabilistic Networks Group seminar, Statistics Department, Yale University, September 2015.

**Decay of correlation in graphical models; algorithmic perspectives**, School of Computer and Communication Sciences, École Polytechnique Fédérale de Lausanne, August 2015.

**Fast mixing for discrete point processes**, 28th Annual Conference on Learning Theory (COLT 2015), Université Pierre et Marie Curie, July 2015. [poster] [video]

**Filtering compressed signal dynamics in high dimension**, 45th Annual John H. Barrett Memorial Lectures, University of Tennessee, May 2015.

**On the role of the Hessian of submodular functions**, Yale Probabilistic Networks Group seminar, Statistics Department, Yale University, April 2015.

**Submodular functions, from optimization to probability**, Probability Theory and Combinatorial Optimization, The Fuqua School of Business, Duke University, March 2015.

**Estimating conditional distributions in high dimension**, Applied Mathematics seminar, Yale University, October 2014.

**Nonlinear filtering in high dimension**, Yale Probabilistic Networks Group seminar, Statistics Department, Yale University, September 2014.

**Particle filters and curse of dimensionality**, Monte Carlo Inference for Complex Statistical Models workshop, Isaac Newton Institute for Mathematical Sciences, University of Cambridge, April 2014. [slides] [video]

**Particle filters and curse of dimensionality**, Cambridge Machine Learning Group, University of Cambridge, February 2014.

**New phenomena in nonlinear filtering**, Yale Probabilistic Networks Group seminar, Statistics Department, Yale University, February 2014.

**Filtering in high dimension**, Cornell Probability Summer School, Cornell University, July 2013.

At the **University of Oxford**, I regularly organize reading groups on learning theory and statistical optimization:

In Spring 2023, together with Ciara Pike-Burke, I will lead a doctoral course on online learning, bandits, and reinforcement learning. Confirmed speakers are Nicolò Cesa-Bianchi (University of Milan), Tor Lattimore (DeepMind), and Gergely Neu (Universitat Pompeu Fabra).

In 2022, I served as a research supervisor for UNIQ+ DeepMind summer interns, and I will do so again in 2023.

Since 2018, I have been designing and teaching Algorithmic Foundations of Learning, for which I received the 2019 Oxford MPLS Teaching Award.

In Spring 2021, I taught Simulation and Statistical Programming. In Spring 2018, I taught Advanced Simulation Methods.

Since 2017, I have been teaching probability theory, statistics, and graph theory as part of my tutorial duties at University College, Oxford.

At **Yale University**, in Fall 2016 I served as the Head Instructor for CS50 — Introduction to Computing and Programming — taught jointly with Harvard University.
The course was covered by the Yale Daily News. Here is the intro class in Machine Learning and Python, or its VR version.
I was a member of the Yale Postdoctoral Association, and for three years in a row, from 2015 to 2017, I organized the Julia Robinson Mathematics Festival at Yale, a celebration of ideas and problems in mathematics that enable junior high and high school students to explore fun math in a non-competitive setting.

At **Princeton University**, in 2013 I received the Excellence in Teaching Award from the Princeton Engineering Council while serving as head teaching assistant for ORF 309 (Probability and Stochastic Systems). I was also a fellow of the McGraw Center for Teaching and Learning.