CHARMNET: CHARacterising Models for NETworks


For enquiries please email reinert@stats.ox.ac.uk

What is CHARMNET about?

Networks have emerged as a useful tool for representing and analysing complex data sets. Such data sets appear in many contexts: biological networks represent the interplay of agents within a cell, social networks represent interactions between individuals or between social entities such as websites referring to other websites, and trade networks reflect trade relationships between countries.

Due to the complexity of the data which they represent, networks pose considerable obstacles for analysis. Typically, the standard statistical framework of independent observations no longer applies: networks are used to represent the data precisely because the observations are often not independent of each other. While each network can itself be viewed as an observation, independent observations of the whole network are usually not available.

To understand networks, probabilistic models can be employed. The behaviour of networks generated from such models can then be studied with tools from applied probability. Even relatively simple models are challenging to analyse, and more realistic, complex models are often out of reach of a rigorous mathematical treatment.
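As a concrete illustration (ours, not the project's): one of the simplest probabilistic network models is the Erdős–Rényi model G(n, p), in which each of the n(n-1)/2 possible edges is present independently with probability p. A minimal simulation sketch in Python, assuming the networkx library:

    import networkx as nx

    # Sample from the Erdos-Renyi model G(n, p): each pair of the n nodes
    # is joined by an edge independently with probability p.
    n, p = 100, 0.05
    G = nx.gnp_random_graph(n, p, seed=1)

    # One behaviour of interest is the degree distribution: under G(n, p)
    # each degree is Binomial(n - 1, p), with mean (n - 1) * p.
    degrees = [d for _, d in G.degree()]
    print(sum(degrees) / n)  # close to the expected degree (n - 1) * p = 4.95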

Hence, depending on the network behaviour of interest, it may be reasonable to approximate a complex model with a simpler one. Assessing the error in such an approximation is crucial for determining whether the approximation is suitable. This project will derive characterisations of network models which relate to a common underlying process. This common underlying process will then allow models to be compared through their characterisations.
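As a hypothetical example of such an approximation with an explicit error: the degree of a fixed node under the Erdős–Rényi model G(n, p) is Binomial(n - 1, p), and a simpler approximating distribution is a Poisson with the same mean; the total variation distance between the two quantifies the approximation error. A sketch assuming scipy:

    from scipy.stats import binom, poisson

    # Degree of a fixed node in the Erdos-Renyi model G(n, p).
    n, p = 100, 0.05
    lam = (n - 1) * p  # mean of the matching Poisson approximation

    # Total variation distance between Binomial(n - 1, p) and Poisson(lam),
    # summed over a truncated support (the neglected tail is negligible here).
    tv = 0.5 * sum(abs(binom.pmf(k, n - 1, p) - poisson.pmf(k, lam))
                   for k in range(5 * n))
    print(tv)  # small when p is small, in line with the classical Poisson limit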

Based on such comparisons, approximate test procedures can be derived by first using the simpler model to obtain the distribution of the test statistic under the null hypothesis, and then taking the approximation error into account. In practice, for a given data set, a model would be fitted to the data. This fitting process introduces variability which in itself results in some deviations from the model. Using tools from theoretical statistics as well as applied probability, these deviations can again be assessed, with an explicit error term.
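An illustrative sketch of such a procedure (our hypothetical example, with the Erdős–Rényi model as the simple null model and the triangle count as the test statistic; the Monte Carlo step stands in for the explicit approximate null distribution, with error bound, that the project targets):

    import networkx as nx

    def triangle_count(G):
        # nx.triangles counts, per node, the triangles through that node,
        # so every triangle is counted three times.
        return sum(nx.triangles(G).values()) // 3

    def approximate_test(G_obs, n_sim=500, seed=0):
        n = G_obs.number_of_nodes()
        # Fit the simple null model: maximum likelihood estimate of the edge
        # probability. This fitting step introduces variability of its own.
        p_hat = G_obs.number_of_edges() / (n * (n - 1) / 2)
        t_obs = triangle_count(G_obs)
        # Simulate the null distribution of the statistic under the fitted model.
        null = [triangle_count(nx.gnp_random_graph(n, p_hat, seed=seed + i))
                for i in range(n_sim)]
        # One-sided Monte Carlo p-value.
        return (1 + sum(t >= t_obs for t in null)) / (n_sim + 1)

    # The karate club network is far more clustered than the fitted null model,
    # so the p-value is small.
    print(approximate_test(nx.karate_club_graph()))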

What is the plan?

Key questions which this project addresses are:

(1) What is the expected behaviour of complex models for networks? Once the expected behaviour is understood, deviations from it can be exploited to detect anomalies in networks.
(2) How can networks such as infrastructure networks and reporting networks be designed to achieve efficiency and resilience? Understanding the behaviour of models for networks can guide the design of such networks (see the resilience sketch after this list).
(3) How can the interconnectedness of people, things and data be taken into account when drawing statistical conclusions? Tests for assessing models, which could include explanatory variables as parameters, will be tackled in this project.
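To illustrate question (2) with a hypothetical example: a simple resilience measure is the fraction of nodes that remain in the largest connected component after random node failures, which can be compared across candidate designs. A sketch assuming networkx:

    import random
    import networkx as nx

    def surviving_fraction(G, failure_prob, seed=0):
        # Remove each node independently with the given failure probability
        # and report the fraction of nodes left in the largest component.
        rng = random.Random(seed)
        H = G.copy()
        H.remove_nodes_from([v for v in G.nodes if rng.random() < failure_prob])
        if H.number_of_nodes() == 0:
            return 0.0
        largest = max(nx.connected_components(H), key=len)
        return len(largest) / G.number_of_nodes()

    # Compare two candidate designs with the same number of edges:
    # a regular ring lattice and a slightly rewired (small-world) version.
    G_ring = nx.watts_strogatz_graph(100, 4, 0.0, seed=1)
    G_rewired = nx.watts_strogatz_graph(100, 4, 0.1, seed=1)
    for G in (G_ring, G_rewired):
        print(surviving_fraction(G, failure_prob=0.3))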

Reversing the characterisation approach described above, graph neural networks from the area of Artificial Intelligence have been used as anomaly detection methods in networks. These graph neural networks are themselves characterised through their underlying process. The project will exploit the novel observation that this characterisation can then be compared to the characterisations of network models.
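To indicate what such an underlying process looks like: one layer of message passing, the basic mechanism of many graph neural networks, updates each node's features from its neighbours' features. A minimal numpy sketch with neighbourhood averaging (one common choice, not the project's specific architecture):

    import numpy as np

    def message_passing_layer(A, X, W):
        # One round of neighbourhood averaging followed by a linear map and a
        # nonlinearity: the process underlying many graph neural networks.
        deg = A.sum(axis=1, keepdims=True)
        deg[deg == 0] = 1.0          # avoid division by zero at isolated nodes
        H = (A @ X) / deg            # average the neighbours' features
        return np.maximum(H @ W, 0)  # ReLU

    # Tiny example: a path on three nodes with one-dimensional features.
    A = np.array([[0., 1., 0.],
                  [1., 0., 1.],
                  [0., 1., 0.]])
    X = np.array([[1.], [0.], [2.]])
    W = np.array([[1.]])
    print(message_passing_layer(A, X, W))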

This approach will forge a new connection between probability and AI through the analysis of neural networks, and will also provide a theoretical underpinning for network analysis.

Group members

Gholamali Aminian
Stefanos Bennett
Jase Clarkson
Anum Fatima
Yixuan He
Yutong Lu
Anastasia Mantziou
Piotr Sliwa
Tadas Temcinas
Wenkai Xu
Roxanne Zhang

Some publications from our group on Google Scholar

Gesine Reinert
Wenkai Xu
Yixuan He


This is a five-year project funded by EPSRC, grant reference EP/T018445/1. It started on April 1, 2021.