Chapter 16 Causal biases

Here we give a brief description of each of the main types of bias that you may encounter when performing a causal analysis.

16.1 Confounding bias

Confounding bias is sometimes called omitted variable bias by econometricians. If a pre-treatment variable that is related to both the treatment and the outcome is not included in the analysis, its omission creates a spurious dependence between treatment and outcome. This is the most common and problematic form of bias in causal studies using observational data, and perhaps best explains the famous statement ‘correlation does not imply causation’.
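A small simulation can make this concrete. The sketch below is illustrative only: the variable names (`z`, `a`, `y`), the linear Gaussian model, and the unit coefficients are all assumptions chosen for simplicity. The confounder `z` drives both the treatment `a` and the outcome `y`, while `a` has no effect on `y` at all. Under this model the naive regression slope of `y` on `a` is about 0.5 in expectation, whereas adjusting for `z` (here by residualising both variables on `z`) brings it back to roughly zero.

```python
import random


def slope(x, y):
    """Least-squares slope from regressing y on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def residuals(x, y):
    """Residuals of y after regressing it on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = slope(x, y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 20000

# Assumed model: z is a pre-treatment confounder; a has NO effect on y.
z = [rng.gauss(0, 1) for _ in range(n)]
a = [zi + rng.gauss(0, 1) for zi in z]  # treatment depends on z
y = [zi + rng.gauss(0, 1) for zi in z]  # outcome depends on z only

# Omitting z: a and y look related even though a does not cause y.
naive = slope(a, y)

# Adjusting for z: regress the z-residuals of y on the z-residuals of a.
adjusted = slope(residuals(z, a), residuals(z, y))

print(f"naive slope:    {naive:.3f}")
print(f"adjusted slope: {adjusted:.3f}")
```

Residualising on `z` gives the same coefficient as including `z` in a multiple regression; it is used here only to keep the example free of external libraries.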

16.2 Selection bias

Not far behind confounding in how common and problematic it is, selection bias occurs if one chooses subjects in a manner that is related to their outcome (and perhaps also their treatment status). Sometimes this is deliberate: for example, in the case of a rare disease it is generally impractical to run a prospective study, so a case-control study, in which patients are selected on the basis of their disease status, is very common.

Figure 16.1: Graphical illustration of selection bias. In this case, if we sample students from US colleges, then we will only interview those who are currently enrolled. In order to reach this state, they must either have good grades or be good enough at sport to win a scholarship. Even if these two attributes are independent in the general population, this means that they will appear to be negatively correlated in our sample.
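The college example can be checked with a short simulation. This is a sketch under stated assumptions: grades and sporting ability are taken to be independent standard normal scores, and a student is enrolled if either score exceeds 1. Both the distributions and the threshold are hypothetical choices made for illustration. The correlation is close to zero in the full population but clearly negative among enrolled students.

```python
import random


def correlation(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(1)  # fixed seed so the sketch is reproducible
n = 50000

# Assumed model: the two attributes are independent in the population.
grades = [rng.gauss(0, 1) for _ in range(n)]
sport = [rng.gauss(0, 1) for _ in range(n)]

# Selection: only students with good grades OR good sporting ability enrol.
# The threshold of 1 is an arbitrary illustrative choice.
enrolled = [(g, s) for g, s in zip(grades, sport) if g > 1 or s > 1]

corr_population = correlation(grades, sport)
corr_enrolled = correlation([g for g, _ in enrolled],
                            [s for _, s in enrolled])

print(f"population correlation: {corr_population:.3f}")
print(f"enrolled correlation:   {corr_enrolled:.3f}")
```

The negative correlation arises because, among enrolled students, learning that someone has poor grades implies they must have qualified through sport, and vice versa.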

16.3 Bias due to conditioning on a mediator

As well as omitted-variable bias, it is possible to induce bias by controlling for a variable that actually mediates the causal relationship of interest. In Figure 16.2(i), for example, conditioning on \(C\) would entirely block the relationship between \(A\) and \(Y\).

Conversely, conditioning on a mediator may open a spurious path when there is actually no causal relationship at all. This is illustrated in Figure 16.2(ii), where adjusting for \(C\) opens up the path and induces a non-causal association between \(A\) and \(Y\).

Figure 16.2: Graphical illustrations of the danger of conditioning on a mediator: in (i) conditioning on a mediator blocks the entire causal effect; in (ii) there is no causal effect, but conditioning on a mediator opens up a spurious path.
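Both situations can be illustrated by simulation. The sketch below uses linear Gaussian models with unit coefficients, which are assumptions made for simplicity. For case (i) it assumes the chain \(A \to C \to Y\). The figure is not reproduced here, so for case (ii) it assumes one graph consistent with the description: \(A \to C \leftarrow U \to Y\), where \(U\) is an unobserved variable (a hypothetical name). In case (i) the unadjusted slope recovers the total effect (about 1) and adjusting for \(C\) removes it; in case (ii) the unadjusted slope is about 0 and adjusting for \(C\) produces a clearly non-zero slope (about \(-0.5\) in expectation under this model).

```python
import random


def slope(x, y):
    """Least-squares slope from regressing y on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def residuals(x, y):
    """Residuals of y after regressing it on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = slope(x, y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]


def adjusted_slope(a, y, c):
    """Coefficient of a when regressing y on a and c (via residualising)."""
    return slope(residuals(c, a), residuals(c, y))


rng = random.Random(2)  # fixed seed so the sketch is reproducible
n = 20000
a = [rng.gauss(0, 1) for _ in range(n)]

# Case (i), assumed chain A -> C -> Y: the whole effect of A runs through C.
c1 = [ai + rng.gauss(0, 1) for ai in a]
y1 = [ci + rng.gauss(0, 1) for ci in c1]
total_i = slope(a, y1)                  # total causal effect, about 1
blocked_i = adjusted_slope(a, y1, c1)   # about 0: the effect is blocked

# Case (ii), assumed graph A -> C <- U -> Y with U unobserved:
# A has no effect on Y, but C depends on both A and U.
u = [rng.gauss(0, 1) for _ in range(n)]
c2 = [ai + ui + rng.gauss(0, 1) for ai, ui in zip(a, u)]
y2 = [ui + rng.gauss(0, 1) for ui in u]
null_ii = slope(a, y2)                  # about 0: no causal effect
spurious_ii = adjusted_slope(a, y2, c2) # non-zero: spurious path opened

print(f"(i)  unadjusted {total_i:.3f}, adjusted for C {blocked_i:.3f}")
print(f"(ii) unadjusted {null_ii:.3f}, adjusted for C {spurious_ii:.3f}")
```

In both cases the unadjusted analysis gives the right answer for the total effect, and it is the adjustment for the post-treatment variable that introduces the bias.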