Chapter 23 Meta-Learners
A meta-learner is a generic algorithm that combines flexible, off-the-shelf regression models into an estimator of a causal effect, typically the CATE.
S-learner: (S=‘single’). Learn a (flexible) regression model \(\hat\mu({\boldsymbol x}, a)\) for \(Y \mid {\boldsymbol X}, A\), and estimate \(\mathop{\mathrm{CATE}}({\boldsymbol x}) = \hat\mu({\boldsymbol x}, 1) - \hat\mu({\boldsymbol x}, 0)\).
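A minimal sketch of the S-learner, assuming OLS with a treatment interaction stands in for the flexible regressor; the simulated data and helper functions are illustrative, not from the text:

```python
import numpy as np

def fit_ols(X, y):
    # least-squares fit with an intercept; stands in for any flexible regressor
    Xb = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(Xb, y, rcond=None)[0]

def predict_ols(beta, X):
    return np.column_stack([np.ones(len(X)), X]) @ beta

# illustrative simulated data: Y = X_1 + 2*A + noise, so the true CATE is 2
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
A = rng.integers(0, 2, size=n)
Y = X[:, 0] + 2.0 * A + rng.normal(scale=0.1, size=n)

# S-learner: a single model for Y given (X, A); the X*A interaction lets the
# effect vary with X (a purely additive model would force a constant effect)
feats = np.column_stack([X, A, X * A[:, None]])
beta = fit_ols(feats, Y)

def cate_s(x):
    f1 = np.column_stack([x, np.ones(len(x)), x])       # set A = 1
    f0 = np.column_stack([x, np.zeros(len(x)), 0 * x])  # set A = 0
    return predict_ols(beta, f1) - predict_ols(beta, f0)
```

The interaction term matters in practice: a base learner that cannot represent \(X \times A\) interactions shrinks the estimated effect toward a constant.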
T-learner: (T=‘two’). Learn separate (flexible) regression models \(\hat\mu_0({\boldsymbol x})\) and \(\hat\mu_1({\boldsymbol x})\) for \(Y \mid {\boldsymbol X}, A=0\) and \(Y \mid {\boldsymbol X}, A=1\), and estimate \(\mathop{\mathrm{CATE}}({\boldsymbol x}) = \hat\mu_1({\boldsymbol x}) - \hat\mu_0({\boldsymbol x})\).
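The T-learner can be sketched in the same style; again the OLS base learner and the data-generating process (true CATE \(= 1 + x_1\)) are illustrative assumptions:

```python
import numpy as np

def fit_ols(X, y):
    # least-squares fit with an intercept; stands in for any flexible regressor
    Xb = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(Xb, y, rcond=None)[0]

def predict_ols(beta, X):
    return np.column_stack([np.ones(len(X)), X]) @ beta

# illustrative simulated data with heterogeneous effect: CATE(x) = 1 + x_1
rng = np.random.default_rng(1)
n = 4000
X = rng.normal(size=(n, 2))
A = rng.integers(0, 2, size=n)
Y = X[:, 0] + A * (1 + X[:, 0]) + rng.normal(scale=0.1, size=n)

# T-learner: fit separate regressions on the control and treated subsamples
beta0 = fit_ols(X[A == 0], Y[A == 0])   # model for Y | X, A = 0
beta1 = fit_ols(X[A == 1], Y[A == 1])   # model for Y | X, A = 1

def cate_t(x):
    return predict_ols(beta1, x) - predict_ols(beta0, x)
```

Unlike the S-learner, nothing here forces the two arms to share structure, which can hurt when one arm has few observations; that weakness motivates the X-learner below.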
X-learner: (X=‘cross’). Start as for the T-learner, but then impute individual treatment effects \[\begin{align*} D_i^0 &= \hat\mu_1({\boldsymbol X}_i) - Y_i, & D_i^1 &= Y_i - \hat\mu_0({\boldsymbol X}_i), \end{align*}\] for control and treated individuals respectively. Regress the \(D_i^0\) and \(D_i^1\) on the covariates to obtain separate ITE models \(\hat\tau_0\) and \(\hat\tau_1\), and then write \[\begin{align*} \mathop{\mathrm{CATE}}({\boldsymbol x}) &= g({\boldsymbol x}) \cdot \hat\tau_0({\boldsymbol x}) + (1-g({\boldsymbol x})) \cdot \hat\tau_1({\boldsymbol x}) \end{align*}\] for a weight function \(g({\boldsymbol x}) \in [0,1]\). A good choice is an estimate of the propensity score (Künzel et al. 2019).
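The three stages can be sketched as follows, with an imbalanced design (about 80% treated) of the kind the X-learner was designed for; the OLS models, the linear propensity fit, and the simulated data are illustrative assumptions:

```python
import numpy as np

def fit_ols(X, y):
    # least-squares fit with an intercept; stands in for any flexible regressor
    Xb = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(Xb, y, rcond=None)[0]

def predict_ols(beta, X):
    return np.column_stack([np.ones(len(X)), X]) @ beta

# illustrative data: heavily imbalanced arms, true CATE(x) = 1 + x
rng = np.random.default_rng(2)
n = 4000
X = rng.normal(size=(n, 1))
A = rng.binomial(1, 0.8, size=n)
Y = X[:, 0] + A * (1 + X[:, 0]) + rng.normal(scale=0.1, size=n)

# stage 1: T-learner outcome models
mu0 = fit_ols(X[A == 0], Y[A == 0])
mu1 = fit_ols(X[A == 1], Y[A == 1])

# stage 2: imputed individual effects, then one ITE model per arm
D0 = predict_ols(mu1, X[A == 0]) - Y[A == 0]   # controls
D1 = Y[A == 1] - predict_ols(mu0, X[A == 1])   # treated
tau0 = fit_ols(X[A == 0], D0)
tau1 = fit_ols(X[A == 1], D1)

# stage 3: combine, weighting by a (crude linear) propensity estimate g(x)
g = fit_ols(X, A.astype(float))

def cate_x(x):
    gx = np.clip(predict_ols(g, x), 0.01, 0.99)
    return gx * predict_ols(tau0, x) + (1 - gx) * predict_ols(tau1, x)
```

With \(g\) near 1 the estimate leans on \(\hat\tau_0\), which was fit on the scarce control arm but borrows strength from the well-estimated \(\hat\mu_1\); that is the point of the cross-over construction.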
R-learner: (R=‘residual’, from Robinson (1988); cf. the Frisch-Waugh-Lovell theorem)
First regress both \(A\) and \(Y\) on \({\boldsymbol X}\), and compute residuals \(R_A\), \(R_Y\).
Then regress \(R_Y\) on \(R_A\) using (e.g.) least squares (Nie and Wager 2021).
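The two steps above can be sketched as follows; the OLS nuisance models, the simulated confounded data, and the constant-effect final regression are illustrative assumptions (Nie and Wager allow a flexible effect model via a weighted loss):

```python
import numpy as np

def fit_ols(X, y):
    # least-squares fit with an intercept; stands in for any flexible regressor
    Xb = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(Xb, y, rcond=None)[0]

def predict_ols(beta, X):
    return np.column_stack([np.ones(len(X)), X]) @ beta

# illustrative confounded data: treatment probability depends on X,
# with a constant true effect of 2
rng = np.random.default_rng(3)
n = 4000
X = rng.normal(size=(n, 1))
p = np.clip(0.5 + 0.2 * X[:, 0], 0.05, 0.95)
A = rng.binomial(1, p)
Y = X[:, 0] + 2.0 * A + rng.normal(scale=0.1, size=n)

# step 1: regress both Y and A on X, keep the residuals
m = fit_ols(X, Y)                 # outcome model for E[Y | X]
e = fit_ols(X, A.astype(float))   # propensity model for E[A | X]
R_Y = Y - predict_ols(m, X)
R_A = A - predict_ols(e, X)

# step 2: regress R_Y on R_A; for a constant effect this is a single
# no-intercept least-squares coefficient
tau_hat = (R_A @ R_Y) / (R_A @ R_A)
```

Residualizing removes the part of \(Y\) and \(A\) explained by \({\boldsymbol X}\), so the final regression isolates the treatment effect even though treatment was confounded.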
DR-learner: (DR=‘doubly robust’, Kennedy (2023))
First use part of the sample to estimate the nuisance parameters \(\pi, \mu_0, \mu_1\) (the propensity score and the two outcome regressions).
Now estimate the pseudo-outcome \[\begin{align*} \widetilde{Y} &= \frac{A- \hat{\pi}({\boldsymbol X})}{\hat{\pi}({\boldsymbol X}) \cdot (1-\hat{\pi}({\boldsymbol X}))} \left\{Y - \hat\mu_A({\boldsymbol X})\right\} + \hat\mu_1({\boldsymbol X}) - \hat\mu_0({\boldsymbol X}), \end{align*}\] on the remaining sample, and regress it on the covariates to obtain \(\widehat{\mathbb{E}}[\widetilde{Y} \mid {\boldsymbol X}={\boldsymbol x}]\), which estimates the CATE.
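A minimal sketch of the DR-learner with a single sample split; the OLS nuisance models and simulated confounded data are illustrative assumptions (in practice the roles of the two halves would be swapped and the estimates averaged):

```python
import numpy as np

def fit_ols(X, y):
    # least-squares fit with an intercept; stands in for any flexible regressor
    Xb = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(Xb, y, rcond=None)[0]

def predict_ols(beta, X):
    return np.column_stack([np.ones(len(X)), X]) @ beta

# illustrative confounded data, true CATE(x) = 1 + x
rng = np.random.default_rng(4)
n = 6000
X = rng.normal(size=(n, 1))
p = np.clip(0.5 + 0.2 * X[:, 0], 0.05, 0.95)
A = rng.binomial(1, p)
Y = X[:, 0] + A * (1 + X[:, 0]) + rng.normal(scale=0.1, size=n)

half = n // 2
tr, te = slice(0, half), slice(half, n)   # nuisance half / estimation half

# step 1: nuisance estimates pi, mu_0, mu_1 on the first half
pi = fit_ols(X[tr], A[tr].astype(float))
mu0 = fit_ols(X[tr][A[tr] == 0], Y[tr][A[tr] == 0])
mu1 = fit_ols(X[tr][A[tr] == 1], Y[tr][A[tr] == 1])

# step 2: pseudo-outcome on the second half
pix = np.clip(predict_ols(pi, X[te]), 0.05, 0.95)
m0 = predict_ols(mu0, X[te])
m1 = predict_ols(mu1, X[te])
Ate = A[te]
mA = np.where(Ate == 1, m1, m0)
Ytil = (Ate - pix) / (pix * (1 - pix)) * (Y[te] - mA) + m1 - m0

# step 3: regress the pseudo-outcome on the covariates to estimate the CATE
beta_dr = fit_ols(X[te], Ytil)

def cate_dr(x):
    return predict_ols(beta_dr, x)
```

The pseudo-outcome has mean equal to the CATE if either the propensity model or the outcome models are correct, which is where the ‘doubly robust’ name comes from.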