Randomised controlled trials, designed right

Part 6 — Causal inference for researchers

Learning objectives

Apply random assignment to break the dependence between T and (Y(0), Y(1))
Use blocked (stratified) randomisation to guarantee covariate balance
Estimate the ATE as a simple difference in means and report its SE
Apply Lin (2013) regression adjustment for variance reduction without introducing asymptotic bias
Pre-register the primary analysis to prevent specification searches
Identify common RCT pathologies: non-compliance, attrition, contamination, Hawthorne effect, outcome switching

The randomised controlled trial is the gold standard for causal inference. Random assignment severs the dependence between treatment and unmeasured confounders — the property that turns observed association into the ATE. But the gold standard is not the easy standard. Real RCTs face design choices, finite-sample randomness, and a roster of pathologies that can wreck a trial that gets the randomisation right. §6.2 covers the design and analysis honestly.

Why randomisation identifies the ATE

If treatment $T_i$ is assigned by a coin flip (or any fair mechanism independent of unit characteristics), then $T_i$ is statistically independent of $(Y_i(0), Y_i(1))$. Under independence:

E[Y_i \mid T_i = 1] - E[Y_i \mid T_i = 0] = E[Y_i(1) \mid T_i = 1] - E[Y_i(0) \mid T_i = 0] = E[Y_i(1)] - E[Y_i(0)] = \tau_{\text{ATE}}.

The observed difference in means equals the ATE up to sampling noise. No modelling assumption about confounders. No back-door criterion to satisfy. The coin flip does the work.
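
A small simulation can make the identification argument concrete. The sketch below assumes a made-up data-generating process (an unmeasured confounder `u`, a constant effect of 5, and the names `diff_in_means`, `t_random`, `t_selected`, none of which come from the text) and compares coin-flip assignment with self-selected assignment.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
true_ate = 5.0

# Hypothetical data-generating process: u is an unmeasured confounder.
u = rng.normal(size=n)
y0 = 10.0 + 3.0 * u + rng.normal(size=n)  # potential outcome under control
y1 = y0 + true_ate                         # potential outcome under treatment


def diff_in_means(t, y0, y1):
    """Observed difference in means: each unit reveals only Y(T)."""
    y_obs = np.where(t == 1, y1, y0)
    return y_obs[t == 1].mean() - y_obs[t == 0].mean()


# Coin-flip assignment: T is independent of (Y(0), Y(1)).
t_random = rng.integers(0, 2, size=n)

# Self-selection: units with high u are more likely to take treatment,
# so T depends on the potential outcomes through u.
t_selected = (u + rng.normal(size=n) > 0).astype(int)

est_random = diff_in_means(t_random, y0, y1)
est_selected = diff_in_means(t_selected, y0, y1)
print(f"randomised:    {est_random:.2f}")
print(f"self-selected: {est_selected:.2f}")
```

Under these assumptions the randomised estimate should land near 5, while the self-selected estimate is pushed upward because treated units start from a higher baseline.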

The simple ATE estimator

For $N_T$ treated units and $N_C$ controls:

\hat{\tau} = \frac{1}{N_T} \sum_{i: T_i = 1} Y_i - \frac{1}{N_C} \sum_{i: T_i = 0} Y_i.

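This is an unbiased estimator of the ATE. Its standard error comes from the two-sample variance formula, $\sqrt{s_T^2 / N_T + s_C^2 / N_C}$ with $s_T^2$ and $s_C^2$ the within-arm sample variances, or from its bootstrap analogue. For a single binary outcome, this is the proportion test from §2.3.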
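
As a minimal sketch, the estimator and its standard error can be computed as follows. The function name `difference_in_means` and the eight data points are invented for illustration; the standard error uses the unequal-variance two-sample formula with within-arm sample variances.

```python
import numpy as np


def difference_in_means(y, t):
    """Return (tau_hat, se) for outcome y and binary assignment t."""
    y = np.asarray(y, dtype=float)
    t = np.asarray(t)
    y_t, y_c = y[t == 1], y[t == 0]
    tau_hat = y_t.mean() - y_c.mean()
    # Two-sample variance formula: s_T^2 / N_T + s_C^2 / N_C,
    # with ddof=1 giving the sample variances.
    se = np.sqrt(y_t.var(ddof=1) / y_t.size + y_c.var(ddof=1) / y_c.size)
    return tau_hat, se


# Made-up outcomes: first four units treated, last four control.
y = [12.0, 15.0, 11.0, 14.0, 9.0, 8.0, 10.0, 7.0]
t = [1, 1, 1, 1, 0, 0, 0, 0]
tau_hat, se = difference_in_means(y, t)
print(f"tau_hat = {tau_hat:.2f}, SE = {se:.2f}")  # tau_hat = 4.50, SE = 1.12
```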

Block (stratified) randomisation

Pure simple randomisation can give imbalanced groups by chance — especially with small N. If 60% of the treated arm but 40% of the control arm happens to be female, the male/female imbalance adds noise (and creates suspicion of "manipulated randomisation" even when it's honest).

BLOCKED randomisation: stratify on key pre-treatment covariates (sex, age band, baseline severity) before randomising, then randomise WITHIN each stratum. This guarantees balance on the blocked variables, though not on covariates left out of the blocking. Stratified schemes are standard in modern RCTs; high-quality trials rarely rely on unstratified randomisation.
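
One way to implement the within-stratum step is sketched below. The function name `block_randomise` and the rule for odd-sized strata (the extra unit goes to a randomly chosen arm) are assumptions for illustration, not a prescribed scheme.

```python
import numpy as np


def block_randomise(strata, rng):
    """Assign treatment (1) or control (0) by complete randomisation within each stratum."""
    strata = np.asarray(strata)
    t = np.zeros(strata.size, dtype=int)
    for s in np.unique(strata):
        idx = np.flatnonzero(strata == s)
        n_treat = idx.size // 2
        # Assumption: in an odd-sized stratum the extra unit goes to a random arm.
        if idx.size % 2 == 1:
            n_treat += int(rng.integers(0, 2))
        treated = rng.choice(idx, size=n_treat, replace=False)
        t[treated] = 1
    return t


rng = np.random.default_rng(1)
sex = np.array(["F"] * 20 + ["M"] * 20)
t = block_randomise(sex, rng)
for s in ("F", "M"):
    print(s, "treated:", t[sex == s].sum(), "control:", (1 - t[sex == s]).sum())
```

With even stratum sizes, each arm receives exactly half of every stratum, whatever the seed.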

Lin's (2013) regression adjustment

For variance reduction, regress the outcome on the treatment AND pre-treatment covariates, including treatment × covariate interactions:

Y_i = \beta_0 + \beta_1 T_i + \boldsymbol{\gamma}^\top (\mathbf{X}_i - \bar{\mathbf{X}}) + \boldsymbol{\delta}^\top (\mathbf{X}_i - \bar{\mathbf{X}}) T_i + \varepsilon_i.

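The coefficient $\beta_1$ is the regression-adjusted ATE estimator. Under randomisation it is CONSISTENT for the ATE regardless of how good the covariate model is; any finite-sample bias is of order $1/N$ and vanishes as the sample grows. Asymptotically, its variance is no larger than that of the simple difference in means. This robustness to model mis-specification is Lin's key contribution, written in response to Freedman's 2008 critique of regression adjustment in RCTs. Lin pairs the estimator with heteroskedasticity-robust (sandwich) standard errors.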
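
The fully interacted regression can be sketched with ordinary least squares in NumPy. The helper names `lin_estimator` and `simulate`, and the data-generating process (one covariate with a strong linear effect, a true ATE of 5), are assumptions for illustration. The sketch returns the point estimate only and does not compute robust standard errors.

```python
import numpy as np


def lin_estimator(y, t, x):
    """OLS of y on [1, T, Xc, T * Xc], with Xc centred at the full-sample mean.

    Returns the coefficient on T, the regression-adjusted ATE estimate.
    """
    y = np.asarray(y, dtype=float)
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    xc = x - x.mean(axis=0)  # centring makes the T coefficient the ATE
    design = np.column_stack([np.ones_like(y), t, xc, xc * t[:, None]])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1]


def simulate(rng, n=200, true_ate=5.0):
    """Hypothetical trial: one prognostic covariate, half the units treated."""
    x = rng.normal(size=n)
    t = rng.permutation(np.repeat([0, 1], n // 2))
    y = 10.0 + 4.0 * x + true_ate * t + rng.normal(size=n)
    return y, t, x


rng = np.random.default_rng(42)
simple, adjusted = [], []
for _ in range(500):
    y, t, x = simulate(rng)
    simple.append(y[t == 1].mean() - y[t == 0].mean())
    adjusted.append(lin_estimator(y, t, x))

print(f"simple:   mean {np.mean(simple):.2f}, sd {np.std(simple):.2f}")
print(f"adjusted: mean {np.mean(adjusted):.2f}, sd {np.std(adjusted):.2f}")
```

Both estimators should centre on the true effect across replications; the adjusted one should show a visibly smaller spread because the covariate explains much of the outcome variance in this setup.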

Pre-registration

The biggest threat to RCT credibility is post-hoc analysis: deciding the primary outcome AFTER seeing the data. Pre-registration locks the analysis plan BEFORE anyone looks at results: primary outcome, statistical test, multiple-comparison correction, subgroup analyses, missing-data handling. Many major journals and funding bodies now require it.

Common RCT pathologies

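Non-compliance: treated subjects don't take the treatment; controls cross over and obtain it. Intent-to-treat (ITT) analysis analyses units according to ASSIGNED treatment, regardless of compliance. It preserves randomisation's benefits but estimates the effect of assignment, which is typically diluted relative to the effect of actually receiving treatment. To recover the effect of treatment among compliers (the complier average causal effect), use the IV machinery of §6.5 with randomised assignment as the instrument. A naive per-protocol comparison of those who happened to comply breaks the randomisation.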
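Attrition: outcomes missing for some units. If missingness is independent of the potential outcomes (missing completely at random), the complete-case estimate stays unbiased and only loses precision. If dropout is related to the outcome, particularly when it differs between arms, the estimate is biased.
Contamination: control units inadvertently receive treatment via spillover. Randomising whole clusters (clinics, schools, villages) rather than individuals keeps treated and control units apart.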
Hawthorne effect: knowing you're in a study changes behaviour. Mitigated by blinding (single-blind: subjects don't know; double-blind: subjects AND researchers don't know).
Outcome switching: changing the primary outcome after seeing data. Prevented by pre-registration.
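
The non-compliance item can be illustrated with a short simulation. This is a sketch under simplifying assumptions that are not in the text: one-sided non-compliance (60% compliers, 40% never-takers, no control crossover), a constant treatment effect of 5, and a baseline that differs between compliers and never-takers. It compares the ITT estimate, the ratio (Wald) form of the IV estimate, and a naive comparison by treatment received.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 20_000
effect_of_treatment = 5.0  # hypothetical effect of actually taking treatment

z = rng.integers(0, 2, size=n)   # randomised assignment
complier = rng.random(n) < 0.6   # assumption: 60% compliers, 40% never-takers
d = z * complier                 # treatment actually received

# Compliers have a higher baseline, so comparing by treatment received is confounded.
y = 10.0 + 3.0 * complier + effect_of_treatment * d + rng.normal(size=n)

itt = y[z == 1].mean() - y[z == 0].mean()         # effect of assignment
uptake = d[z == 1].mean() - d[z == 0].mean()      # first stage: share of compliers
cace = itt / uptake                               # Wald / IV estimate for compliers
as_treated = y[d == 1].mean() - y[d == 0].mean()  # naive, breaks randomisation

print(f"ITT:        {itt:.2f}")
print(f"uptake:     {uptake:.2f}")
print(f"IV (CACE):  {cace:.2f}")
print(f"as-treated: {as_treated:.2f}")
```

In this setup the ITT estimate should sit near 0.6 × 5 = 3, the IV estimate near 5, and the as-treated comparison above 5 because it mixes the treatment effect with the baseline difference between compliers and never-takers.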

Try it

Start with N = 40, simple randomisation, true ATE = 5. Hit re-randomise several times. Note that stratum-A fractions in the two arms DRIFT — some seeds give 0.65 / 0.35 imbalance, others give 0.45 / 0.55. The imbalance ranges widely in small N.
Switch to block randomisation. Re-randomise several times. Stratum-A fractions are now EXACTLY 0.5 / 0.5 in both arms (within rounding). Blocking eliminates chance imbalance on the blocked variable.
With block randomisation and N = 40, the ATE estimate is typically closer to the true ATE than under simple randomisation. Both designs are unbiased, but the imbalance noise inflates the variance of the simple design. Cycle through re-randomisations and confirm.
Crank N to 200 under simple randomisation. The imbalance shrinks (law of large numbers). The benefit of blocking is greatest in small samples.
Set true ATE = 0. Under simple randomisation at N = 20, observed effect can be ±3 just from imbalance. Under blocked: closer to 0. Without blocking and with small N, you risk reporting a spurious effect — exactly why blocking matters for high-stakes trials.
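
For readers without the interactive widget, the comparison can be approximated offline. The sketch below is not the widget's code: the stratum effect of 6, the noise level, and the helper names `simple_assign`, `blocked_assign`, and `estimate` are all assumptions chosen to mimic the N = 40, true ATE = 0 setting.

```python
import numpy as np


def simple_assign(n, rng):
    """Independent coin flip per unit (arm sizes may differ)."""
    return rng.integers(0, 2, size=n)


def blocked_assign(stratum, rng):
    """Treat exactly half of each stratum (assumes even stratum sizes)."""
    t = np.zeros(stratum.size, dtype=int)
    for s in np.unique(stratum):
        idx = np.flatnonzero(stratum == s)
        t[rng.choice(idx, size=idx.size // 2, replace=False)] = 1
    return t


def estimate(y0, t, true_ate):
    """Difference in means given baseline outcomes and a constant effect."""
    y = y0 + true_ate * t
    return y[t == 1].mean() - y[t == 0].mean()


rng = np.random.default_rng(3)
n, true_ate, reps = 40, 0.0, 2000
stratum = np.repeat([0, 1], n // 2)  # 1 = stratum A, 0 = stratum B
simple_est, blocked_est = [], []
for _ in range(reps):
    # Assumption: stratum membership shifts the baseline outcome by 6.
    y0 = 10.0 + 6.0 * stratum + rng.normal(size=n)
    t_s = simple_assign(n, rng)
    if 0 < t_s.sum() < n:  # skip the (very rare) draw with an empty arm
        simple_est.append(estimate(y0, t_s, true_ate))
    blocked_est.append(estimate(y0, blocked_assign(stratum, rng), true_ate))

print(f"simple:  mean {np.mean(simple_est):.2f}, sd {np.std(simple_est):.2f}")
print(f"blocked: mean {np.mean(blocked_est):.2f}, sd {np.std(blocked_est):.2f}")
```

Both designs should average close to zero; the difference shows up in the spread, which is where a single unlucky simple randomisation can produce a spurious-looking effect.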

A drug trial randomises 40 patients but gets unlucky: 14 of the 20 treated are male, only 6 of 20 controls are male. Sex affects baseline outcome heavily. Would Lin (2013) regression adjustment INCLUDING sex as a covariate fix this, or do you need to re-randomise?

What you now know

The RCT is the canonical causal-inference design. Randomisation breaks the link between treatment and confounders; the simple difference-in-means is an unbiased ATE estimator. Blocked randomisation guarantees balance on the blocked covariates and reduces variance, especially in small samples. Lin (2013) regression adjustment further reduces variance without introducing asymptotic bias. Pre-registration is the standard guard against specification searches. Non-compliance, attrition, contamination, the Hawthorne effect, and outcome switching are the practical pathologies, and each has a known mitigation. §6.3 turns to OBSERVATIONAL data, where the coin flip isn't available and identification requires assumptions about confounders.

References

Fisher, R.A. (1925). Statistical Methods for Research Workers. Edinburgh: Oliver and Boyd. (The foundational RCT-analysis text; introduced randomisation as the basis of inference.)
Lin, W. (2013). "Agnostic notes on regression adjustments to experimental data: Reexamining Freedman's critique." Annals of Applied Statistics 7(1), 295–318. (The modern justification for covariate-adjusted RCT analysis.)
Athey, S., Imbens, G.W. (2017). "The econometrics of randomized experiments." In Handbook of Economic Field Experiments Vol. 1, Elsevier. (Comprehensive modern survey of RCT design and analysis.)
Imbens, G.W., Rubin, D.B. (2015). Causal Inference for Statistics, Social, and Biomedical Sciences. Cambridge University Press. (Chapters 6–7 develop the RCT analysis machinery with full Neyman-Rubin formalism.)
Cox, D.R. (1958). Planning of Experiments. New York: Wiley. (Classic treatise on experimental design including blocking, randomisation schemes, factorial designs.)