Instrumental variables

Part 6 — Causal inference for researchers

Learning objectives

Define an INSTRUMENT Z and state the three required conditions: relevance, exogeneity, exclusion
Apply the Wald estimator: Cov(Y, Z) / Cov(T, Z)
Apply two-stage least squares (2SLS) for IV regression with covariates
Diagnose WEAK INSTRUMENTS via the first-stage F-statistic (rule of thumb F > 10; Stock-Yogo critical values give formal thresholds)
Recognise IV as the canonical observational tool when unobserved confounding is suspected

Propensity-score methods (§6.4) require all confounders to be observed. When you suspect an UNOBSERVED confounder U that affects both T and Y, no back-door adjustment recovers the true ATE. The instrumental-variable approach tackles this head-on: find a variable Z that affects T but is otherwise unrelated to Y — a "natural experiment" embedded in observational data — and use Z's variation to identify the causal effect.

The three IV conditions

Relevance: Z is correlated with T (i.e., Cov(Z, T) ≠ 0). Empirically testable: regress T on Z; the F-statistic measures relevance strength.
Exclusion: Z affects Y ONLY through T. No direct arrow Z → Y, no third pathway. Not testable from data; requires substantive defence (knowledge of the data-generating process).
Exogeneity: Z is independent of the unobserved confounder U. Sometimes called "the instrument is as good as random". Also not directly testable.

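If all three hold, together with monotonicity (Z never pushes any unit's T in the opposite direction, i.e. there are no "defiers"), the IV estimator identifies the LATE: the LOCAL AVERAGE TREATMENT EFFECT for the "compliers" (units whose T responds to Z). When the treatment effect is the same for every unit, the LATE coincides with the ATE.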
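A short derivation shows where each condition enters. It assumes a linear, constant-effect structural model, which is a simplification made here for illustration; the coefficient $\gamma$ and the noise term $\varepsilon$ (taken to be uncorrelated with Z) are introduced only for this sketch.

```latex
\begin{aligned}
Y &= \tau T + \gamma U + \varepsilon
  && \text{(exclusion: } Z \text{ does not appear in the outcome equation)} \\
\mathrm{Cov}(Y, Z) &= \tau\,\mathrm{Cov}(T, Z) + \gamma\,\mathrm{Cov}(U, Z) + \mathrm{Cov}(\varepsilon, Z) \\
&= \tau\,\mathrm{Cov}(T, Z)
  && \text{(exogeneity: } \mathrm{Cov}(U, Z) = 0\text{)} \\
\tau &= \frac{\mathrm{Cov}(Y, Z)}{\mathrm{Cov}(T, Z)}
  && \text{(relevance: } \mathrm{Cov}(T, Z) \neq 0\text{)}
\end{aligned}
```

If exclusion failed, a direct term in Z would survive in the second line; if exogeneity failed, the $\gamma\,\mathrm{Cov}(U, Z)$ term would; if relevance failed, the final division would be undefined.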

The Wald estimator

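For a binary instrument Z (T and Y may be binary or continuous), the covariance ratio reduces to a ratio of differences in means:

$\hat{\tau}_{\text{IV}} = \frac{\mathrm{Cov}(Y, Z)}{\mathrm{Cov}(T, Z)} = \frac{E[Y \mid Z=1] - E[Y \mid Z=0]}{E[T \mid Z=1] - E[T \mid Z=0]}.$

The covariance form Cov(Y, Z) / Cov(T, Z) also applies when Z is continuous; the difference-in-means form requires binary Z.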

The intuition: Z's effect on Y (numerator) is mediated entirely through Z's effect on T (denominator, since exclusion rules out other paths). Dividing gives the per-unit-T effect on Y, which IS τ.
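The Wald estimator can be sketched in a few lines of NumPy. The data-generating process below is hypothetical (the coefficients, sample size and seed are assumptions chosen for illustration); it builds in an unobserved confounder u and a binary instrument z that affects y only through t.

```python
import numpy as np

def wald_iv(y, t, z):
    """Covariance form of the Wald/IV estimate: Cov(Y, Z) / Cov(T, Z)."""
    cov_yz = np.cov(y, z)[0, 1]
    cov_tz = np.cov(t, z)[0, 1]
    return cov_yz / cov_tz

def wald_binary(y, t, z):
    """Difference-in-means form; only valid for a binary (0/1) instrument."""
    z = z.astype(bool)
    return (y[z].mean() - y[~z].mean()) / (t[z].mean() - t[~z].mean())

# Hypothetical data-generating process (illustration only).
rng = np.random.default_rng(0)
n = 20_000
tau = 1.5
u = rng.normal(size=n)                 # unobserved confounder
z = rng.integers(0, 2, size=n)         # binary instrument, independent of u
t = 1.0 * z + u + rng.normal(size=n)   # treatment depends on z and u
y = tau * t + u + rng.normal(size=n)   # outcome; z enters only through t

ols_slope = np.cov(y, t)[0, 1] / np.var(t, ddof=1)  # confounded by u
iv_cov = wald_iv(y, t, z)
iv_means = wald_binary(y, t, z)
print(f"OLS: {ols_slope:.3f}  IV (cov): {iv_cov:.3f}  IV (means): {iv_means:.3f}")
```

With a binary instrument the two forms agree up to floating-point error, because the same scaling factor appears in both sample covariances and cancels in the ratio.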

Two-stage least squares (2SLS)

For continuous Z and regression with covariates:

Stage 1: Regress T on Z (and X if applicable): $T = \pi_0 + \pi_1 Z + \boldsymbol{\pi}_X^{\top} \mathbf{X} + v$. Obtain fitted T̂.
Stage 2: Regress Y on T̂ (the IV-projected treatment) and X: $Y = \beta_0 + \beta_1 \hat{T} + \boldsymbol{\beta}_X^{\top} \mathbf{X} + \varepsilon$. The coefficient $\beta_1$ on T̂ is the IV estimator of τ.

With a single instrument and no covariates, 2SLS reproduces the Wald estimator exactly; it generalises to models with covariates and to over-identified models with multiple instruments.
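The two stages can be written out by hand with least squares. This is a minimal sketch for the point estimate only: the helper names and the simulated data are assumptions for illustration, and the standard errors from a manual second stage are not valid (they use residuals computed from T̂ rather than T), so a dedicated IV routine is the safer choice for inference.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y ~ X (X must already contain an intercept)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def two_stage_least_squares(y, t, z, x):
    """Manual 2SLS point estimate for one endogenous treatment t,
    instrument(s) z and exogenous covariates x."""
    n = len(y)
    ones = np.ones((n, 1))
    z = z.reshape(n, -1)
    x = x.reshape(n, -1)
    # Stage 1: regress T on intercept, Z and X; keep the fitted values.
    W1 = np.hstack([ones, z, x])
    t_hat = W1 @ ols(W1, t)
    # Stage 2: regress Y on intercept, fitted T and X.
    W2 = np.hstack([ones, t_hat[:, None], x])
    beta = ols(W2, y)
    return beta[1]  # coefficient on fitted T

# Hypothetical data: x is an observed covariate, u is unobserved.
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
u = rng.normal(size=n)
z = 0.5 * x + rng.normal(size=n)             # instrument, exogenous given x
t = 1.0 * z + 0.5 * x + u + rng.normal(size=n)
y = 1.5 * t + 0.8 * x + u + rng.normal(size=n)

tau_2sls = two_stage_least_squares(y, t, z, x)
print(f"2SLS estimate of tau: {tau_2sls:.3f}")
```

The covariates X appear in both stages; leaving them out of the first stage is a common mistake that makes the second-stage coefficient inconsistent.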

Weak-instrument diagnostics

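If Cov(Z, T) is small, the IV estimator is UNSTABLE: the denominator is close to zero, so small sampling variations in either covariance produce huge variations in the estimate. The weak-instrument literature (Stock and Yogo 2005 supply formal critical values) shows that as relevance shrinks, the IV estimator is BIASED TOWARD OLS and conventional standard errors understate the true uncertainty. Practical thresholds: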

First-stage F > 10: strong instrument; IV reliable.
F in 5–10: weak; report with caveats.
F < 5: very weak; IV is biased and the confidence intervals are wrong.

Modern best practice: report the first-stage F whenever you use IV. If weak, use weak-instrument-robust inference (Anderson-Rubin CIs, conditional likelihood-ratio tests).
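The first-stage F can be computed by comparing the first-stage regression with and without the instrument. The sketch below uses the classical homoskedastic formula; the function name and the simulated data are assumptions for illustration, and a heteroskedasticity-robust F would need a different variance estimate.

```python
import numpy as np

def first_stage_f(t, z, x=None):
    """F-statistic for H0: all instrument coefficients are zero in the
    first-stage regression of t on (1, x, z). Homoskedastic formula."""
    n = len(t)
    z = z.reshape(n, -1)
    ones = np.ones((n, 1))
    restricted = ones if x is None else np.hstack([ones, x.reshape(n, -1)])
    full = np.hstack([restricted, z])

    def rss(W):
        coef, *_ = np.linalg.lstsq(W, t, rcond=None)
        resid = t - W @ coef
        return resid @ resid

    q = z.shape[1]                    # number of excluded instruments
    df_resid = n - full.shape[1]      # residual degrees of freedom
    return ((rss(restricted) - rss(full)) / q) / (rss(full) / df_resid)

# Hypothetical first stages sharing the same noise but different relevance.
rng = np.random.default_rng(2)
n = 500
u = rng.normal(size=n)
z = rng.normal(size=n)
v = rng.normal(size=n)
t_strong = 1.0 * z + u + v
t_weak = 0.1 * z + u + v

f_strong = first_stage_f(t_strong, z)
f_weak = first_stage_f(t_weak, z)
print(f"F (relevance 1.0): {f_strong:.1f}   F (relevance 0.1): {f_weak:.1f}")
```

With one instrument this F is the square of the t-statistic on Z in the first stage, so it can be cross-checked against any regression output.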

Famous IV examples

Angrist (1990): Vietnam draft lottery number as instrument for military service → causal effect on earnings.
Card (1995): Distance to nearest college as instrument for years of schooling → return to education.
Angrist & Lavy (1999): Maimonides' rule (class-size cap at 40) as discontinuous instrument for class size → effect on test scores.
Mendelian randomisation: genetic variants as instruments for exposures (e.g., cholesterol level) → causal effects on disease.

The hardest part: defending the exclusion restriction

Relevance is data-testable. Exogeneity is sometimes defensible by design (a lottery, random weather, geographic distance). EXCLUSION is the hardest — you must argue, substantively, that Z affects Y ONLY through T. Critics of IV studies almost always attack the exclusion restriction. Modern good practice: pre-register the IV strategy; cite the substantive reason for exclusion; conduct sensitivity analyses (Conley et al. 2012 plausibly-exogenous IV bounds).
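One simple way to probe the exclusion restriction is to ask how the estimate would move if Z had a small direct effect γ on Y. The sketch below subtracts an assumed γ·Z from Y and re-estimates over a grid of γ values. It is a simplified illustration of the idea behind plausibly-exogenous bounds, not the full procedure of Conley et al. (2012); the grid, the data and the function name are assumptions.

```python
import numpy as np

def iv_given_direct_effect(y, t, z, gamma):
    """IV estimate after removing an assumed direct effect gamma * z from y."""
    y_adj = y - gamma * z
    return np.cov(y_adj, z)[0, 1] / np.cov(t, z)[0, 1]

# Hypothetical data in which exclusion is violated: z affects y directly.
rng = np.random.default_rng(1)
n = 20_000
tau, true_gamma = 1.5, 0.2
u = rng.normal(size=n)
z = rng.normal(size=n)
t = 1.0 * z + u + rng.normal(size=n)
y = tau * t + true_gamma * z + u + rng.normal(size=n)

# Sweep over direct effects the analyst considers plausible.
gammas = np.linspace(-0.3, 0.3, 7)
estimates = np.array([iv_given_direct_effect(y, t, z, g) for g in gammas])
for g, est in zip(gammas, estimates):
    print(f"assumed gamma = {g:+.1f}  ->  tau_hat = {est:.3f}")
```

The naive estimate (γ = 0) is off by roughly γ·Var(Z)/Cov(T, Z), so the same violation does more damage when the instrument is weak. Reporting the range of estimates over the plausible γ values makes that dependence explicit.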

Try it

Start with true τ = 1.5, relevance = 1.0, U confounding = 1.0. Notice OLS is BIASED upward (around 2.0-2.3) due to confounding; IV recovers τ ≈ 1.5. First-stage F is large — strong instrument.
Crank U confounding to 3.0. OLS bias balloons; IV stays approximately on truth. The instrument's value is precisely in unconfounded identification.
Set relevance to 0.10. The first-stage F drops dramatically; IV estimate becomes ERRATIC across re-samples. Weak instrument = unreliable IV.
Set τ = 0, U confounding = 1.0. OLS finds a spurious effect; IV correctly reports near 0. The signal IV recovers is the true causal effect — when truth is zero, IV says zero.
Set relevance to 2.0 (very strong instrument). IV estimate is very tight. The trade-off between relevance and exclusion: stronger Z → T relationships tend to come with stronger risk of Z having other effects on Y.
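The experiments above can also be approximated offline with a small Monte Carlo. The data-generating process here is an assumption, not the one behind the interactive panel, so the exact numbers (for example the size of the OLS bias) need not match; the qualitative pattern is what the sketch is meant to show.

```python
import numpy as np

def simulate_once(rng, n, tau, relevance, confounding):
    """Draw one sample and return (OLS slope, IV estimate)."""
    u = rng.normal(size=n)                                   # unobserved confounder
    z = rng.normal(size=n)                                   # instrument
    t = relevance * z + confounding * u + rng.normal(size=n)
    y = tau * t + confounding * u + rng.normal(size=n)
    ols = np.cov(y, t)[0, 1] / np.var(t, ddof=1)
    iv = np.cov(y, z)[0, 1] / np.cov(t, z)[0, 1]
    return ols, iv

def monte_carlo(relevance, confounding, tau=1.5, n=500, reps=500, seed=0):
    """Repeat the simulation and return arrays of OLS and IV estimates."""
    rng = np.random.default_rng(seed)
    draws = np.array([simulate_once(rng, n, tau, relevance, confounding)
                      for _ in range(reps)])
    return draws[:, 0], draws[:, 1]

def iqr(a):
    """Interquartile range; robust to the heavy tails of weak-instrument IV."""
    q75, q25 = np.percentile(a, [75, 25])
    return q75 - q25

ols_strong, iv_strong = monte_carlo(relevance=1.0, confounding=1.0)
ols_weak, iv_weak = monte_carlo(relevance=0.1, confounding=1.0)

print(f"strong Z: median OLS {np.median(ols_strong):.2f}, "
      f"median IV {np.median(iv_strong):.2f}, IV IQR {iqr(iv_strong):.2f}")
print(f"weak Z:   median OLS {np.median(ols_weak):.2f}, "
      f"median IV {np.median(iv_weak):.2f}, IV IQR {iqr(iv_weak):.2f}")
```

The median and interquartile range are used instead of the mean and standard deviation because the weak-instrument IV estimate has very heavy tails, and a handful of extreme draws would dominate the moment-based summaries.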

A researcher uses RAIN on the day of a survey as an instrument for survey response rate to estimate the causal effect of survey participation on health outcomes. The exclusion restriction would require rain to NOT affect health outcomes except through response rate. Is this plausible?

What you now know

Instrumental variables tackle unobserved confounding by finding an "as-if-random" source of variation in T. The three conditions — relevance, exogeneity, exclusion — must all hold. The Wald estimator is the simplest IV: Cov(Y,Z)/Cov(T,Z). 2SLS generalises to covariates. Weak instruments (F < 10) make IV unreliable. The hardest part is defending exclusion — your IV strategy lives or dies on whether reviewers accept that Z affects Y ONLY through T. §6.6 turns to regression discontinuity, where a SHARP CUTOFF in treatment assignment plays the role of the instrument.

References

Angrist, J.D., Imbens, G.W., Rubin, D.B. (1996). "Identification of causal effects using instrumental variables." JASA 91(434), 444–455. (The LATE paper; defines the local-compliers interpretation.)
Stock, J.H., Yogo, M. (2005). "Testing for weak instruments in linear IV regression." In Identification and Inference for Econometric Models, Cambridge University Press.
Imbens, G.W. (2014). "Instrumental variables: An econometrician's perspective." Statistical Science 29(3), 323–358.
Angrist, J.D., Pischke, J.S. (2009). Mostly Harmless Econometrics. Princeton University Press. (Chapter 4 develops IV applied econometrics.)
Conley, T.G., Hansen, C.B., Rossi, P.E. (2012). "Plausibly exogenous." Review of Economics and Statistics 94(1), 260–272.