The law of large numbers
Learning objectives
- State the strong and weak LLN
- Recognise the 1/√n convergence rate: for finite-variance distributions, typical deviations from μ shrink like 1/√n
- Identify when LLN fails: distributions with infinite or undefined mean (certain Pareto, Cauchy)
- Apply LLN to justify sample means as estimators
- Distinguish almost-sure (strong) from in-probability (weak) convergence
The Law of Large Numbers is statistics' first big idea: the SAMPLE AVERAGE of i.i.d. observations converges to the POPULATION MEAN as the sample size grows. It is what makes statistical inference possible — sample-based estimators tell us something about populations.
Two forms: weak and strong
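Let $X_1, X_2, \dots$ be i.i.d. with finite mean $\mu = E[X_1]$, and write $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$ for the sample mean.
- Weak LLN: $\bar{X}_n \xrightarrow{P} \mu$ (convergence in probability). For every $\varepsilon > 0$, $P(|\bar{X}_n - \mu| > \varepsilon) \to 0$ as $n \to \infty$.
- Strong LLN: $\bar{X}_n \xrightarrow{\text{a.s.}} \mu$ (convergence almost surely). With probability 1, the path $\bar{X}_n \to \mu$; that is, $P(\lim_{n \to \infty} \bar{X}_n = \mu) = 1$.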
SLLN ⇒ WLLN; not vice versa. SLLN gives stronger guarantees: ALMOST EVERY sample path eventually settles near μ and stays there. WLLN alone allows individual paths to keep making rare large excursions, so long as the probability of being far from μ at any given n tends to 0.
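The weak law's statement can be checked numerically by estimating the probability of a large deviation at several sample sizes. The following is a minimal sketch, assuming Normal(0, 1) data and a tolerance of 0.1; the function name `tail_probability`, the replication count, and the seed are illustrative choices, not part of the lesson's widget.

```python
import numpy as np

def tail_probability(n, eps=0.1, mu=0.0, sigma=1.0, reps=2000, seed=0):
    """Monte Carlo estimate of P(|mean of n draws - mu| > eps) for Normal(mu, sigma)."""
    rng = np.random.default_rng(seed)
    # Each row is one independent sample of size n.
    samples = rng.normal(loc=mu, scale=sigma, size=(reps, n))
    sample_means = samples.mean(axis=1)
    # Fraction of replications whose sample mean lands farther than eps from mu.
    return float(np.mean(np.abs(sample_means - mu) > eps))

tail_probs = {n: tail_probability(n) for n in (10, 100, 1000)}
for n, p in tail_probs.items():
    print(f"n = {n:>4}: estimated P(|mean - mu| > 0.1) = {p:.3f}")
```

The estimated probabilities should fall towards 0 as n grows, which is what convergence in probability asserts. Note that this kind of check looks at many independent samples at a fixed n; it says nothing about the behaviour of a single path, which is the strong law's territory.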
Convergence rate: the 1/√n rule
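For finite-variance distributions, the sample mean $\bar{X}_n$ satisfies $\operatorname{Var}(\bar{X}_n) = \sigma^2/n$, i.e. $\operatorname{SD}(\bar{X}_n) = \sigma/\sqrt{n}$. So the sample mean concentrates at rate 1/√n. To halve the error, you need 4× the data. To shrink it by 10×, you need 100× the data.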
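The 1/√n rate follows in one line from independence, and Chebyshev's inequality then turns it into a tail bound. A short derivation, writing $\bar{X}_n$ for the mean of the first $n$ observations and $\sigma^2$ for their common (assumed finite) variance:

```latex
\begin{aligned}
\operatorname{Var}(\bar{X}_n)
  &= \operatorname{Var}\!\Big(\frac{1}{n}\sum_{i=1}^{n} X_i\Big)
   = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)
   = \frac{\sigma^2}{n},
\qquad
\operatorname{SD}(\bar{X}_n) = \frac{\sigma}{\sqrt{n}}, \\[4pt]
P\big(|\bar{X}_n - \mu| > \varepsilon\big)
  &\le \frac{\operatorname{Var}(\bar{X}_n)}{\varepsilon^2}
   = \frac{\sigma^2}{n\,\varepsilon^2}
   \;\longrightarrow\; 0
   \quad\text{as } n \to \infty .
\end{aligned}
```

The second line is a proof of the weak law under the extra assumption of finite variance; the general weak and strong laws need only a finite mean, but then carry no rate.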
This is WHY large studies (clinical trials, surveys) need large n: precision is expensive.
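The cost of precision can be made concrete with two small helpers. This is an illustrative sketch: `standard_error` and `n_for_target_error` are hypothetical names, and σ = 1 is an assumed value used only for the demonstration.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the mean of n i.i.d. draws, each with standard deviation sigma."""
    return sigma / math.sqrt(n)

def n_for_target_error(sigma, target):
    """Smallest n whose standard error sigma / sqrt(n) is at most `target`."""
    return math.ceil((sigma / target) ** 2)

base = standard_error(1.0, 1000)             # about 0.0316
quadrupled = standard_error(1.0, 4000)       # 4x the data
hundredfold = standard_error(1.0, 100_000)   # 100x the data

print(f"error ratio after 4x data:   {base / quadrupled:.2f}")
print(f"error ratio after 100x data: {base / hundredfold:.2f}")
print(f"n needed for SE <= 0.5 with sigma = 2: {n_for_target_error(2.0, 0.5)}")
```

Because n enters only through its square root, the required sample size grows with the square of the desired improvement in precision.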
When LLN fails: pathological distributions
The LLN requires finite mean. For distributions with INFINITE OR UNDEFINED mean:
- Cauchy(0, 1): the PDF has heavy tails that make $E|X| = \int_{-\infty}^{\infty} |x|\, f(x)\, dx$ diverge, so the mean is undefined. Sample means do NOT converge: $\bar{X}_n$ has the SAME distribution as a single sample.
- Pareto with shape α ≤ 1: tails heavy enough that even the first moment is infinite.
These aren't just curiosities: real heavy-tailed phenomena (financial returns, network packet sizes, city sizes) can have distributions for which the empirical mean is unstable. Robust statistics (§4.5) becomes essential.
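The claim that a Cauchy sample mean is no more concentrated than a single observation can be checked by simulation. The sketch below compares the interquartile range (IQR) of single draws, of sample means, and of sample medians across many replications; the helper `iqr`, the replication count, the sample size, and the seed are illustrative assumptions. The IQR is used because the Cauchy has no finite variance, so a standard deviation would not be meaningful.

```python
import numpy as np

def iqr(values):
    """Interquartile range: 75th percentile minus 25th percentile."""
    q25, q75 = np.percentile(values, [25, 75])
    return float(q75 - q25)

rng = np.random.default_rng(0)
reps, n = 4000, 1000
draws = rng.standard_cauchy(size=(reps, n))   # each row is one sample of size n

iqr_single = iqr(draws[:, 0])                 # one observation per replication
iqr_means = iqr(draws.mean(axis=1))           # sample mean of each replication
iqr_medians = iqr(np.median(draws, axis=1))   # sample median of each replication

print(f"IQR of a single draw:      {iqr_single:.2f}")
print(f"IQR of means (n = {n}):   {iqr_means:.2f}")
print(f"IQR of medians (n = {n}): {iqr_medians:.2f}")
```

The standard Cauchy has quartiles at ±1, so both of the first two IQRs should come out near 2: averaging 1000 draws buys nothing. The sample median, by contrast, concentrates tightly around 0, which is one reason robust estimators matter for heavy-tailed data.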
Try it
- Default Normal(0, 1). Three independent traces (different seeds) ALL approach the red μ = 0 line as n grows. By n = 1000 the spread across paths is ≈ 0.06, about twice the standard error σ/√n = 1/√1000 ≈ 0.032.
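- Switch to Exponential(1). Mean = 1 and σ = 1, so the 1/√n rate is the same as for Normal(0, 1); the skew shows up as asymmetric early excursions (occasional large draws pull the running mean upward), not as a slower rate. By n = 1000, paths bracket μ = 1 within ±0.05.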
- Switch to Cauchy(0, 1). Spread across paths does NOT shrink with n; sometimes the running mean jumps wildly even at n = 5000. This is LLN's failure case: the Cauchy is heavy-tailed enough that, at any n, a single extreme sample can still move the running mean by a visible amount.
- Switch to Bernoulli(0.3). Discrete: the only possible values are 0 and 1. Sample mean converges to 0.3 cleanly. At n = 100 paths are typically within ±0.05 of 0.3; at n = 1000 within ±0.015 (about one standard error, √(0.3 · 0.7 / n), in each case).
- Slide n_max from 100 to 5000 with Normal. Verify the convergence visually: at n = 100 paths still drift; at n = 5000 they hug μ tightly. The spread shrinks by a factor of 1/√(5000/100) = 1/√50 ≈ 0.14, so 50× the data gives only ~7× better precision.
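The experiments above can also be reproduced offline. The following is a rough sketch of what such a widget might compute, not its actual implementation: `running_means`, the `samplers` table, and the seeds are all hypothetical names and values chosen for illustration.

```python
import numpy as np

def running_means(sampler, n_max, seeds=(1, 2, 3)):
    """Return an array of shape (len(seeds), n_max); row k holds the running
    mean after 1, 2, ..., n_max draws for the k-th seed."""
    paths = []
    for seed in seeds:
        rng = np.random.default_rng(seed)
        draws = sampler(rng, n_max)
        # Cumulative sum divided by the number of draws so far.
        paths.append(np.cumsum(draws) / np.arange(1, n_max + 1))
    return np.vstack(paths)

# Hypothetical sampler table mirroring the distribution choices in the list above.
samplers = {
    "Normal(0, 1)": lambda rng, n: rng.normal(0.0, 1.0, n),
    "Exponential(1)": lambda rng, n: rng.exponential(1.0, n),
    "Cauchy(0, 1)": lambda rng, n: rng.standard_cauchy(n),
    "Bernoulli(0.3)": lambda rng, n: rng.binomial(1, 0.3, n),
}

results = {name: running_means(s, n_max=5000) for name, s in samplers.items()}
for name, paths in results.items():
    final = paths[:, -1]
    print(f"{name:>15}: final running means {np.round(final, 3)}, "
          f"spread {final.max() - final.min():.3f}")
```

Plotting each row of a `results` entry against n gives the three traces; for the finite-mean distributions the final values cluster near μ, while the Cauchy rows generally do not settle.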
A pollster reports a survey of 1000 voters with margin of error ±3%. A follow-up survey doubles n to 2000. What new margin of error should they report (and why is it > ±1.5%)?
What you now know
Sample means converge to population means whenever the mean is finite, and at rate 1/√n when the variance is also finite. Pathological heavy-tailed distributions (Cauchy) break the law entirely. Robust estimators (median, trimmed means) remain stable under heavy tails; the sample median, for example, converges to the centre of a Cauchy distribution even though the sample mean does not. §0.7 takes the next step: the CLT tells us how the sample mean is distributed around μ (bell-shaped with width σ/√n).
References
- Wasserman, L. (2004). All of Statistics. Springer. (Chapter 5 — convergence of random variables.)
- Billingsley, P. (1995). Probability and Measure, 3rd ed. Wiley. (Section 6 — the law of large numbers.)
- Etemadi, N. (1981). "An elementary proof of the strong law of large numbers." Zeitschrift für Wahrscheinlichkeitstheorie 55, 119-122.
- Feller, W. (1968). An Introduction to Probability Theory and Its Applications, Vol. 1. Wiley. (Chapter 10.)
- Tukey, J.W. (1960). "A survey of sampling from contaminated distributions." (Robust statistics motivation when LLN works but slowly under contamination.)