Simulation as intuition

Probability from zero

Learning objectives

  • Use Monte Carlo to estimate probabilities and expectations by averaging i.i.d. samples
  • Apply the inverse-CDF sampling method to generate non-Uniform random variables
  • Recognise the 1/√n Monte Carlo error rate and use it for sample-size planning
  • Use simulation to verify or stress-test analytical results

Monte Carlo is the practical bridge from probability theory to computation. Given any quantity that can be expressed as E[g(X)] — a probability, an integral, an expectation — generate many i.i.d. samples and average. By LLN the mean converges; by CLT the error is approximately Normal with SD σ/√n. Monte Carlo is universal: works on multi-dimensional integrals where deterministic quadrature fails, on probability calculations with no closed form, on derived-quantity uncertainty propagation.

The Monte Carlo principle

To estimate θ=E[g(X)]\theta = E[g(X)], draw X1,X2,,XnX_1, X_2, \ldots, X_n i.i.d. from the distribution of XX, then:

θ^n=1ni=1ng(Xi).\hat{\theta}_n = \frac{1}{n} \sum_{i=1}^{n} g(X_i).

By LLN (§0.6), θ^nθ\hat{\theta}_n \to \theta as nn \to \infty. By CLT (§0.7), the error has approximate distribution N(0,σ2/n)N(0, \sigma^2/n) where σ2=Var(g(X))\sigma^2 = \mathrm{Var}(g(X)). The 95 % CI is θ^n±1.96σ^/n\hat{\theta}_n \pm 1.96,\hat{\sigma}/\sqrt{n}.

Common Monte Carlo patterns

  • Hit-or-miss probability: estimate P(A)(1/n)i1{XiA}P(A) \approx (1/n) \sum_i \mathbb{1}{X_i \in A}. Used to estimate π, integration over irregular regions, P-values.
  • Direct expectation: estimate E[X](1/n)XiE[X] \approx (1/n) \sum X_i. Used in option pricing, simulation of physical systems, Bayesian posterior means.
  • Importance sampling: estimate E[g(X)]=E[g(X)f(X)/q(X)]E[g(X)] = E[g(X) f(X)/q(X)] where qq is a different proposal distribution. Used when sampling from ff is hard or rare events dominate.

Inverse-CDF sampling

If UUniform(0,1)U \sim \text{Uniform}(0, 1) and FF is a CDF with quantile function F1F^{-1}, then F1(U)FF^{-1}(U) \sim F. This converts uniform random numbers (provided by every PRNG) into draws from any distribution. Practical:

  • Exponential: X=ln(U)/λX = -\ln(U)/\lambda since the Exp CDF is F(x)=1eλxF(x) = 1 - e^{-\lambda x}.
  • Box-Muller: Z1=2lnU1cos(2πU2)Z_1 = \sqrt{-2\ln U_1}\cos(2\pi U_2), Z2=2lnU1sin(2πU2)Z_2 = \sqrt{-2\ln U_1}\sin(2\pi U_2) are independent N(0, 1).
  • Discrete: X=min{k:F(k)U}X = \min{k : F(k) \ge U} — binary-search the CDF.

For complex distributions where F1F^{-1} has no closed form, use REJECTION SAMPLING or MARKOV CHAIN MONTE CARLO (Part 7).

Convergence rate and sample size

Error shrinks as 1/√n. To halve the MC error, quadruple n. To achieve absolute error ε\varepsilon: n(1.96σ/ε)2n \approx (1.96 \sigma / \varepsilon)^2. For estimating a probability pp with absolute precision 0.01: n(1.96)2p(1p)/0.01238416p(1p)n \approx (1.96)^2 p(1-p) / 0.01^2 \approx 38416 \cdot p(1-p). Worst-case p=0.5p = 0.5 gives n9604n \approx 9604. Tighter precision requires quadratically more samples.

When NOT to use MC: deterministic alternatives

MC's 1/√n rate is much worse than analytical / numerical alternatives when those exist:

  • 1-D integrals: numerical quadrature (Simpson, Gauss) converges at O(1/np)O(1/n^p) for smooth integrands; way faster than MC.
  • Closed-form analytic results (Normal CDFs, Gamma functions): always preferable.

Where MC wins: high-dimensional integration, stochastic dynamics, Bayesian posteriors with intractable likelihoods, model-based decision analysis with no closed form. The 1/√n rate is dimension-INDEPENDENT — that's the killer feature.

Monte Carlo IntuitionInteractive figure — enable JavaScript to interact.

Try it

  • "π by hit-or-miss" at n = 100. The estimate is around 3.0-3.4 with 95% CI roughly ±0.18. At n = 10000 the CI is ±0.02. At n = 100000 it's ±0.007. The 1/√n shrinkage is visible.
  • "∫_0^1 sin²(x) dx" at any n. True value is 0.5 - sin(2)/4 ≈ 0.2727. Verify CI coverage: the truth lies inside the 95% CI most of the time across different seeds.
  • "P(|X̄_50 - 1| < 0.2) for Exp(1)" — uses CLT to estimate a probability. Truth ≈ 0.842 (the CLT's prediction). At n = 5000, error ≈ 0.005 to 0.01.
  • Reseed several times for the same scenario and n. The estimate fluctuates by approximately the CI half-width, confirming that the CI is calibrated.
  • Push n from 5000 to 20000. The 95% CI band shrinks by factor of √(20000/5000) = 2. Visible improvement; doubles confidence.

You want a Monte Carlo estimate of a true value 0.05 (a rare-event probability) with absolute precision ±0.001. How many samples do you need? Why is rare-event simulation a particularly expensive special case of MC?

What you now know — and Part 0 closes

Monte Carlo bridges probability theory to computational practice. The 1/√n error rate is universal but slow; for hard problems (high-dim integrals, intractable posteriors) it's the only game in town. Inverse-CDF sampling provides the link from PRNG output to arbitrary distributions. Part 0 is now COMPLETE: you have the foundations of probability axioms (§0.1), random variables (§0.2), joint and conditional distributions (§0.3), moments (§0.4), the catalog of common distributions (§0.5), LLN (§0.6), CLT (§0.7), transformations (§0.8), MGFs (§0.9), and simulation (§0.10). Part 1 (Estimation) builds the inferential toolkit on these foundations.

References

  • Robert, C.P., Casella, G. (2004). Monte Carlo Statistical Methods, 2nd ed. Springer.
  • Liu, J.S. (2001). Monte Carlo Strategies in Scientific Computing. Springer.
  • Owen, A.B. (2013). Monte Carlo theory, methods and examples. Online: statweb.stanford.edu/~owen/mc/. (Excellent free textbook.)
  • Devroye, L. (1986). Non-Uniform Random Variate Generation. Springer. (Classic reference for sampling algorithms.)
  • Metropolis, N., Ulam, S. (1949). "The Monte Carlo method." JASA 44(247), 335-341. (The foundational paper, coined the term.)

This page is prerendered for SEO and accessibility. The interactive widgets above hydrate on JavaScript load.