Transformations of random variables

Probability from zero

Learning objectives

State the change-of-variable formula for monotonic transformations
Apply it to square, exp, shift, scale, and other common transformations
Recognise multi-valued inverse cases (e.g., Y = X²) and sum over pre-images
Use the delta method to approximate Var(g(X̄_n))
Recognise log-transforms as the standard fix for right-skewed positive data

Random variables are FUNCTIONS — and you can compose them with any (measurable) function g to form Y = g(X). The DISTRIBUTION of Y is determined by the distribution of X plus the structure of g. This section gives the formula and shows it in action.

The change-of-variable formula (continuous case)

If X has PDF $f_X$ and Y = g(X) for a strictly MONOTONIC g with differentiable inverse $g^{-1}$, then for y in the range of g (with $f_Y(y) = 0$ elsewhere):

$$f_Y(y) = f_X(g^{-1}(y)) \cdot \left| \frac{d g^{-1}(y)}{dy} \right|.$$

The Jacobian $|dg^{-1}/dy|$ accounts for how g stretches or compresses small intervals.
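As a rough numerical check of the formula, the sketch below takes Y = exp(X) with X standard Normal (a choice made here purely for illustration) and compares the change-of-variable density with a finite-difference derivative of the CDF of Y. The function names are hypothetical helpers, not part of any library.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of Normal(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of Normal(mu, sigma^2) at x, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def density_of_exp_x(y):
    """f_Y(y) for Y = exp(X), X ~ Normal(0, 1), from the change-of-variable formula.

    g(x) = exp(x) is strictly increasing, g^{-1}(y) = ln(y), |d g^{-1}/dy| = 1/y.
    """
    if y <= 0.0:
        return 0.0  # exp(X) never takes non-positive values
    return normal_pdf(math.log(y)) * (1.0 / y)

def density_by_finite_difference(y, h=1e-5):
    """Independent check: differentiate F_Y(y) = P(X <= ln y) numerically."""
    return (normal_cdf(math.log(y + h)) - normal_cdf(math.log(y - h))) / (2.0 * h)

# The two columns should agree to several decimal places.
for y in (0.5, 1.0, 2.0, 4.0):
    print(y, density_of_exp_x(y), density_by_finite_difference(y))
```

The finite-difference route never uses the Jacobian, so agreement between the two columns is some evidence that the 1/y factor is doing the right job.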

Multi-valued inverses: sum over pre-images

For non-monotonic g (e.g., Y = X² maps both +√y and -√y to y), the formula extends:

$$f_Y(y) = \sum_{x : g(x) = y} f_X(x) \cdot \left| \frac{1}{g'(x)} \right|.$$

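For Y = X² the pre-images of y > 0 are $\pm\sqrt{y}$ and $|g'(x)| = 2\sqrt{y}$ at both, so $f_Y(y) = [f_X(\sqrt{y}) + f_X(-\sqrt{y})] / (2\sqrt{y})$. With $f_X$ symmetric about 0 this simplifies to $f_Y(y) = f_X(\sqrt{y}) / \sqrt{y}$ for y > 0. Most importantly: the support of Y lies in [0, ∞), even though X takes both signs.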
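The sum over pre-images can be sketched in a few lines. The example below assumes X is standard Normal and compares the result with the known chi-squared(1) density $e^{-y/2}/\sqrt{2\pi y}$; the helper names are made up for this illustration.

```python
import math

def normal_pdf(x):
    """Standard Normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def density_of_square(y, f_x):
    """f_Y(y) for Y = X^2: sum f_X(x) / |g'(x)| over the pre-images x = +sqrt(y), -sqrt(y).

    Here g(x) = x^2, so g'(x) = 2x. Returns 0 for y <= 0 (y = 0 is a single
    point where the density is unbounded, so its value there does not matter).
    """
    if y <= 0.0:
        return 0.0
    root = math.sqrt(y)
    return sum(f_x(x) / abs(2.0 * x) for x in (root, -root))

def chi2_1_pdf(y):
    """Chi-squared density with 1 degree of freedom, for y > 0."""
    return math.exp(-0.5 * y) / math.sqrt(2.0 * math.pi * y)

for y in (0.1, 1.0, 3.0):
    print(y, density_of_square(y, normal_pdf), chi2_1_pdf(y))
```

Passing the parent density as an argument keeps the sketch reusable: any other `f_x` (symmetric or not) can be plugged in without changing the pre-image sum.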

Linear transformations: the cleanest case

Shift Y = X + a: $f_Y(y) = f_X(y - a)$. Mean shifts by a; variance unchanged.
Scale Y = bX (b > 0): $f_Y(y) = f_X(y/b) / b$. Mean scales by b; variance scales by b².

Combining the two, for Y = bX + a (scale b, shift a): $E[Y] = b E[X] + a$, $\mathrm{Var}(Y) = b^2 \mathrm{Var}(X)$. These rules underpin standardisation: Z = (X - μ)/σ is the case b = 1/σ, a = -μ/σ, and has mean 0, variance 1.
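The mean and variance rules are easy to confirm on a small data set. The numbers below are made up for illustration, and `shift` and `scale` are hypothetical names for the additive and multiplicative constants; population (not sample) variance is used so the identities hold exactly up to rounding.

```python
import math
import statistics

data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0, 11.0]  # illustrative values, not real data
shift, scale = 3.0, 2.0                       # hypothetical constants

mean_x = statistics.fmean(data)
var_x = statistics.pvariance(data)            # population variance

# Linear transformation: multiply by `scale`, then add `shift`.
transformed = [scale * x + shift for x in data]
mean_t = statistics.fmean(transformed)
var_t = statistics.pvariance(transformed)

print("mean:", mean_t, "expected:", scale * mean_x + shift)
print("variance:", var_t, "expected:", scale ** 2 * var_x)

# Standardisation is the special case scale = 1/sd, shift = -mean/sd.
sd_x = math.sqrt(var_x)
standardised = [(x - mean_x) / sd_x for x in data]
print("standardised mean:", statistics.fmean(standardised))
print("standardised variance:", statistics.pvariance(standardised))
```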

Common non-linear transformations

Y = exp(X): if X is Normal, Y is LOG-NORMAL. Used for positive, right-skewed quantities (income, prices). Log-Normal mean = exp(μ + σ²/2), not exp(μ).
Y = X²: if X is Normal(0, 1), Y is chi-squared with 1 degree of freedom. Used in many test statistics.
Y = ln(X): inverse of exp; brings right-skewed positive data closer to Normal. Standard transformation for income, GDP, fold-change data.
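Y = sin(X): periodic and non-monotonic, so the distribution of Y depends on the range of X and generally needs the sum over pre-images. For example, X uniform on (-π/2, π/2) gives the arcsine density $f_Y(y) = 1/(\pi\sqrt{1 - y^2})$ on (-1, 1).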
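A short simulation can make the exp and ln entries concrete. The sketch below assumes μ = 0 and σ = 1 (arbitrary choices), draws Normal samples, exponentiates them, and compares the sample mean of Y with exp(μ + σ²/2). It also reports the sample median, which for a Log-Normal sits near exp(μ), and the skewness before and after taking logs. The seed and sample size are illustrative only.

```python
import math
import random
import statistics

random.seed(0)                 # fixed seed so repeated runs give the same numbers
mu, sigma = 0.0, 1.0           # parameters of the Normal parent (assumed)
n = 100_000

x = [random.gauss(mu, sigma) for _ in range(n)]
y = [math.exp(v) for v in x]   # Log-Normal sample
log_y = [math.log(v) for v in y]

def skewness(values):
    """Sample skewness: mean of the cubed standardised values."""
    m = math.fsum(values) / len(values)
    s = math.sqrt(math.fsum((v - m) ** 2 for v in values) / len(values))
    return math.fsum(((v - m) / s) ** 3 for v in values) / len(values)

mean_y = math.fsum(y) / n
theory_mean = math.exp(mu + sigma ** 2 / 2)
median_y = statistics.median(y)
skew_y = skewness(y)
skew_log_y = skewness(log_y)

print("mean of Y:", mean_y, "theory:", theory_mean)
print("median of Y:", median_y, "exp(mu):", math.exp(mu))
print("skewness of Y:", skew_y)
print("skewness of ln(Y):", skew_log_y)
```

The skewness of Y is estimated noisily because of the heavy right tail, so only its sign and rough size should be read from a single run.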

The DELTA METHOD: linear approximation for transformations of estimators

If √n(X̄_n - μ) →_d N(0, σ²), and g is differentiable at μ with g'(μ) ≠ 0, then:

$$\sqrt{n}(g(\bar{X}_n) - g(\mu)) \xrightarrow{d} \mathcal{N}(0, [g'(\mu)]^2 \sigma^2).$$

So transforming an asymptotically Normal estimator gives another asymptotically Normal estimator, with variance multiplied by $[g'(\mu)]^2$. In practice this gives the approximation Var(g(X̄_n)) ≈ [g'(μ)]² σ²/n. Used for: Wald CIs on odds ratios (built on the log scale, g = ln, then exponentiated), correlation Fisher z-transformation (g = tanh⁻¹), etc.
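The approximation can be checked by simulation. The sketch below assumes an Exponential parent with mean μ = 2 (so σ² = μ²) and g = ln, both chosen only for illustration. In that case the delta-method variance of g(X̄_n) is [1/μ]² · μ² / n = 1/n. The sample size, replication count, and seed are arbitrary.

```python
import math
import random

random.seed(1)          # fixed seed for reproducibility
mu = 2.0                # mean of the Exponential parent (assumed)
sigma2 = mu ** 2        # variance of an Exponential with mean mu
n = 100                 # observations per sample mean
reps = 4000             # number of simulated sample means

def g(x):
    return math.log(x)

def g_prime(x):
    return 1.0 / x

# Simulate g(sample mean) many times; expovariate takes the RATE, i.e. 1/mean.
estimates = []
for _ in range(reps):
    sample_mean = sum(random.expovariate(1.0 / mu) for _ in range(n)) / n
    estimates.append(g(sample_mean))

centre = sum(estimates) / reps
empirical_var = sum((e - centre) ** 2 for e in estimates) / (reps - 1)
delta_var = g_prime(mu) ** 2 * sigma2 / n   # [g'(mu)]^2 * sigma^2 / n

print("empirical Var(g(mean)):", empirical_var)
print("delta-method approximation:", delta_var)
```

The two numbers should be close but not identical: the delta method is a first-order approximation, and the simulated variance carries Monte Carlo noise of its own.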

Try it

Pick Normal(0, 1) parent + Y = X². The right panel shows a chi-squared(1) distribution: heavily right-skewed, with a density that grows without bound as y → 0 (there is no point mass at 0). Even though X has E[X] = 0 and is symmetric, Y has E[Y] = E[X²] = 1.
Pick Normal(0, 1) parent + Y = exp(X). The right panel shows a LOG-NORMAL distribution: right-skewed, supported on (0, ∞). E[Y] = exp(0 + 1/2) = exp(0.5) ≈ 1.649 (NOT exp(0) = 1, despite E[X] = 0).
Pick Uniform(0, 1) + Y = X². The right panel shows the new shape: piled up near 0 and thinning out toward 1. The empirical histogram matches the theoretical f_Y(y) = 1/(2√y) for 0 < y < 1.
Pick any parent + Y = X + 2 (shift). The right panel is the parent shifted right by 2. Same shape; just translated.
Pick any parent + Y = 2X (scale). The right panel is wider. Variance is multiplied by 4 (since scaling by 2 multiplies SD by 2 and variance by 2² = 4).
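The Uniform(0, 1) experiment can also be reproduced without the interactive panel. The sketch below squares uniform draws and compares the share of samples in each bin with the integral of 1/(2√y) over that bin, which is √hi - √lo. The bin edges, seed, and sample size are arbitrary choices for illustration.

```python
import math
import random

random.seed(2)                     # fixed seed for reproducibility
n = 100_000
y = [random.random() ** 2 for _ in range(n)]   # Y = X^2 with X ~ Uniform(0, 1)

edges = [0.0, 0.1, 0.2, 0.4, 0.6, 0.8, 1.0]    # illustrative bin edges
empirical = []
theoretical = []
for lo, hi in zip(edges, edges[1:]):
    # Fraction of simulated values that fall in [lo, hi).
    empirical.append(sum(lo <= v < hi for v in y) / n)
    # Integral of f_Y(y) = 1 / (2 sqrt(y)) over [lo, hi].
    theoretical.append(math.sqrt(hi) - math.sqrt(lo))

for (lo, hi), e, t in zip(zip(edges, edges[1:]), empirical, theoretical):
    print(f"[{lo:.1f}, {hi:.1f}): empirical {e:.4f}, theoretical {t:.4f}")
```

Comparing bin probabilities rather than density heights avoids the unbounded value of f_Y near 0, where a histogram bar and the pointwise density are hard to compare directly.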

A biologist observes gene-expression measurements that are heavily right-skewed (a few high-expressing genes dominate). They report the arithmetic mean and SD. What ONE transformation should they consider, and what is the statistical justification? (Hint: think log-Normal.)

What you now know

Transformations are functions composed with random variables. The change-of-variable formula computes the new distribution. Multi-valued inverses (Y = X²) require summing over pre-images. The delta method handles transformations of asymptotically Normal estimators. Log transformation is the standard fix for right-skewed positive data, which is why log-prices, log-income, and log-fold-change are everywhere in applied statistics. §0.9 introduces MGFs as a tool for handling transformations algebraically.

References

Casella, G., Berger, R.L. (2002). Statistical Inference, 2nd ed. (Section 2.1 — transformations.)
Wasserman, L. (2004). All of Statistics. Springer. (Section 5.5 — delta method.)
Box, G.E.P., Cox, D.R. (1964). "An analysis of transformations." JRSS-B 26(2), 211-252. (The Box-Cox parametric family of transformations.)
Pearl, J. (2000). Causality. Cambridge. (Transformations and causal-effect identification.)
Atkinson, A.C. (1985). Plots, Transformations, and Regression. Oxford. (Applied transformation choice in regression.)