From linear to generalised: link and family
Learning objectives
- State the three GLM components: random (response distribution from exponential family), systematic (linear predictor η = Xβ), link (g connecting μ to η)
- Identify the canonical link for Normal, Binomial, Poisson, Gamma
- Fit GLMs via Iteratively Reweighted Least Squares (IRLS)
- Recast OLS as a GLM with Normal family + identity link
Linear regression assumes the response Y is conditionally Normal with constant variance. This breaks for binary outcomes (Y ∈ {0, 1}), count outcomes (Y ∈ {0, 1, 2, ...}), positively skewed continuous outcomes (lifetimes, costs), and proportions. Generalised Linear Models extend the OLS framework to these cases, keeping the linear-in-parameters predictor while replacing Normality with a more general exponential-family distribution.
The three GLM components
- Random component: Y_i has a distribution in the exponential family (Normal, Binomial, Poisson, Gamma, Inverse Gaussian, Negative Binomial, Multinomial, etc.), with mean E[Y_i] = μ_i and variance Var(Y_i) = φ·V(μ_i), where V(·) is the variance function and φ the dispersion parameter.
- Systematic component: linear predictor η_i = x_iᵀβ (in matrix form, η = Xβ).
- Link function: an invertible function g such that g(μ_i) = η_i, equivalently μ_i = g⁻¹(η_i). The link CONSTRAINS μ to a valid range.
Canonical links
- Normal: identity link, g(μ) = μ. Recovers OLS.
- Binomial (binary, proportions): logit link, g(μ) = log(μ / (1 − μ)). Logistic regression.
- Poisson (counts): log link, g(μ) = log μ. Poisson regression.
- Gamma (positive continuous): inverse link, g(μ) = 1/μ (more common in practice: log link).
The "canonical" link is the one for which the linear predictor equals the natural parameter of the exponential-family distribution. Using the canonical link gives the nicest properties: Xᵀy is a sufficient statistic for β, the observed and expected information coincide, and IRLS reduces to Newton-Raphson. But ANY invertible link with the right range can work.
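To make "linear predictor equals natural parameter" concrete, the following sketch writes the density in the standard exponential-dispersion form. The symbols θ, b, a and c are conventional textbook notation and are not defined elsewhere in this section; treat them as assumptions of the sketch.

```latex
\begin{align*}
f(y;\theta,\phi) &= \exp\!\left\{ \frac{y\,\theta - b(\theta)}{a(\phi)} + c(y,\phi) \right\}, \\
\mathbb{E}[Y] &= \mu = b'(\theta), \qquad
\operatorname{Var}(Y) = a(\phi)\, b''(\theta), \\
\text{canonical link:}\quad g(\mu) &= (b')^{-1}(\mu) = \theta . \\[6pt]
\text{Poisson:}\quad f(y;\mu) &= \exp\{\, y \log\mu - \mu - \log y! \,\}
\;\Rightarrow\; \theta = \log\mu,\ \ b(\theta) = e^{\theta},\ \ g(\mu) = \log\mu . \\
\text{Bernoulli:}\quad f(y;\mu) &= \exp\!\left\{\, y \log\frac{\mu}{1-\mu} + \log(1-\mu) \right\}
\;\Rightarrow\; \theta = \log\frac{\mu}{1-\mu},\ \ b(\theta) = \log\!\left(1 + e^{\theta}\right).
\end{align*}
```

With a(φ) = φ, the variance expression matches the form "dispersion times variance function" used above, since b″(θ) rewritten in terms of μ is V(μ): for the Poisson case b″(θ) = e^θ = μ, and for the Bernoulli case b″(θ) = μ(1 − μ).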
IRLS fitting
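GLMs are fitted by Iteratively Reweighted Least Squares: at each iteration, form a "working response" z_i = η_i + (y_i − μ_i)·g′(μ_i) and a working weight w_i = 1 / (V(μ_i)·g′(μ_i)²), then regress z on X by weighted least squares to update β. For canonical links IRLS coincides with Newton-Raphson, so it converges quadratically near the solution. Implemented in R's glm() and Python's statsmodels GLM.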
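The IRLS loop can be sketched in a few lines for the simplest non-trivial case: a Poisson response with log link and a single predictor. This is a minimal illustration using only the standard library, not how glm() or statsmodels implement it; the function name `irls_poisson` and the toy data are hypothetical. For the log link g′(μ) = 1/μ and V(μ) = μ, so the working response is η + (y − μ)/μ and the working weight is μ.

```python
import math

def irls_poisson(x, y, max_iter=25, tol=1e-10):
    """Fit log(mu_i) = b0 + b1 * x_i by IRLS. Returns (b0, b1, iterations used)."""
    # Start from the intercept-only fit, where mu_i = mean(y) for every i.
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for iteration in range(1, max_iter + 1):
        # Running sums for the 2x2 weighted normal equations (X'WX) beta = X'Wz.
        s_w = s_wx = s_wxx = s_wz = s_wxz = 0.0
        for xi, yi in zip(x, y):
            eta = b0 + b1 * xi
            mu = math.exp(eta)         # inverse of the log link
            z = eta + (yi - mu) / mu   # working response: eta + (y - mu) * g'(mu)
            w = mu                     # working weight: 1 / (V(mu) * g'(mu)^2)
            s_w += w
            s_wx += w * xi
            s_wxx += w * xi * xi
            s_wz += w * z
            s_wxz += w * xi * z
        # Solve the 2x2 system by Cramer's rule.
        det = s_w * s_wxx - s_wx ** 2
        new_b0 = (s_wxx * s_wz - s_wx * s_wxz) / det
        new_b1 = (s_w * s_wxz - s_wx * s_wz) / det
        change = max(abs(new_b0 - b0), abs(new_b1 - b1))
        b0, b1 = new_b0, new_b1
        if change < tol:
            break
    return b0, b1, iteration

# Hypothetical toy counts, invented purely for illustration.
x_data = [0, 1, 2, 3, 4, 5]
y_data = [1, 1, 3, 4, 8, 12]
b0_hat, b1_hat, n_iter = irls_poisson(x_data, y_data)
print(f"b0 = {b0_hat:.4f}, b1 = {b1_hat:.4f}, iterations = {n_iter}")
```

A useful sanity check on any such fit: at the maximum-likelihood estimate with a canonical link, the score equations Σ(y_i − μ_i) = 0 and Σ x_i(y_i − μ_i) = 0 hold, so the fitted means reproduce the observed total count.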
Visualising the four canonical links
The same linear predictor η = β₀ + β₁·x flows through four different inverse-link functions to produce μ in four very different ranges. Move the sliders — note that:
- Normal · identity permits any μ, including negative values (which are absurd for counts, proportions, or lifetimes).
- Binomial · logit squashes η ∈ ℝ into (0, 1) — large positive η ⇒ μ → 1, large negative η ⇒ μ → 0.
- Poisson · log maps η ∈ ℝ to (0, ∞) — small η changes have multiplicative effects on μ.
- Gamma · inverse gives a valid mean only for η > 0; outside that range μ = 1/η is either undefined (η = 0) or negative (η < 0), so no valid Gamma mean exists.
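For readers without the interactive panel, the four inverse links can be reproduced numerically. The sketch below is a standard-library illustration; the function names and the `mu_table` helper are hypothetical, and returning NaN for the Gamma inverse link when η ≤ 0 is a choice made here to flag invalid means.

```python
import math

def inv_identity(eta):
    return eta

def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def inv_log(eta):
    return math.exp(eta)

def inv_inverse(eta):
    # 1/eta is a valid Gamma mean only when eta > 0; flag everything else as NaN.
    return 1.0 / eta if eta > 0 else float("nan")

INVERSE_LINKS = {
    "normal": inv_identity,
    "binomial": inv_logit,
    "poisson": inv_log,
    "gamma": inv_inverse,
}

def mu_table(b0, b1, xs):
    """Return (x, eta, {family: mu}) for each x, with eta = b0 + b1 * x."""
    rows = []
    for x in xs:
        eta = b0 + b1 * x
        rows.append((x, eta, {name: f(eta) for name, f in INVERSE_LINKS.items()}))
    return rows

# Same linear predictor, four different means.
for x, eta, mus in mu_table(b0=-2.0, b1=0.5, xs=[0, 2, 4, 6, 8]):
    cells = "  ".join(f"{name}={mu:8.3f}" for name, mu in mus.items())
    print(f"x={x}  eta={eta:5.2f}  {cells}")
```

The Gamma column shows NaN wherever η ≤ 0, while the other three columns stay finite for every x.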
Try it
- Set β₁ = 0 and slide β₀ from -3 to +3. Watch how the Binomial μ moves from near 0 to near 1 along an S-curve, while the Poisson μ scales from ≈ 0.05 to ≈ 20. Same η range, totally different μ behaviour.
- Set β₀ = -2, β₁ = 0.5. For which x-values is the Gamma panel UNDEFINED? Convince yourself that the inverse link is unsuitable when η can cross zero — log-link Gamma is the workhorse alternative.
- Set β₁ = 1.5. In the Binomial panel, how steep is the sigmoid near η = 0? Now flatten it: β₁ = 0.2. The same one-unit change in x produces a much smaller change in μ when β₁ is small (logistic regression's "marginal effect" depends on where you are on the curve).
- Crank n samples up to 200 and look at the Poisson panel: do the dots span the full y-range or hug the lower portion? At η = -2, μ ≈ 0.14, so about 87% of samples are 0 (P(Y = 0) = e^(−μ) ≈ 0.87). This is why Poisson regression on rare events needs lots of data.
- Try β₀ = 3, β₁ = 0. The Normal panel happily predicts μ = 3 for all x. The Gamma panel correctly gives μ = 1/3 ≈ 0.33. The Binomial panel saturates at μ ≈ 0.95. The Poisson panel gives μ ≈ 20. Four totally different stories from the same η.
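Several of the exercises above can be checked by hand or with a few lines of code. The sketch below is one way to do so, with hypothetical helper names: it computes the logistic slope dμ/dx = β₁·μ(1 − μ), the Poisson probability of a zero count P(Y = 0) = exp(−μ), and the x-value at which η crosses zero for the Gamma inverse link.

```python
import math

def logistic_slope(b1, eta):
    """d mu / d x under the logit link: b1 * mu * (1 - mu)."""
    mu = 1.0 / (1.0 + math.exp(-eta))
    return b1 * mu * (1.0 - mu)

def poisson_prob_zero(eta):
    """P(Y = 0) for a Poisson response with log link, where mu = exp(eta)."""
    return math.exp(-math.exp(eta))

def eta_zero_crossing(b0, b1):
    """x at which eta = b0 + b1 * x equals zero (assumes b1 != 0)."""
    return -b0 / b1

steep = logistic_slope(1.5, 0.0)        # sigmoid slope at eta = 0 with b1 = 1.5
flat = logistic_slope(0.2, 0.0)         # same point with b1 = 0.2
p_zero = poisson_prob_zero(-2.0)        # share of zeros when eta = -2
x_cross = eta_zero_crossing(-2.0, 0.5)  # Gamma inverse link is invalid for x <= this

print(f"slope at eta=0: b1=1.5 -> {steep:.3f}, b1=0.2 -> {flat:.3f}")
print(f"P(Y=0) at eta=-2: {p_zero:.3f}")
print(f"eta crosses zero at x = {x_cross}")
```

At η = 0 the sigmoid slope is β₁/4, its maximum; away from η = 0 the factor μ(1 − μ) shrinks, which is the sense in which the marginal effect depends on position along the curve.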
If your response is a measured proportion (e.g., germination rate across 200 trials), and you accidentally fit OLS instead of logistic regression, what TWO concrete things go wrong? Hint: think about predicted values that fall outside (0, 1), and variance that is wrongly assumed constant when it actually depends on μ (for a proportion out of n trials it is μ(1 − μ)/n).
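Both failure modes can be seen on a small example. The sketch below fits a straight line by ordinary least squares to germination proportions out of 200 trials; the dose levels and counts are hypothetical numbers invented for illustration, and `ols_line` is a helper written for this sketch.

```python
def ols_line(x, y):
    """Closed-form simple linear regression. Returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

n_trials = 200
dose = [0, 1, 2, 3, 4, 5]                  # hypothetical treatment levels
germinated = [20, 52, 110, 158, 186, 196]  # hypothetical successes out of 200
prop = [g / n_trials for g in germinated]

intercept, slope = ols_line(dose, prop)
fitted = [intercept + slope * d for d in dose]

# Binomial variance of each observed proportion: p * (1 - p) / n.
binom_var = [p * (1 - p) / n_trials for p in prop]

for d, p, f, v in zip(dose, prop, fitted, binom_var):
    print(f"dose={d}  observed={p:.3f}  OLS fitted={f:.3f}  var={v:.6f}")
```

On these numbers the fitted line exceeds 1 at the highest dose even though every observed proportion is below 1, and the binomial variance near μ = 0.5 is more than ten times the variance near μ = 0.98, so the constant-variance assumption behind OLS standard errors does not hold.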
What you now know
GLMs extend OLS to non-Normal responses via the family + link framework. The link is not a stylistic choice — it CONSTRAINS μ to a valid range and ties the variance structure to the mean. §5.2-5.4 cover the two most important non-Normal cases (logistic and Poisson) and the GLM-specific diagnostics. §5.5 introduces mixed-effects extensions for clustered data. §5.6 closes Part 5 with honest scope: what GLM cannot do.
References
- Nelder, J.A., Wedderburn, R.W.M. (1972). "Generalized linear models." J. Roy. Stat. Soc. A 135(3), 370–384. (The foundational GLM paper.)
- McCullagh, P., Nelder, J.A. (1989). Generalized Linear Models, 2nd ed. Chapman & Hall. (The canonical book.)
- Agresti, A. (2015). Foundations of Linear and Generalized Linear Models. Wiley.
- Dobson, A.J., Barnett, A.G. (2018). An Introduction to Generalized Linear Models, 4th ed. Chapman & Hall.
- Wood, S.N. (2017). Generalized Additive Models: An Introduction with R, 2nd ed. Chapman & Hall/CRC.