Poisson and negative-binomial counts

Part 5 — Generalised linear models

Learning objectives

Fit Poisson regression with a log link to count data
Interpret coefficients as multiplicative effects on the expected count (rate)
Diagnose overdispersion (Var(Y|X) > E[Y|X])
Switch to a negative binomial model when overdispersion is detected
Apply offsets for exposure-adjusted rate models

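Count outcomes (number of events in a time interval, number of accidents per region, ER admissions per day) are not Normal: they are non-negative integers, usually right-skewed, with a variance that grows with the mean. Poisson regression models them directly.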

The Poisson model

$$Y_i | X_i \sim \mathrm{Poisson}(\mu_i), \quad \log \mu_i = \mathbf{x}_i^T \boldsymbol{\beta}.$$

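The log link is the canonical link for the Poisson family, and it guarantees a positive mean: with linear predictor $\eta = \mathbf{x}^T \boldsymbol{\beta}$, $\mu = e^{\eta}$ is always positive. The variance is constrained: $\mathrm{Var}(Y_i|X_i) = \mu_i = E[Y_i|X_i]$, so the variance equals the mean.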
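The model above is usually fitted by iteratively reweighted least squares (IRLS), which is what R's glm() does internally. The following is a minimal pure-Python sketch for an intercept plus one covariate, not a production fitter. The function name `fit_poisson_irls` and the toy data are illustrative assumptions. The data are chosen so the answer is known in advance: with a binary covariate, the fitted group means must equal the observed group means (3 and 6).

```python
import math

def fit_poisson_irls(x, y, n_iter=25, tol=1e-10):
    """Fit log(mu_i) = b0 + b1 * x_i by iteratively reweighted least squares."""
    # Start from the intercept-only fit: b0 = log(mean(y)), b1 = 0.
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(n_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Working response z_i = eta_i + (y_i - mu_i) / mu_i, with weights w_i = mu_i.
        z = [b0 + b1 * xi + (yi - mi) / mi for xi, yi, mi in zip(x, y, mu)]
        # Weighted least squares of z on (1, x): solve the 2x2 normal equations.
        sw = sum(mu)
        swx = sum(m * xi for m, xi in zip(mu, x))
        swxx = sum(m * xi * xi for m, xi in zip(mu, x))
        swz = sum(m * zi for m, zi in zip(mu, z))
        swxz = sum(m * xi * zi for m, xi, zi in zip(mu, x, z))
        det = sw * swxx - swx * swx
        new_b0 = (swxx * swz - swx * swxz) / det
        new_b1 = (sw * swxz - swx * swz) / det
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1

# Toy data (illustrative): group x = 0 has mean count 3, group x = 1 has mean count 6.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [2, 3, 4, 3, 5, 6, 7, 6]
b0, b1 = fit_poisson_irls(x, y)
print(round(math.exp(b0), 4), round(math.exp(b1), 4))  # baseline mean and rate ratio
```

The fitted baseline mean exp(b0) should land on 3 and the rate ratio exp(b1) on 2, the ratio of the two group means.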

Coefficient interpretation: multiplicative

For a one-unit increase in $x_j$, holding the other covariates fixed, $\mu$ is multiplied by $e^{\beta_j}$:

$$\mu(\mathbf{x} + \mathbf{e}_j) / \mu(\mathbf{x}) = e^{\beta_j}.$$

Example: a coefficient of 0.5 means the expected count is $e^{0.5} \approx 1.65$ times as high (about 65% higher) per unit of $x_j$. Coefficients live on the LOG scale; effects on the RATE scale are multiplicative.
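The arithmetic behind this interpretation can be checked in a few lines. This is a small sketch; the helper name `rate_ratio` is an illustrative assumption, not part of any library.

```python
import math

def rate_ratio(beta, delta=1.0):
    """Multiplicative change in the expected count for a `delta`-unit change in x_j."""
    return math.exp(beta * delta)

print(round(rate_ratio(0.5), 3))      # 1.649: one-unit increase
print(round(rate_ratio(0.5, 2), 3))   # 2.718: two-unit increase, effects multiply
print(round(rate_ratio(-0.5), 3))     # 0.607: a negative coefficient shrinks the rate

# Percentage change per unit of x_j for beta = 0.5
pct_change = 100 * (rate_ratio(0.5) - 1)
print(round(pct_change, 1))           # 64.9
```

Note that a two-unit increase multiplies the rate by $e^{2\beta_j}$, not by $2e^{\beta_j}$: effects compound multiplicatively.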

Overdispersion

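Poisson assumes Var = Mean. Real count data often show Var > Mean, sometimes by a wide margin (overdispersion), due to unobserved heterogeneity, clustering, or temporal correlation. Ignoring it makes standard errors too small and p-values too optimistic. Diagnose via: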

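Pearson dispersion: $\hat{\phi} = \frac{1}{n - p} \sum \frac{(Y_i - \hat{\mu}_i)^2}{\hat{\mu}_i}$, where $n$ is the number of observations and $p$ the number of estimated coefficients. It should be ≈ 1 under Poisson; as a rule of thumb, $\hat{\phi} > 1.5$ indicates overdispersion.
Residuals-vs-fitted plot: under Poisson, Pearson residuals have roughly constant spread; a spread that widens with the fitted values means the variance is growing faster than the mean.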
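The Pearson dispersion statistic is simple to compute once fitted means are available. The sketch below assumes an intercept-only model (so every fitted mean is the sample mean and p = 1) and uses made-up counts; `pearson_dispersion` is an illustrative helper name.

```python
def pearson_dispersion(y, mu_hat, n_params):
    """Pearson chi-square divided by residual degrees of freedom (n - p)."""
    if len(y) != len(mu_hat):
        raise ValueError("y and mu_hat must have the same length")
    dof = len(y) - n_params
    if dof <= 0:
        raise ValueError("need more observations than parameters")
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu_hat))
    return chi2 / dof

# Made-up counts with one large value; the intercept-only fit is the sample mean (3.0).
y = [0, 1, 1, 2, 3, 11]
mu_hat = [sum(y) / len(y)] * len(y)
phi = pearson_dispersion(y, mu_hat, n_params=1)
print(round(phi, 3))  # 5.467, far above 1
```

With a fitted regression, `mu_hat` would hold the model's fitted values and `n_params` the number of coefficients. In R the same quantity is commonly computed from the Pearson residuals of a fitted glm object.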

Negative binomial: the standard fix

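Negative binomial regression allows Var(Y|X) = μ + αμ². The dispersion parameter α captures overdispersion: as α → 0 the model reduces to Poisson, and larger α means more spread. Many tools report the reciprocal k = 1/α instead (called theta in R's MASS::glm.nb()), so large k means close to Poisson. For fixed α the model is a GLM and is fitted with the same IRLS machinery; MASS::glm.nb() alternates IRLS for β with maximum-likelihood updates of theta.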
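To get a feel for the variance function, the sketch below tabulates the implied variance-to-mean ratio, 1 + αμ, for a few dispersion values. It assumes the reciprocal parameterisation k = 1/α. The moment-based estimate of α is only a rough illustrative check, not the maximum-likelihood estimate that software such as MASS::glm.nb() reports. All function names are illustrative.

```python
def nb_variance(mu, alpha):
    """Negative binomial variance: mu + alpha * mu^2."""
    return mu + alpha * mu ** 2

def variance_to_mean_ratio(mu, k):
    """Var / mean when dispersion is given as k = 1 / alpha; equals 1 + mu / k."""
    return nb_variance(mu, 1.0 / k) / mu

def moment_alpha(sample_mean, sample_var):
    """Rough method-of-moments estimate of alpha, truncated at 0 (the Poisson case)."""
    return max((sample_var - sample_mean) / sample_mean ** 2, 0.0)

for k in (0.5, 1, 30):
    print(k, round(variance_to_mean_ratio(4.0, k), 3))
# 0.5 9.0
# 1 5.0
# 30 1.133

print(moment_alpha(4.0, 20.0))  # 1.0: mean 4 with variance 20 implies alpha = 1
```

The ratio depends on μ as well as α, so the same α produces more visible overdispersion at larger means.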

Offsets: rate models

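To model RATES (events per unit exposure), include log-exposure as an OFFSET: a term in the linear predictor whose coefficient is fixed at 1 rather than estimated, so that $\log \mu_i = \log(\mathrm{exposure}_i) + \mathbf{x}_i^T \boldsymbol{\beta}$, i.e. $\log(\mu_i / \mathrm{exposure}_i) = \mathbf{x}_i^T \boldsymbol{\beta}$. R: glm(Y ~ X, offset = log(exposure), family = poisson). The coefficients then describe the rate per unit exposure rather than the raw count.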
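A two-group example shows why the offset matters. The sketch below uses hypothetical event counts and exposures. It relies on the fact that, for a model containing only a group indicator plus the offset, the Poisson maximum-likelihood rate in each group is total events divided by total exposure, so no iterative fitting is needed here.

```python
import math

# Hypothetical data: events and exposure (e.g. person-years) in two groups.
events = {"A": 10, "B": 30}
exposure = {"A": 100.0, "B": 150.0}

# With only a group indicator plus the offset, the Poisson MLE of each
# group's rate is total events divided by total exposure.
rate = {g: events[g] / exposure[g] for g in events}

beta0 = math.log(rate["A"])               # log rate in the reference group A
beta1 = math.log(rate["B"] / rate["A"])   # log rate ratio, B versus A
rate_ratio = math.exp(beta1)
count_ratio = events["B"] / events["A"]   # comparison that ignores exposure

# Fitted count for group B: exp(offset + linear predictor).
fitted_b = math.exp(math.log(exposure["B"]) + beta0 + beta1)

print(round(rate_ratio, 3), round(count_ratio, 3), round(fitted_b, 3))  # 2.0 3.0 30.0
```

Group B has three times as many events but only twice the rate, because it also has more exposure. Dropping the offset would attribute the exposure difference to the covariate.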

Zero-inflated counts

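When zero counts are over-represented (e.g., medical visits per year, where many people have 0), use zero-inflated Poisson (ZIP) or zero-inflated negative binomial (ZINB). These models mix a point mass at 0 with the count distribution: with probability π₀ the outcome is a structural zero, and otherwise it is drawn from the Poisson or NB distribution.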
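The mixture can be written down directly. The sketch below evaluates the ZIP probability mass function for illustrative values μ = 4 and π₀ = 0.4 and compares the implied share of zeros with a plain Poisson at the same μ. The function names are assumptions for illustration.

```python
import math

def poisson_pmf(y, mu):
    """P(Y = y) for a Poisson count with mean mu."""
    return math.exp(-mu) * mu ** y / math.factorial(y)

def zip_pmf(y, mu, pi0):
    """Zero-inflated Poisson: structural zero with prob. pi0, else Poisson(mu)."""
    base = (1.0 - pi0) * poisson_pmf(y, mu)
    return pi0 + base if y == 0 else base

mu, pi0, n = 4.0, 0.4, 200
p0_zip = zip_pmf(0, mu, pi0)
p0_pois = poisson_pmf(0, mu)
excess = p0_zip - p0_pois                      # equals pi0 * (1 - exp(-mu))

print(round(p0_zip, 3), round(p0_pois, 3), round(excess, 3))  # 0.411 0.018 0.393
print(round(n * p0_zip, 1), round(n * p0_pois, 1))            # 82.2 3.7
```

The ZIP mean is (1 − π₀)μ = 2.4 here, so a plain Poisson fitted to such data would estimate a mean near 2.4 and still predict far too few zeros.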

Hands-on dispersion diagnostics

The widget below lets you choose a true data-generating process (DGP) for the counts (Poisson, Negative Binomial, or Zero-Inflated Poisson), generate samples, and watch the diagnostic statistics live. The dispersion gauge flashes red when φ̂ > 2 (strong overdispersion), amber for 1.2-2 (mild), green for 0.9-1.2 (≈ Poisson). The histogram overlays both the fitted Poisson and NB PMFs so you can SEE which family fits better.

Try it

Set DGP = Pure Poisson with μ = 4. Note φ̂ stays near 1, and the Poisson curve (blue) matches the histogram well. The Q-Q plot of Pearson residuals hugs the diagonal — Poisson is correctly specified.
Switch DGP to Negative Binomial with μ = 4, k = 1. Notice φ̂ jumps to 4-7 (strong overdispersion; the theoretical value is 1 + μ/k = 5). The Poisson PMF (blue) under-predicts the tails; the NB PMF (purple) tracks the histogram. Q-Q residuals fan out at the extremes.
Crank NB k to 30. The two curves nearly overlap and φ̂ falls to about 1.1 (1 + 4/30 ≈ 1.13): large k is the Poisson limit. Compare with k = 0.5 (very high dispersion, 1 + 4/0.5 = 9): the histogram becomes long-tailed and the Poisson fit completely fails.
Switch to Zero-Inflated with μ = 4, π₀ = 0.4. The "Excess zeros" readout becomes substantially positive (π₀ - π₀·e^-μ = 0.4 - 0.4·e^-4 ≈ +0.39). A Poisson PMF with μ = 4 predicts 200·e^-4 ≈ 4 zeros for n = 200, but you see ~80+. This is the zero-inflation signature; Poisson can't represent it even with the right μ.
Set DGP = Pure Poisson but make n very small (n = 30). Re-seed several times. Notice φ̂ now fluctuates wildly, sometimes flagging "mild overdispersion" even on correctly-specified Poisson data. Small samples make dispersion diagnostics noisy — always check whether you have ENOUGH data to trust the gauge.
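The small-sample noise in the last step can be reproduced offline. The sketch below simulates pure Poisson data with μ = 4, computes φ̂ for an intercept-only model, and compares its spread at n = 30 and n = 1000. The sampler uses Knuth's multiplication method, which suits small means. The seed, replication count, and function names are arbitrary illustrative choices, and this is not the widget's own code.

```python
import math
import random
import statistics

def rpois(mu, rng):
    """Draw one Poisson(mu) count via Knuth's multiplication method."""
    limit = math.exp(-mu)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def phi_hat(y):
    """Pearson dispersion for an intercept-only model (fitted mean = sample mean)."""
    m = statistics.fmean(y)  # assumes at least one nonzero count
    return sum((v - m) ** 2 for v in y) / m / (len(y) - 1)

def simulate_phi(n, mu=4.0, reps=200, seed=1):
    """phi_hat over `reps` independent Poisson samples of size n."""
    rng = random.Random(seed)
    return [phi_hat([rpois(mu, rng) for _ in range(n)]) for _ in range(reps)]

results = {n: simulate_phi(n) for n in (30, 1000)}
for n, vals in results.items():
    flagged = sum(v > 1.2 for v in vals) / len(vals)  # share above the amber threshold
    print(n, round(statistics.stdev(vals), 3), round(flagged, 2))
```

Under Poisson the standard deviation of φ̂ is roughly sqrt(2/(n − 1)), about 0.26 at n = 30 and 0.04 at n = 1000, so a noticeable share of small samples crosses the 1.2 threshold by chance alone.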

A reservoir engineer fits Poisson regression to "number of failed wells per field" and finds φ̂ = 5.3. The fix is "use negative binomial" — but WHY does NB rescue them? In one sentence, what physical or statistical structure does the NB's extra k parameter capture that pure Poisson cannot? (Hint: think of unmodelled heterogeneity across fields.)

References

Cameron, A.C., Trivedi, P.K. (2013). Regression Analysis of Count Data, 2nd ed. Cambridge.
Hilbe, J.M. (2014). Modeling Count Data. Cambridge.
Lambert, D. (1992). "Zero-inflated Poisson regression." Technometrics 34(1), 1–14.
McCullagh, P., Nelder, J.A. (1989). Generalized Linear Models, 2nd ed. Chapman & Hall.