Poisson and negative-binomial counts

Part 5 — Generalised linear models

Learning objectives

  • Fit Poisson regression with log link for count data
  • Interpret coefficients as multiplicative effects on the count rate
  • Diagnose overdispersion (Var(Y|X) > E[Y|X])
  • Switch to negative-binomial when overdispersion is detected
  • Apply offsets for exposure-adjusted rate models

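Count outcomes (number of events in a time interval, number of accidents per region, ER admissions per day) are non-negative integers, often right-skewed, with variance that grows with the mean, so a Normal linear model fits them poorly. Poisson regression is the standard starting point for modelling them.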

The Poisson model

$$Y_i \mid X_i \sim \mathrm{Poisson}(\mu_i), \quad \log \mu_i = \mathbf{x}_i^T \boldsymbol{\beta}.$$

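The log link is canonical for the Poisson family, and it guarantees that $\mu = e^{\eta}$ is always positive, where $\eta = \mathbf{x}^T \boldsymbol{\beta}$ is the linear predictor. The variance is constrained: $\mathrm{Var}(Y_i \mid X_i) = \mu_i = E[Y_i \mid X_i]$, i.e. variance equals mean.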
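A minimal sketch of how such a model can be fitted, assuming a single covariate plus an intercept and plain Newton-Raphson (which coincides with IRLS under the canonical log link). The function name `fit_poisson_1d` and the toy data are illustrative assumptions, not part of any library; in practice a GLM routine would be used instead.

```python
import math

def fit_poisson_1d(x, y, iters=25, tol=1e-10):
    """Fit log(mu_i) = b0 + b1 * x_i by Newton-Raphson. Returns (b0, b1)."""
    # Start from the intercept-only fit: b0 = log(mean(y)), b1 = 0.
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector X^T (y - mu).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Fisher information X^T diag(mu) X (symmetric 2x2).
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system H * step = g with the explicit inverse.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Toy data (hypothetical): counts that double with each unit of x.
x = [0, 1, 2, 3]
y = [1, 2, 4, 8]
b0, b1 = fit_poisson_1d(x, y)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, exp(b1) = {math.exp(b1):.4f}")
```

Because the toy counts lie exactly on the curve 2^x, the fitted means reproduce the data and exp(b1) comes out at 2 up to floating-point error. Real data will not fit exactly, and a general implementation would handle any number of covariates.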

Coefficient interpretation: multiplicative

For a unit increase in $x_j$, with the other covariates held fixed, $\mu$ is multiplied by $e^{\beta_j}$:

$$\mu(\mathbf{x} + \mathbf{e}_j) / \mu(\mathbf{x}) = e^{\beta_j}.$$

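Example: a coefficient of 0.5 means the expected count is multiplied by e^0.5 ≈ 1.65 (about 65% higher) for each one-unit increase in x_j. Coefficients live on the LOG scale; their effects on the RATE scale are multiplicative, so a two-unit increase multiplies the expected count by e^1.0 ≈ 2.72, not by 2 × 1.65.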
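The arithmetic behind that example can be checked in a few lines. This is only an illustrative sketch; `rate_ratio` is a hypothetical helper name.

```python
import math

def rate_ratio(beta_j, delta=1.0):
    """Multiplicative change in the expected count when x_j increases by delta."""
    return math.exp(beta_j * delta)

rr = rate_ratio(0.5)
print(f"rate ratio: {rr:.4f}")                         # 1.6487
print(f"percent change: {100 * (rr - 1):.1f}%")        # 64.9%
print(f"two-unit increase: {rate_ratio(0.5, 2):.4f}")  # 2.7183
```

A negative coefficient works the same way: the ratio falls below 1 and the expected count shrinks by that factor.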

Overdispersion

Poisson assumes Var = Mean. Real count data often has Var >> Mean (overdispersion) due to unobserved heterogeneity, clustering, or temporal correlation. Diagnose via:

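  • Pearson dispersion: $\hat{\phi} = \frac{1}{n - p} \sum \frac{(Y_i - \hat{\mu}_i)^2}{\hat{\mu}_i}$, where n is the number of observations and p the number of fitted coefficients. Should be ≈ 1 under Poisson; as a rule of thumb, $\hat{\phi} > 1.5$ indicates overdispersion.
  • Residuals-vs-fitted plot: plot Pearson residuals against fitted values. Under Poisson their spread is roughly constant; a spread that widens as the fitted values increase means the variance is growing faster than the mean.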
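The Pearson dispersion statistic is simple enough to compute by hand. The sketch below assumes fitted means are already available; for illustration it uses an intercept-only model, whose fitted mean is just the sample mean. The function name and the data are hypothetical.

```python
def pearson_dispersion(y, mu_hat, n_params):
    """Pearson chi-square divided by residual degrees of freedom (n - p)."""
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu_hat))
    return chi2 / (len(y) - n_params)

# Hypothetical counts with a couple of large values.
y = [0, 1, 1, 2, 12, 0, 3, 1, 0, 10]
mean_y = sum(y) / len(y)       # intercept-only Poisson fit: mu_hat = mean(y)
mu_hat = [mean_y] * len(y)
phi = pearson_dispersion(y, mu_hat, n_params=1)
print(f"phi_hat = {phi:.2f}")  # 6.30
```

Here the statistic is far above 1, so a Poisson model would understate the variability of these counts.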

Negative binomial: the standard fix

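Negative binomial regression allows Var(Y) = μ + αμ². The dispersion parameter α ≥ 0 captures overdispersion: as α → 0 the model reduces to Poisson, and larger α means more spread. Many references (and the widget below) use k = 1/α instead, so Var(Y) = μ + μ²/k and large k is the Poisson limit. For a fixed α the coefficients are fitted by the same IRLS machinery as Poisson, while α itself is estimated by maximum likelihood. R: MASS::glm.nb(), which reports θ = 1/α.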
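To make the variance function concrete, the sketch below evaluates the negative-binomial PMF in its mean/dispersion form and checks the moments by direct summation. The helper names `nb_pmf` and `moments` and the truncation point are assumptions chosen for illustration.

```python
import math

def nb_pmf(y, mu, alpha):
    """Negative-binomial probability of count y with mean mu and dispersion alpha > 0."""
    k = 1.0 / alpha  # "size" parameter, so Var(Y) = mu + mu^2 / k
    log_p = (math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1)
             + k * math.log(k / (k + mu)) + y * math.log(mu / (k + mu)))
    return math.exp(log_p)

def moments(pmf, upper=2000):
    """Mean and variance by summing over 0..upper-1 (the tail beyond is negligible here)."""
    probs = [pmf(y) for y in range(upper)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var

mu, alpha = 4.0, 1.0
mean, var = moments(lambda y: nb_pmf(y, mu, alpha))
print(f"mean = {mean:.3f}, variance = {var:.3f}, mu + alpha*mu^2 = {mu + alpha * mu**2:.3f}")
```

With μ = 4 and α = 1 the variance is 20, five times the Poisson value, which is the same order as the dispersion a Poisson fit would report on such data.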

Offsets: rate models

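To model RATES (events per unit exposure), include log-exposure as an OFFSET: a term in the linear predictor whose coefficient is fixed at 1 rather than estimated. The model is then log μ = log(exposure) + xᵀβ, equivalently log(μ / exposure) = xᵀβ, so β describes the rate per unit exposure rather than the raw count. R: glm(Y ~ X, offset = log(exposure), family = poisson).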
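A small sketch of what the offset does, under the simplifying assumption of an intercept-only rate model, where the maximum-likelihood estimate has a closed form (total events divided by total exposure). The function names and the data are hypothetical.

```python
import math

def fit_log_rate(y, exposure):
    """MLE of b0 in log(mu_i) = log(exposure_i) + b0 (offset coefficient fixed at 1)."""
    # Setting the score sum(y_i - exposure_i * exp(b0)) to zero gives this closed form.
    return math.log(sum(y) / sum(exposure))

def expected_count(b0, exposure):
    """Expected count for a unit with the given exposure."""
    return exposure * math.exp(b0)

# Hypothetical data: events observed over different amounts of exposure.
y = [2, 5, 9]
exposure = [100.0, 250.0, 400.0]
b0 = fit_log_rate(y, exposure)
print(f"rate per unit exposure: {math.exp(b0):.5f}")
print(f"expected events at exposure 1000: {expected_count(b0, 1000.0):.2f}")
```

Doubling the exposure doubles the expected count while the estimated rate stays the same, which is exactly what fixing the offset coefficient at 1 encodes.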

Zero-inflated counts

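When zero counts are over-represented relative to what the count distribution predicts (e.g., medical visits per year, where many people have 0), use zero-inflated Poisson (ZIP) or zero-inflated negative binomial (ZINB). Each is a mixture of a point mass at 0, with probability π₀, and the count distribution, so zeros can come from either component.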
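The mixture can be written down directly. The following sketch implements the ZIP PMF and compares its zero probability with plain Poisson at the same μ; `poisson_pmf`, `zip_pmf` and the parameter values are illustrative assumptions.

```python
import math

def poisson_pmf(y, mu):
    """Poisson probability of count y, computed on the log scale."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))

def zip_pmf(y, mu, pi0):
    """Zero-inflated Poisson: structural zero with probability pi0, else Poisson(mu)."""
    base = (1.0 - pi0) * poisson_pmf(y, mu)
    return pi0 + base if y == 0 else base

mu, pi0, n = 4.0, 0.4, 200
p0_pois = poisson_pmf(0, mu)
p0_zip = zip_pmf(0, mu, pi0)
print(f"P(0) Poisson = {p0_pois:.4f}, expected zeros in n={n}: {n * p0_pois:.1f}")  # 0.0183, 3.7
print(f"P(0) ZIP     = {p0_zip:.4f}, expected zeros in n={n}: {n * p0_zip:.1f}")    # 0.4110, 82.2
```

Note that the ZIP mean is (1 − π₀)μ rather than μ, so a Poisson fitted to zero-inflated data also gets the mean of the count component wrong.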

Hands-on dispersion diagnostics

The widget below lets you choose a true count DGP (Poisson, Negative Binomial, or Zero-Inflated Poisson), generate samples, and watch the diagnostic statistics live. The dispersion gauge flashes red when φ̂ > 2 (strong overdispersion), amber for 1.2-2 (mild), green for 0.9-1.2 (≈ Poisson). The histogram overlays both the fitted Poisson and NB PMFs so you can SEE which family fits better.

[Interactive figure: Poisson Overdispersion Demo]

Try it

  • Set DGP = Pure Poisson with μ = 4. Note φ̂ stays near 1, and the Poisson curve (blue) matches the histogram well. The Q-Q plot of Pearson residuals hugs the diagonal — Poisson is correctly specified.
  • Switch DGP to Negative Binomial with μ = 4, k = 1. Notice φ̂ jumps to 4-7 (strong overdispersion). The Poisson PMF (blue) under-predicts the tails; the NB PMF (purple) tracks the histogram. Q-Q residuals fan out at the extremes.
  • Crank NB k to 30. The two curves overlap and φ̂ ≈ 1 — large k is the Poisson limit. Compare with k = 0.5 (very high dispersion): the histogram becomes long-tailed and the Poisson fit completely fails.
  • Switch to Zero-Inflated with μ = 4, π₀ = 0.4. The "Excess zeros" readout becomes substantially positive (≈ +0.4 - 0.4·e^-4 ≈ +0.39). Poisson PMF predicts ≈ 4 zeros for n=200 but you see ~80+. This is the zero-inflation signature; Poisson can't represent it even with the right μ.
  • Set DGP = Pure Poisson but make n very small (n = 30). Re-seed several times. Notice φ̂ now fluctuates wildly, sometimes flagging "mild overdispersion" even on correctly-specified Poisson data. Small samples make dispersion diagnostics noisy — always check whether you have ENOUGH data to trust the gauge.
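The small-sample noise in the last step can also be reproduced offline. This is a rough simulation sketch, not the widget's code: it assumes an intercept-only Poisson model, uses Knuth's multiplication method as a simple Poisson sampler, and the replicate count and seed are arbitrary choices.

```python
import math
import random
import statistics

def poisson_draw(rng, mu):
    """Knuth's multiplication method; adequate for small mu."""
    limit, k, p = math.exp(-mu), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def dispersion_intercept_only(y):
    """Pearson dispersion when the only fitted parameter is the mean."""
    mu_hat = sum(y) / len(y)
    return sum((yi - mu_hat) ** 2 / mu_hat for yi in y) / (len(y) - 1)

def phi_spread(n, mu=4.0, reps=200, seed=1):
    """Min, max and standard deviation of phi_hat over repeated Poisson samples."""
    rng = random.Random(seed)
    phis = [dispersion_intercept_only([poisson_draw(rng, mu) for _ in range(n)])
            for _ in range(reps)]
    return min(phis), max(phis), statistics.stdev(phis)

for n in (30, 500):
    lo, hi, sd = phi_spread(n)
    print(f"n = {n:4d}: phi_hat ranges {lo:.2f} to {hi:.2f} (sd {sd:.3f})")
```

Under a correctly specified Poisson model the standard deviation of φ̂ is roughly sqrt(2 / (n − 1)), about 0.26 at n = 30, so values above 1.2 turn up regularly by chance alone.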

A reservoir engineer fits Poisson regression to "number of failed wells per field" and finds φ̂ = 5.3. The fix is "use negative binomial" — but WHY does NB rescue them? In one sentence, what physical or statistical structure does the NB's extra k parameter capture that pure Poisson cannot? (Hint: think of unmodelled heterogeneity across fields.)

