Deviance, residuals, and GLM diagnostics

Part 5 — Generalised linear models

Learning objectives

  • Compute deviance residuals and Pearson residuals for any GLM
  • Use deviance for likelihood-ratio tests of nested GLMs
  • Apply the dispersion check for Poisson
  • Use Cook's-D-like influence diagnostics for GLMs

OLS diagnostics (§4.3) use residuals and the hat matrix. GLMs need analogous tools adapted to the non-Normal family.

Residual types in GLMs

  • Pearson residual: $r^P_i = (Y_i - \hat{\mu}_i) / \sqrt{V(\hat{\mu}_i)}$, where V is the variance function. Under the correct model (with dispersion 1), it has variance approximately 1.
  • Deviance residual: $r^D_i = \mathrm{sign}(Y_i - \hat{\mu}_i) \sqrt{d_i}$, where $d_i$ is observation i's contribution to the deviance. More symmetric than the Pearson residual for skewed responses (e.g., Poisson with small μ_i).
  • Working residual: $r^W_i = (Y_i - \hat{\mu}_i)\, g'(\hat{\mu}_i)$, where g is the link function. This is the residual of the linearised working model that IRLS fits at convergence.

Deviance residuals are preferred for plots; Pearson residuals are easier to interpret as standardized errors.
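
As a minimal sketch of the two main residual types, the snippet below computes Pearson and deviance residuals for a Poisson model, where V(μ) = μ and the unit deviance is d_i = 2[y_i log(y_i/μ̂_i) - (y_i - μ̂_i)]. The counts, the fitted means, and the parameter count p = 2 are hypothetical values chosen for illustration, not output from a real fit. The last line shows one common form of the Poisson dispersion check, the Pearson statistic divided by n - p.

```python
import math

def poisson_residuals(y, mu):
    """Return (Pearson, deviance) residual lists for a Poisson GLM, V(mu) = mu."""
    pearson, deviance = [], []
    for yi, mi in zip(y, mu):
        pearson.append((yi - mi) / math.sqrt(mi))
        # Unit deviance d_i = 2 * [y log(y/mu) - (y - mu)], with y log y := 0 at y = 0.
        term = yi * math.log(yi / mi) if yi > 0 else 0.0
        d_i = 2.0 * (term - (yi - mi))
        sign = 1.0 if yi >= mi else -1.0
        deviance.append(sign * math.sqrt(max(d_i, 0.0)))
    return pearson, deviance

# Hypothetical counts and fitted means (illustration only).
y = [0, 1, 3, 0, 2, 7]
mu = [0.5, 1.2, 2.0, 0.8, 2.5, 4.0]
p = 2  # assumed number of fitted coefficients

r_p, r_d = poisson_residuals(y, mu)

total_deviance = sum(r * r for r in r_d)   # D is the sum of the d_i
pearson_chi2 = sum(r * r for r in r_p)     # Pearson X^2 statistic
dispersion = pearson_chi2 / (len(y) - p)   # values well above 1 suggest overdispersion
```

For the large count (y = 7 against μ̂ = 4) the deviance residual is smaller than the Pearson residual, which is the pull toward symmetry described above.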

Deviance for model comparison

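The deviance is D = -2(log L_model - log L_saturated): twice the log-likelihood gap between the saturated model and the fitted model. For nested models M_1 ⊆ M_2 with p_1 < p_2 parameters and deviances D_1 and D_2: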

$$D_1 - D_2 \sim \chi^2_{p_2 - p_1}$$

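under the null that M_1 is correct. The approximation is asymptotic and assumes the dispersion is known, as it is for Poisson and binomial models. This is the GLM analog of the OLS F-test. Use it when adding or dropping covariates.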
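
The comparison can be sketched end to end for a Poisson model with one covariate. The example below is an illustration under stated assumptions, not a reference implementation: the data are hypothetical, `fit_poisson_line` is a hand-rolled IRLS (Fisher scoring) fit for the two-coefficient case, and the p-value uses the fact that the χ² survival function with 1 degree of freedom equals erfc(√(x/2)).

```python
import math

def poisson_deviance(y, mu):
    """Deviance D = 2 * sum[y log(y/mu) - (y - mu)] for a Poisson model."""
    total = 0.0
    for yi, mi in zip(y, mu):
        term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += 2.0 * (term - (yi - mi))
    return total

def fit_poisson_line(x, y, n_iter=50):
    """Fit log(mu) = b0 + b1 * x by IRLS, solving the 2x2 system by hand."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(n_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Log link: IRLS weights are w_i = mu_i and the score is X^T (y - mu).
        s0 = sum(yi - mi for yi, mi in zip(y, mu))
        s1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        a = sum(mu)
        b = sum(xi * mi for xi, mi in zip(x, mu))
        c = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = a * c - b * b
        b0 += (c * s0 - b * s1) / det
        b1 += (a * s1 - b * s0) / det
    return b0, b1

# Hypothetical data (illustration only).
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [1, 0, 2, 3, 2, 5, 6, 9]

# M_1: intercept only, whose MLE is mu = mean(y). M_2: intercept + slope.
mu_null = [sum(y) / len(y)] * len(y)
b0, b1 = fit_poisson_line(x, y)
mu_full = [math.exp(b0 + b1 * xi) for xi in x]

d1 = poisson_deviance(y, mu_null)
d2 = poisson_deviance(y, mu_full)
lr_stat = d1 - d2                                # compare with chi^2 on p_2 - p_1 = 1 df
p_value = math.erfc(math.sqrt(lr_stat / 2.0))    # chi^2_1 survival function
```

The saturated log-likelihood cancels in the difference, so D_1 - D_2 is the usual likelihood-ratio statistic.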

Calibration plots

For logistic regression: bin predicted probabilities; compute observed proportion in each bin; plot bin centre vs observed. Perfect calibration = points on y=x. The Hosmer-Lemeshow test is the formal version of this.
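
The binning step might look like the sketch below. It assumes equal-width bins on [0, 1] and uses made-up predicted probabilities and outcomes. The function name `calibration_table` is hypothetical. Alongside the bin centre it also reports the mean predicted probability in each bin, which some authors plot instead of the centre.

```python
def calibration_table(p_hat, y, n_bins=5):
    """Return (bin centre, mean predicted, observed proportion, count) per non-empty bin."""
    sums = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for p, yi in zip(p_hat, y):
        k = min(int(p * n_bins), n_bins - 1)  # p = 1.0 falls in the last bin
        sums[k][0] += p
        sums[k][1] += yi
        sums[k][2] += 1
    table = []
    for k, (sum_p, sum_y, n) in enumerate(sums):
        if n > 0:
            centre = (k + 0.5) / n_bins
            table.append((centre, sum_p / n, sum_y / n, n))
    return table

# Hypothetical predicted probabilities and 0/1 outcomes (illustration only).
p_hat = [0.05, 0.15, 0.22, 0.35, 0.41, 0.58, 0.63, 0.77, 0.85, 0.95]
y = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]

table = calibration_table(p_hat, y, n_bins=5)
for centre, mean_pred, observed, n in table:
    print(f"bin centre {centre:.1f}: predicted {mean_pred:.2f}, observed {observed:.2f}, n = {n}")
```

Plotting the observed proportion against the bin centre (or the mean prediction) and comparing with the line y = x gives the calibration plot.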

Influence diagnostics for GLMs

  • Hat values: the GLM hat matrix is $H = X(X^T W X)^{-1} X^T W$, where W is the diagonal matrix of IRLS weights at convergence. Leverage $h_{ii}$ measures influence of x_i on the linear predictor.
  • Cook's distance: same idea as OLS but using deviance / Pearson residuals + GLM hat matrix.
  • DFBETAS: analogous to OLS — per-coefficient influence.
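
A minimal sketch of the first two diagnostics for a Poisson model with log link and design rows (1, x_i), where the IRLS weights are w_i = μ̂_i and h_ii = w_i x_i^T (X^T W X)^{-1} x_i. The data are hypothetical, with one deliberately extreme x. The Cook's distance shown is the common one-step Pearson-based approximation r_P² h / (p (1 - h)²) with dispersion fixed at 1, which is an assumption of this sketch; other variants exist.

```python
import math

def fit_poisson_line(x, y, n_iter=50):
    """Fit log(mu) = b0 + b1 * x by IRLS, solving the 2x2 system by hand."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(n_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        s0 = sum(yi - mi for yi, mi in zip(y, mu))
        s1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        a = sum(mu)
        b = sum(xi * mi for xi, mi in zip(x, mu))
        c = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = a * c - b * b
        b0 += (c * s0 - b * s1) / det
        b1 += (a * s1 - b * s0) / det
    return b0, b1

def leverage_and_cooks(x, y, b0, b1, p=2):
    """Return a list of (h_ii, approximate Cook's D) for each observation."""
    mu = [math.exp(b0 + b1 * xi) for xi in x]
    # Entries of X^T W X = [[a, b], [b, c]] with weights w_i = mu_i.
    a = sum(mu)
    b = sum(xi * mi for xi, mi in zip(x, mu))
    c = sum(xi * xi * mi for xi, mi in zip(x, mu))
    det = a * c - b * b
    out = []
    for xi, yi, mi in zip(x, y, mu):
        # h_ii = w_i * (1, x_i) (X^T W X)^{-1} (1, x_i)^T
        h = mi * (c - 2.0 * b * xi + a * xi * xi) / det
        r_p = (yi - mi) / math.sqrt(mi)
        cook = (r_p ** 2) * h / (p * (1.0 - h) ** 2)
        out.append((h, cook))
    return out

# Hypothetical data with one extreme x value (illustration only).
x = [0, 1, 2, 3, 4, 5, 6, 10]
y = [1, 0, 2, 3, 2, 5, 6, 9]

b0, b1 = fit_poisson_line(x, y)
diag = leverage_and_cooks(x, y, b0, b1)

threshold = 2 * 2 / len(x)  # rule-of-thumb 2p/n
flagged = [i for i, (h, _) in enumerate(diag) if h > threshold]
```

The hat values sum to p, the trace of H, which is a quick sanity check on any leverage computation.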

References

  • McCullagh, P., Nelder, J.A. (1989). Generalized Linear Models. Chapter 2 on residuals and deviance.
  • Pregibon, D. (1981). "Logistic regression diagnostics." Annals of Statistics 9(4), 705–724. (The foundational GLM-diagnostics paper.)
  • Davison, A.C., Snell, E.J. (1991). "Residuals and diagnostics." In Statistical Theory and Modelling, Chapman & Hall.
  • Williams, D.A. (1987). "Generalized linear model diagnostics using the deviance and single case deletions." Applied Statistics 36(2), 181–191.
  • Cook, R.D. (1979). "Influential observations in linear regression." JASA 74(365), 169–174. (Follow-up to Cook's 1977 Technometrics paper, which introduced Cook's D.)

Residuals and influence in action

The widget below fits a GLM (logistic or Poisson) on simulated 1-D data and shows the three residual types as histograms. Deviance residuals are typically more symmetric and Normal-shaped — they are the preferred residuals for QQ plots and residual-vs-fitted diagnostics. The bottom panel shows leverage h_ii per observation, with a reference line at the rule-of-thumb threshold 2p/n.

[Interactive figure: GLM Residual Explorer. Enable JavaScript to interact.]

Try it

  • Family = Logistic, n = 100. Compare the three histograms: Pearson and deviance look similar for logistic regression. Both are roughly N(0,1)-shaped.
  • Switch to Family = Poisson, then crank n down to 40. Notice that the Pearson residuals develop a noticeable right skew: in low-μ regions they can be large and positive, while the most negative value possible is -√μ̂ (attained at Y = 0). Deviance residuals remain nearly symmetric, which is the property that makes them preferred for plots when μ is small.
  • Inject outlier x = 10 (slider). Observation 0 now sits far in the tail of the x-distribution. Watch its leverage bar spike well above the orange threshold line. If the response value at that x is consistent with the fitted curve, Cook's D stays small; if inconsistent, it spikes too.
  • Inject outlier x = -8 (the other extreme). Same effect on leverage, opposite direction — leverage measures distance from the mean of x in WEIGHTED space, not magnitude of residual.
  • Now: outlier x = 10 AND seed = 42 (so the outlier's response is drawn at random). Sometimes the high-leverage point is also influential (high Cook's D); sometimes it is not (high leverage, low Cook's D, because the response happened to match the fitted curve). High leverage makes influence possible but does not guarantee it: the point must also be poorly fitted.

In a Poisson GLM with n=200 and one observation having h_ii = 0.18 (much greater than 2p/n = 0.02), Cook's D for that observation is 0.04 (small). What does this combination tell you about the observation — should you investigate it as a "problem point"? Justify in one sentence.
