Deviance, residuals, and GLM diagnostics

Part 5 — Generalised linear models

Learning objectives

Compute deviance residuals and Pearson residuals for any GLM
Use deviance for likelihood-ratio tests of nested GLMs
Apply the dispersion check for Poisson
Use Cook's-D-like influence diagnostics for GLMs

OLS diagnostics (§4.3) use residuals and the hat matrix. GLMs need analogous tools adapted to the non-Normal family.

Residual types in GLMs

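Pearson residual: $r^P_i = (Y_i - \hat{\mu}_i) / \sqrt{V(\hat{\mu}_i)}$, where $V(\cdot)$ is the family's variance function. Under a correctly specified model with dispersion 1, it has variance approximately 1.
Deviance residual: $r^D_i = \mathrm{sign}(Y_i - \hat{\mu}_i) \sqrt{d_i}$, where $d_i$ is observation $i$'s contribution to the deviance, so that $\sum_i d_i = D$. Its distribution is more symmetric than the Pearson residual's for skewed responses (e.g., Poisson with small $\mu_i$).
Working residual: $r^W_i = (Y_i - \hat{\mu}_i)\, g'(\hat{\mu}_i)$ for link function $g$. This is the residual of the linearised working model that IRLS solves at convergence.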

Deviance residuals are preferred for plots; Pearson residuals are easier to interpret as standardized errors.
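As a minimal sketch of the residual definitions above, the following computes all three residual types for a Poisson model with log link, using only the standard library. The function names and the fitted mean of 0.5 are illustrative assumptions, not part of the course code. The last function is one common form of the Poisson dispersion check: the Pearson chi-square divided by the residual degrees of freedom.

```python
import math

def poisson_unit_deviance(y, mu):
    """Contribution d_i of one observation to the Poisson deviance."""
    # y * log(y / mu) is taken as 0 when y == 0 (its limit as y -> 0).
    term = y * math.log(y / mu) if y > 0 else 0.0
    return 2.0 * (term - (y - mu))

def pearson_residual(y, mu):
    # Poisson variance function: V(mu) = mu.
    return (y - mu) / math.sqrt(mu)

def deviance_residual(y, mu):
    sign = 1.0 if y >= mu else -1.0
    # max(..., 0.0) guards against tiny negative rounding errors.
    return sign * math.sqrt(max(poisson_unit_deviance(y, mu), 0.0))

def working_residual(y, mu):
    # Log link: g(mu) = log(mu), so g'(mu) = 1 / mu.
    return (y - mu) / mu

def pearson_dispersion(ys, mus, n_params):
    """Pearson chi-square divided by residual degrees of freedom (n - p)."""
    chi2 = sum(pearson_residual(y, m) ** 2 for y, m in zip(ys, mus))
    return chi2 / (len(ys) - n_params)

mu = 0.5  # hypothetical small fitted mean
for y in (0, 1, 3):
    print(y, round(pearson_residual(y, mu), 3), round(deviance_residual(y, mu), 3))
```

With a fitted mean of 0.5, the count y = 3 gives a Pearson residual of about 3.54 but a deviance residual of about 2.40, while y = 0 gives about -0.71 and -1.00: the deviance scale pulls in the long right tail. A dispersion estimate well above 1 suggests overdispersion relative to the Poisson assumption.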

Deviance for model comparison

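The DEVIANCE is $D = -2(\log L_{\text{model}} - \log L_{\text{saturated}})$. For nested models $M_1 \subseteq M_2$ with $p_1 < p_2$ parameters and deviances $D_1 \ge D_2$:

$$D_1 - D_2 \sim \chi^2_{p_2 - p_1}$$

approximately in large samples, under the null that $M_1$ is correct and with the dispersion known (as in Poisson and binomial models). This is the GLM analog of the OLS F-test. Use it for adding or dropping covariates.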
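The deviance comparison can be sketched on a small Poisson example where both fits have closed-form MLEs, so no fitting routine is needed. The counts below are made up for illustration, and the helper `chi2_sf_df1` is a hypothetical name; it uses the identity $P(\chi^2_1 > x) = \mathrm{erfc}(\sqrt{x/2})$, which holds only for one degree of freedom.

```python
import math

def poisson_deviance(ys, mus):
    """Total Poisson deviance of fitted means mus for counts ys."""
    total = 0.0
    for y, mu in zip(ys, mus):
        term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2.0 * (term - (y - mu))
    return total

def chi2_sf_df1(x):
    """P(chi-square with 1 df > x), via the Normal tail."""
    return math.erfc(math.sqrt(x / 2.0))

# Hypothetical counts in two groups (illustrative data only).
group_a = [2, 3, 1, 4, 2, 3]
group_b = [5, 7, 4, 6, 8, 6]
ys = group_a + group_b

# M_1: intercept only. The Poisson MLE of the common mean is the overall mean.
mean_all = sum(ys) / len(ys)
d1 = poisson_deviance(ys, [mean_all] * len(ys))

# M_2: one mean per group. The MLE of each group mean is the group average.
mean_a = sum(group_a) / len(group_a)
mean_b = sum(group_b) / len(group_b)
d2 = poisson_deviance(ys, [mean_a] * len(group_a) + [mean_b] * len(group_b))

lr_stat = d1 - d2               # deviance drop from adding the group indicator
p_value = chi2_sf_df1(lr_stat)  # p_2 - p_1 = 2 - 1 = 1 degree of freedom
print(f"D1={d1:.3f}  D2={d2:.3f}  LR={lr_stat:.3f}  p={p_value:.4f}")
```

For more than one extra parameter the same statistic is referred to a chi-square with $p_2 - p_1$ degrees of freedom, which needs a general chi-square tail function from a statistics library.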

Calibration plots

For logistic regression: bin the predicted probabilities; compute the observed proportion of successes in each bin; plot the bin centre (or the mean predicted probability in the bin) against the observed proportion. Perfect calibration puts the points on y = x. The Hosmer-Lemeshow test is the formal version of this.
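The binning step can be sketched as follows. This assumes equal-width bins on [0, 1] and uses the mean predicted probability as the x-coordinate; the function name and the toy data are illustrative only. Equal-count (quantile) bins, as the Hosmer-Lemeshow test typically uses, are an equally reasonable choice.

```python
def calibration_table(probs, outcomes, n_bins=10):
    """Return (mean predicted, observed proportion, count) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        # Equal-width bins on [0, 1]; p == 1.0 goes into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))
    rows = []
    for members in bins:
        if not members:
            continue  # skip empty bins rather than dividing by zero
        mean_pred = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        rows.append((mean_pred, observed, len(members)))
    return rows

# Hypothetical predictions and 0/1 outcomes (illustrative only).
probs = [0.05, 0.12, 0.18, 0.33, 0.37, 0.52, 0.58, 0.71, 0.79, 0.93]
outcomes = [0, 0, 1, 0, 1, 0, 1, 1, 1, 1]
for mean_pred, observed, count in calibration_table(probs, outcomes, n_bins=5):
    print(f"{mean_pred:.2f}  {observed:.2f}  n={count}")
```

Plotting the first column against the second gives the calibration plot. The counts matter when reading it, because a bin with very few observations has a noisy observed proportion.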

Influence diagnostics for GLMs

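Hat values: the GLM hat matrix is $H = X(X^T W X)^{-1} X^T W$, where $W$ is the diagonal matrix of IRLS weights at convergence. Leverage $h_{ii}$ measures the potential influence of observation $i$ on its own fitted linear predictor; it depends on $x_i$ and, through $W$, on the fitted mean.
Cook's distance: same idea as OLS, but built from Pearson (or deviance) residuals and the GLM hat matrix. A common one-step approximation is $D_i \approx (r^P_i)^2 h_{ii} / (p\,\hat{\phi}\,(1 - h_{ii})^2)$, with $\hat{\phi} = 1$ for Poisson and binomial families.
DFBETAS: analogous to OLS; measures the influence of each observation on each coefficient.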
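To make the hat values and Cook's distance concrete, here is a minimal standard-library sketch for a Poisson GLM with log link and a single covariate. It assumes the IRLS weights are $w_i = \hat{\mu}_i$ (true for Poisson with log link), dispersion 1, and the one-step Cook's distance $D_i \approx (r^P_i)^2 h_{ii} / (p (1 - h_{ii})^2)$. The data and function names are made up for illustration.

```python
import math

def fit_poisson_irls(xs, ys, n_iter=25):
    """Fit log(mu) = b0 + b1 * x by IRLS; returns (b0, b1)."""
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0  # start at the intercept-only fit
    for _ in range(n_iter):
        # Accumulate the entries of X^T W X and X^T W z.
        s_w = s_wx = s_wxx = s_wz = s_wxz = 0.0
        for x, y in zip(xs, ys):
            eta = b0 + b1 * x
            mu = math.exp(eta)
            w = mu                   # Poisson, log link: IRLS weight is mu
            z = eta + (y - mu) / mu  # working response
            s_w += w
            s_wx += w * x
            s_wxx += w * x * x
            s_wz += w * z
            s_wxz += w * x * z
        # Solve the 2x2 weighted least-squares system explicitly.
        det = s_w * s_wxx - s_wx ** 2
        b0 = (s_wxx * s_wz - s_wx * s_wxz) / det
        b1 = (s_w * s_wxz - s_wx * s_wz) / det
    return b0, b1

def leverage_and_cooks(xs, ys, b0, b1):
    """Return a list of (h_ii, Cook's D) for the fitted Poisson model."""
    p = 2  # intercept and slope
    mus = [math.exp(b0 + b1 * x) for x in xs]
    s_w = sum(mus)
    s_wx = sum(m * x for m, x in zip(mus, xs))
    s_wxx = sum(m * x * x for m, x in zip(mus, xs))
    det = s_w * s_wxx - s_wx ** 2
    out = []
    for x, y, mu in zip(xs, ys, mus):
        # h_ii = w_i * x_i^T (X^T W X)^{-1} x_i with x_i = (1, x)
        quad = (s_wxx - 2.0 * s_wx * x + s_w * x * x) / det
        h = mu * quad
        r_p = (y - mu) / math.sqrt(mu)
        out.append((h, r_p ** 2 * h / (p * (1.0 - h) ** 2)))
    return out

# Hypothetical data: the last observation sits far out in x.
xs = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 3.0]
ys = [1, 0, 2, 1, 3, 2, 4, 12]
b0, b1 = fit_poisson_irls(xs, ys)
diag = leverage_and_cooks(xs, ys, b0, b1)
threshold = 2 * 2 / len(xs)  # rule-of-thumb 2p/n
for x, (h, d) in zip(xs, diag):
    flag = "high leverage" if h > threshold else ""
    print(f"x={x:5.1f}  h={h:.3f}  D={d:.3f}  {flag}")
```

The leverages sum to $p$, the trace of the hat matrix, which is a quick sanity check on any implementation. In practice a statistics library's influence routines would replace the hand-written algebra.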

References

McCullagh, P., Nelder, J.A. (1989). Generalized Linear Models. Chapter 2 on residuals and deviance.
Pregibon, D. (1981). "Logistic regression diagnostics." Annals of Statistics 9(4), 705–724. (The foundational GLM-diagnostics paper.)
Davison, A.C., Snell, E.J. (1991). "Residuals and diagnostics." In Statistical Theory and Modelling, Chapman & Hall.
Williams, D.A. (1987). "Generalized linear model diagnostics using the deviance and single case deletions." Applied Statistics 36(2), 181–191.
Cook, R.D. (1979). "Influential observations in linear regression." JASA 74(365), 169–174. (Follow-up to Cook's 1977 Technometrics paper, which introduced Cook's D.)

Residuals and influence in action

The widget below fits a GLM (logistic or Poisson) on simulated 1-D data and shows the three residual types as histograms. Deviance residuals are typically more symmetric and Normal-shaped — they are the preferred residuals for QQ plots and residual-vs-fitted diagnostics. The bottom panel shows leverage h_ii per observation, with a reference line at the rule-of-thumb threshold 2p/n.

Try it

Family = Logistic, n = 100. Compare the three histograms: Pearson and deviance look similar for logistic regression. Both are roughly N(0,1)-shaped.
Switch to Family = Poisson, then crank n down to 40. Notice the PEARSON residuals develop a noticeable right-skew (low μ regions produce large positive residuals while max negative is bounded by -√μ̂). Deviance residuals remain nearly symmetric — the very property that makes them preferred for plots when μ is small.
Inject outlier x = 10 (slider). Observation 0 now sits far in the tail of the x-distribution. Watch its leverage bar spike well above the orange threshold line. If the response value at that x is consistent with the fitted curve, Cook's D stays small; if inconsistent, it spikes too.
Inject outlier x = -8 (the other extreme). Same effect on leverage, opposite direction — leverage measures distance from the mean of x in WEIGHTED space, not magnitude of residual.
Now: outlier x = 10 AND seed = 42 (so the response is random). Sometimes the high-leverage point is also influential (high Cook's D); sometimes not (high leverage, low Cook's D, because the response matched the fitted curve). High leverage creates the POTENTIAL for influence but is not SUFFICIENT: you also need a bad fit AT that point.
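The interplay between leverage and residual size in the last step can be checked with a few lines of arithmetic. This sketch assumes the one-step approximation $D_i \approx r_i^2 h_{ii} / (p\,\phi\,(1 - h_{ii})^2)$ with a Pearson residual $r_i$; the leverage of 0.30 and the three residual values are hypothetical.

```python
def cooks_d_one_step(pearson_resid, h, n_params, dispersion=1.0):
    """One-step approximation to Cook's distance for a GLM observation."""
    return pearson_resid ** 2 * h / (n_params * dispersion * (1.0 - h) ** 2)

h = 0.30  # hypothetical high leverage, held fixed
for r in (0.3, 1.0, 3.0):
    # Same leverage, increasingly poor fit at the point.
    print(f"residual={r:.1f}  Cook's D={cooks_d_one_step(r, h, n_params=2):.3f}")
```

With leverage held fixed, Cook's D grows with the square of the residual: a well-fitted high-leverage point stays near zero, while a badly fitted one becomes large.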

In a Poisson GLM with n=200 and one observation having h_ii = 0.18 (much greater than 2p/n = 0.02), Cook's D for that observation is 0.04 (small). What does this combination tell you about the observation — should you investigate it as a "problem point"? Justify in one sentence.