Leave-one-out cross-validation for kriging

Part 6 — Cross-validation and QC

Learning objectives

  • Define LEAVE-ONE-OUT CROSS-VALIDATION for geostatistical models
  • Compute the standard diagnostics: ME, RMSE, MAE, conditional bias
  • Check whether the assumed VARIOGRAM gives WELL-CALIBRATED kriging variance via the variance ratio
  • Diagnose under-fitting / over-fitting / variogram misspecification from CV residuals
  • Apply LOO-CV as the primary internal validation tool for any kriging deployment

You've built a variogram (Parts 3–4) and set up a kriging system (Part 5). The next question: is the model any good? LEAVE-ONE-OUT CROSS-VALIDATION (LOO-CV) provides the standard internal validation: hold each data point out, predict it from the rest using the assumed variogram, compute residuals. The residuals reveal the model's bias, accuracy, and uncertainty calibration.

The LOO-CV procedure

  • For each data point $i$: temporarily REMOVE it from the dataset.
  • Use the remaining $N - 1$ points and the assumed variogram to predict $\hat{z}_i$ via ordinary kriging.
  • Compute the residual $r_i = z_i - \hat{z}_i$.
  • Repeat for all N points.

The collection of N residuals provides multiple validation diagnostics: Mean Error (bias), RMSE/MAE (accuracy), and the variance ratio (kriging-variance calibration).
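The procedure above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the exponential variogram, its parameter names (`sill`, `prac_range`, `nugget`), the helper names, and the synthetic data are all assumptions made for the example, and every remaining point is used as a neighbour (no search radius).

```python
import numpy as np

def exp_variogram(h, sill=1.0, prac_range=5.0, nugget=0.0):
    """Assumed model: exponential variogram, total sill `sill`, gamma(0) = 0."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + (sill - nugget) * (1.0 - np.exp(-3.0 * h / prac_range))
    return np.where(h > 0.0, gamma, 0.0)

def ordinary_kriging(xy, z, x0, variogram):
    """Ordinary kriging at x0; returns (prediction, kriging variance)."""
    n = len(z)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    # Variogram-form OK system: [[Gamma, 1], [1^T, 0]] [lam, mu] = [gamma0, 1]
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(d)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(xy - x0, axis=1))
    sol = np.linalg.solve(A, b)
    lam, mu = sol[:n], sol[n]
    return lam @ z, lam @ b[:n] + mu

def loo_cv(xy, z, variogram):
    """Hold each point out in turn; returns (residuals, predictions, kriging variances)."""
    n = len(z)
    z_hat = np.empty(n)
    var_k = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i          # temporarily remove point i
        z_hat[i], var_k[i] = ordinary_kriging(xy[keep], z[keep], xy[i], variogram)
    return z - z_hat, z_hat, var_k

# Synthetic, purely illustrative data: 30 points on a 10 x 10 domain.
gen = np.random.default_rng(0)
xy = gen.uniform(0.0, 10.0, size=(30, 2))
z = np.sin(xy[:, 0] / 2.0) + 0.5 * np.cos(xy[:, 1] / 3.0)
residuals, z_hat, var_k = loo_cv(xy, z, exp_variogram)
```

The three returned arrays are all that the diagnostics in the rest of this part need: the residuals, the predictions, and the kriging variance at each held-out location.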

Diagnostic 1: Mean Error (ME) — unbiasedness

$$\text{ME} = \frac{1}{N} \sum_{i=1}^{N} (z_i - \hat{z}_i).$$

For an UNBIASED predictor, ME should be approximately 0. Systematic positive ME indicates the kriging is UNDER-PREDICTING the data; negative ME, OVER-PREDICTING. Often this indicates a wrongly-assumed mean (for simple kriging) or a mis-specified drift.

Diagnostic 2: RMSE and MAE — accuracy

$$\text{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (z_i - \hat{z}_i)^2}, \quad \text{MAE} = \frac{1}{N} \sum_{i=1}^{N} |z_i - \hat{z}_i|.$$

Both quantify prediction accuracy. RMSE is more sensitive to large residuals; MAE more robust to outliers. Compare across competing variogram models: smaller RMSE/MAE = better fit.
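ME, RMSE and MAE follow directly from the formulas above once the LOO predictions are available. The sketch below uses a hypothetical helper name (`cv_accuracy`) and four made-up values chosen only so the arithmetic is easy to check by hand.

```python
import numpy as np

def cv_accuracy(z, z_hat):
    """ME, RMSE and MAE of the residuals r_i = z_i - z_hat_i."""
    r = np.asarray(z, dtype=float) - np.asarray(z_hat, dtype=float)
    return {
        "ME": r.mean(),                    # bias
        "RMSE": np.sqrt(np.mean(r ** 2)),  # sensitive to large residuals
        "MAE": np.mean(np.abs(r)),         # more robust to outliers
    }

# Made-up values: residuals are [1, 0, -1, 2].
z = np.array([2.0, 3.0, 5.0, 4.0])
z_hat = np.array([1.0, 3.0, 6.0, 2.0])
stats = cv_accuracy(z, z_hat)
# ME = 2/4 = 0.5, RMSE = sqrt(6/4) ≈ 1.225, MAE = 4/4 = 1.0
```

The single residual of 2 pulls RMSE above MAE, which is the sensitivity to large residuals described above.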

Diagnostic 3: Variance ratio — uncertainty calibration

The CRITICAL geostatistical diagnostic. The kriging variance $\sigma_K^2(x_0)$ at each predicted point $x_0$ quantifies the model's claimed uncertainty. If the variogram is well-specified, the SAMPLE residuals should have variance approximately equal to the average kriging variance:

$$\text{Variance ratio} = \frac{\frac{1}{N} \sum_{i=1}^{N} r_i^2}{\frac{1}{N} \sum_{i=1}^{N} \sigma_K^2(x_i)} \approx 1.$$

If ratio >> 1: residuals are LARGER than the kriging variance says, so the model UNDER-STATES uncertainty (over-confidence). The sill may be too low, the nugget too small, or the range too long (a longer assumed range implies stronger correlation and hence a smaller kriging variance).

If ratio << 1: residuals are SMALLER than predicted, so the model OVER-STATES uncertainty (under-confidence). The sill or nugget is typically too high, or the range too short.

Modern best practice: aim for ratio in [0.8, 1.2]. Larger deviations warrant variogram revision.
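As a minimal sketch, the ratio and the [0.8, 1.2] band can be coded as below. The function names and the input values are hypothetical; in practice the residuals and kriging variances come from the LOO-CV run.

```python
import numpy as np

def variance_ratio(residuals, kriging_var):
    """Mean squared LOO residual divided by mean kriging variance."""
    r = np.asarray(residuals, dtype=float)
    v = np.asarray(kriging_var, dtype=float)
    return np.mean(r ** 2) / np.mean(v)

def calibration_flag(ratio, lo=0.8, hi=1.2):
    """Classify the ratio against the [lo, hi] target band."""
    if ratio > hi:
        return "uncertainty under-stated"
    if ratio < lo:
        return "uncertainty over-stated"
    return "well calibrated"

# Made-up values: mean squared residual is 6/4 = 1.5.
residuals = np.array([1.0, 0.0, -1.0, 2.0])
kriging_var = np.array([1.0, 1.5, 1.5, 2.0])         # mean 1.5
ratio_ok = variance_ratio(residuals, kriging_var)    # 1.5 / 1.5 = 1.0
ratio_bad = variance_ratio(residuals, kriging_var / 2.0)  # 1.5 / 0.75 = 2.0
flags = (calibration_flag(ratio_ok), calibration_flag(ratio_bad))
```

Halving the claimed kriging variance while keeping the same residuals doubles the ratio, which is the over-confidence case.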

Conditional bias

Scatter predicted vs actual. Under perfect prediction, the cloud lies on the diagonal. SYSTEMATIC departures from the diagonal reveal CONDITIONAL BIAS:

  • Compressed range: predictions less variable than data — kriging smooths too aggressively. Typically variogram nugget is too high.
  • Expanded range: predictions more variable than data — kriging over-extrapolates. Rare; usually only with too-short ranges.
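  • Slope departure: the regression of actual on predicted should have slope 1. Slope > 1 means the predictions are compressed (over-smoothed); slope < 1 means they are over-dispersed, so high predictions overshoot and low predictions undershoot.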
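The scatter-plot checks above can also be put into numbers. The sketch below fits the regression of actual on predicted with `np.polyfit` and compares the spread of predictions to the spread of the data. The helper name and the values are hypothetical; the predictions are deliberately built as the data compressed halfway toward their mean.

```python
import numpy as np

def conditional_bias(z, z_hat):
    """Slope and intercept of actual-on-predicted, plus std(pred) / std(actual)."""
    z = np.asarray(z, dtype=float)
    z_hat = np.asarray(z_hat, dtype=float)
    slope, intercept = np.polyfit(z_hat, z, 1)   # fits z ≈ slope * z_hat + intercept
    spread = np.std(z_hat) / np.std(z)           # < 1: compressed, > 1: expanded
    return slope, intercept, spread

# Made-up data: predictions shrunk by one half toward the mean of 3.
z = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
z_hat = 3.0 + 0.5 * (z - 3.0)                    # [2.0, 2.5, 3.0, 3.5, 4.0]
slope, intercept, spread = conditional_bias(z, z_hat)
# slope = 2.0, intercept = -3.0, spread = 0.5
```

In this constructed case the predictions have half the spread of the data, and the fitted slope of actual on predicted is 2.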

Interpreting CV results

| Symptom | Likely cause | Fix |
|---|---|---|
| ME ≠ 0 | Mean assumption (SK) or drift (UK) wrong | Use OK or re-specify drift |
| RMSE high | Variogram poorly fit OR no spatial signal | Refit variogram, try more flexible models |
| Var ratio >> 1 | Kriging variance too low (sill or nugget too small, or range too long) | Increase sill or nugget, or shorten range |
| Var ratio << 1 | Kriging variance too high (sill or nugget too large, or range too short) | Decrease sill or nugget, or lengthen range |
| Conditional bias | Over-smoothing (predicted range compressed) | Reduce nugget |
[Interactive figure: LOO-CV for kriging]

Try it

  • Defaults: N = 30, range = 5.0, nugget = 0. The scatter shows predicted vs actual; residuals vs predicted. Variance ratio near 1 (well-calibrated).
  • Drop range to 1.0. Now the assumed range is TOO SHORT: kriging believes points are nearly uncorrelated, so the kriging variance sits near the sill. Residuals come out smaller than the predicted variance and the variance ratio drops below 1 (over-stated uncertainty).
  • Crank range to 15.0. Now the assumed range is TOO LONG: kriging is over-confident, extending each point's influence too far. Residuals come out larger than the predicted variance and the variance ratio rises above 1 (under-stated uncertainty).
  • Add nugget 0.5. Effectively smoothes the prediction; conditional bias becomes visible (predicted range compressed vs actual).
  • Crank N from 30 to 80. With more data, kriging predictions improve uniformly; variance ratio stabilises closer to 1 (better calibrated).
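To see how the assumed range alone moves the claimed uncertainty, the sketch below evaluates the ordinary kriging variance at the centre of a hypothetical 2 x 2 square of four data points for the three ranges tried above (1.0, 5.0, 15.0). The exponential variogram with unit sill and zero nugget is an assumption for illustration; the model used by the interactive figure is not stated here.

```python
import numpy as np

def exp_variogram(h, sill, prac_range):
    """Assumed model: exponential variogram, zero nugget."""
    return sill * (1.0 - np.exp(-3.0 * np.asarray(h, dtype=float) / prac_range))

def ok_variance(xy, x0, sill, prac_range):
    """Ordinary kriging variance at x0 (depends on geometry and variogram only)."""
    n = len(xy)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exp_variogram(d, sill, prac_range)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = exp_variogram(np.linalg.norm(xy - x0, axis=1), sill, prac_range)
    sol = np.linalg.solve(A, b)
    return sol[:n] @ b[:n] + sol[n]     # sum(lam * gamma0) + mu

# Hypothetical layout: data at the corners of a square, target at its centre.
xy = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
x0 = np.array([1.0, 1.0])
ranges = (1.0, 5.0, 15.0)
variances = [ok_variance(xy, x0, 1.0, a) for a in ranges]
for a, v in zip(ranges, variances):
    print(f"assumed range {a:5.1f} -> kriging variance {v:.3f}")
```

For this layout the kriging variance shrinks as the assumed range grows. Since the observed values do not change when the assumed variogram changes, a smaller claimed variance in the denominator pushes the variance ratio up.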

A LOO-CV reports ME = 0.0, RMSE = 1.2, variance ratio = 2.5. Diagnose the issue and recommend a fix.

What you now know

LOO-CV is the gold-standard internal validation: remove each data point, predict from rest, compute residuals. The MUST-REPORT diagnostics: ME (bias), RMSE/MAE (accuracy), variance ratio (kriging-variance calibration). Modern geostat best practice: any kriging deployment without LOO-CV results is incomplete. §6.2 next: accuracy plots and reliability diagnostics — the geostat-specific reliability diagram for kriging-variance calibration.

