Accuracy plots and reliability diagnostics

Part 6, Cross-validation and QC

Learning objectives

Construct an ACCURACY PLOT comparing predicted vs actual probability intervals
Diagnose KRIGING-VARIANCE OVER-STATEMENT vs UNDER-STATEMENT visually
Apply accuracy plots to ANY model with predicted uncertainty (kriging, simulation, ML)
Combine accuracy plots with LOO-CV for comprehensive uncertainty validation
Use accuracy plots to support variogram-parameter refinement

LOO-CV (§6.1) gives a single variance-ratio number. The ACCURACY PLOT (Goovaerts 1997, Pyrcz-Deutsch 2014) is the geostat-specific extension of the reliability diagram (§9.4 in SDS): for each candidate confidence level p, compute the ACTUAL fraction of true values falling within the predicted p-interval. A well-calibrated kriging variance has the actual fraction matching the predicted level at every p.

Construction

From LOO-CV (or any held-out test), assemble N (predicted, kriging-variance, actual) triples.
For each candidate probability level $p \in \{0.1, 0.2, \ldots, 0.95\}$:
Compute the z-score $z_p = \Phi^{-1}(0.5 + p/2)$, the standard-normal quantile that bounds a symmetric interval of probability p.
Count the fraction of triples where the actual value falls within $\hat{z}_K(x_i) \pm z_p \sqrt{\sigma_K^2(x_i)}$.
Plot the fraction vs the predicted level.
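
The construction steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name accuracy_curve, the synthetic triples, and the exact grid of levels (0.1 to 0.9 in steps of 0.1, plus 0.95) are assumptions made for the example, and the data are simulated to be perfectly calibrated rather than taken from a real LOO-CV run.

```python
import random
from statistics import NormalDist


def accuracy_curve(predicted, kriging_var, actual, levels):
    """Return the observed coverage fraction for each nominal level p."""
    std_normal = NormalDist()
    fractions = []
    for p in levels:
        # Half-width of the symmetric p-interval, in standard deviations.
        z_p = std_normal.inv_cdf(0.5 + p / 2)
        hits = sum(
            abs(a - m) <= z_p * v ** 0.5
            for m, v, a in zip(predicted, kriging_var, actual)
        )
        fractions.append(hits / len(actual))
    return fractions


# Synthetic, perfectly calibrated triples (illustrative data only):
# each actual value is drawn from N(predicted, kriging variance).
rng = random.Random(0)
n = 5000
predicted = [rng.uniform(0.0, 10.0) for _ in range(n)]
kriging_var = [rng.uniform(0.5, 2.0) for _ in range(n)]
actual = [m + rng.gauss(0.0, v ** 0.5) for m, v in zip(predicted, kriging_var)]

# Assumed grid of nominal levels for the example.
levels = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
fractions = accuracy_curve(predicted, kriging_var, actual, levels)
for p, f in zip(levels, fractions):
    print(f"p = {p:.2f}  observed = {f:.3f}")
```

Plotting fractions against levels gives the accuracy plot. With calibrated synthetic data the points should sit close to the diagonal, up to sampling noise.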

The diagonal y = x represents perfect calibration. Departures indicate over/under-confidence.

Interpretation

Below diagonal: kriging variance UNDER-STATES uncertainty. The model claims 70% confidence; the data only achieves 60% coverage. The modeled variogram implies too little variance, typically a sill or nugget set too low.
Above diagonal: kriging variance OVER-STATES uncertainty. The model claims 70% confidence; the data shows 80% coverage. The modeled variogram implies too much variance, typically a sill or nugget set too high.
On diagonal: well-calibrated.
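
The three cases above amount to reading the sign of the gap between observed and nominal coverage. A small sketch of that reading follows; the helper name diagnose and the tolerance tol = 0.02 are hypothetical choices for illustration, and the coverage values are made-up numbers, not results from any dataset.

```python
def diagnose(levels, fractions, tol=0.02):
    """Label each level by the sign of (observed - nominal) coverage.

    tol is an assumed tolerance band around the diagonal, chosen only
    for this example.
    """
    labels = []
    for p, f in zip(levels, fractions):
        gap = f - p
        if gap < -tol:
            labels.append("under-stated")  # below diagonal: intervals too narrow
        elif gap > tol:
            labels.append("over-stated")   # above diagonal: intervals too wide
        else:
            labels.append("calibrated")    # on the diagonal, within tolerance
    return labels


levels = [0.5, 0.7, 0.9]
below = diagnose(levels, [0.42, 0.60, 0.81])  # curve below the diagonal
above = diagnose(levels, [0.58, 0.80, 0.96])  # curve above the diagonal
on = diagnose(levels, [0.51, 0.69, 0.90])     # curve on the diagonal
print(below, above, on, sep="\n")
```

In practice the tolerance should reflect the sampling noise of the coverage fractions, which shrinks as the number of triples grows.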

Why both LOO-CV and accuracy plot are needed

The LOO-CV variance ratio gives a single aggregate calibration number, useful for comparing variograms. The accuracy plot shows calibration ACROSS confidence levels: a model is sometimes over-confident at the tails but well-calibrated at the centre, or vice versa, and a single aggregate number cannot reveal that. Together the two diagnostics give a complete picture of uncertainty calibration.
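
One way to see why a single aggregate number can hide level-dependent miscalibration is a case with heavy-tailed errors. The sketch below assumes standardized errors that follow a Laplace distribution with unit variance, so a variance ratio defined as the mean squared standardized error (an assumption about how §6.1 defines it) would equal 1, while coverage still departs from the diagonal. The function name laplace_coverage is hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist


def laplace_coverage(p):
    """Coverage of the nominal p-interval when standardized errors are
    Laplace-distributed with unit variance (scale b = 1/sqrt(2)).

    For Laplace(0, b), P(|X| <= z) = 1 - exp(-z / b).
    """
    z_p = NormalDist().inv_cdf(0.5 + p / 2)
    return 1.0 - exp(-sqrt(2.0) * z_p)


# Unit-variance errors, yet coverage differs from the nominal level.
for p in (0.5, 0.7, 0.95, 0.99):
    print(f"p = {p:.2f}  coverage = {laplace_coverage(p):.3f}")
```

Under these assumptions the curve sits above the diagonal at moderate levels (about 0.61 at p = 0.5) and below it in the tails (about 0.94 at p = 0.95), even though the aggregate variance matches.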

Comparison with binary-classification reliability diagram

In §9.4 (SDS), reliability diagrams compared predicted probabilities to empirical frequencies for binary classification. The accuracy plot is the GEOSTAT-SPECIFIC analogue for CONTINUOUS predictions with associated variance: the comparison is between predicted COVERAGE intervals and empirical COVERAGE rates. Both diagnostic tools share the same shape and interpretation.

Try it

Default: kvar scale = 1.0. The accuracy curve lies on the diagonal, well calibrated. Each predicted probability matches the actual fraction.
Drag kvar scale to 0.5. The model now reports half the kriging variance, so it under-states uncertainty. The actual fractions fall BELOW the predicted levels: for the 70% interval, actual coverage is only about 50%. Over-confidence.
Drag kvar scale to 2.0. The model over-states uncertainty. The actual fractions exceed the predicted levels: 95% intervals contain about 98% of the data. Under-confidence.
Increase N to 2000. The curve becomes smoother (less Monte Carlo noise), so any remaining departure from the diagonal is genuine calibration error rather than sampling noise.
The asymmetry between over and under-statement matters. Over-confident models lead to UNDER-PREDICTED uncertainty in downstream analyses (resource estimation, risk assessment). Modern geostat best practice: aim for well-calibrated accuracy across all confidence levels.
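
The slider experiments above can be checked against a closed-form expectation. The sketch below assumes that kvar scale multiplies the reported kriging variance by a factor c and that the errors are Gaussian, in which case the nominal p-interval has expected coverage $2\Phi(\sqrt{c}\, z_p) - 1$. It also gives the binomial standard error of an observed coverage fraction, which is the Monte Carlo noise that shrinks as N grows. The function names are hypothetical, and the widget's actual internals are not known here.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def expected_coverage(p, var_scale):
    """Expected coverage of the nominal p-interval when the reported
    variance is var_scale times the true variance (Gaussian errors assumed)."""
    z_p = _STD_NORMAL.inv_cdf(0.5 + p / 2)
    return 2.0 * _STD_NORMAL.cdf(sqrt(var_scale) * z_p) - 1.0


def coverage_std_error(p, n):
    """Binomial standard error of an observed coverage fraction near p."""
    return sqrt(p * (1.0 - p) / n)


print(f"scale 1.0, p = 0.70: {expected_coverage(0.70, 1.0):.3f}")
print(f"scale 0.5, p = 0.70: {expected_coverage(0.70, 0.5):.3f}")
print(f"scale 2.0, p = 0.95: {expected_coverage(0.95, 2.0):.3f}")
print(f"std error at p = 0.5, N = 2000: {coverage_std_error(0.5, 2000):.4f}")
```

Under these assumptions a scale of 0.5 gives roughly 0.54 coverage for the 70% interval and a scale of 2.0 gives roughly 0.99 for the 95% interval, in the same direction as, and roughly in line with, the approximate figures quoted in the steps above. If the slider instead scaled the standard deviation, the numbers would differ.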

An accuracy plot shows the curve consistently below the diagonal at high confidence levels (above 80%) but on the diagonal at low confidence levels. Diagnose this pattern and recommend a fix.

What you now know

The accuracy plot extends LOO-CV variance ratio to show kriging-variance calibration ACROSS all confidence levels. Above-diagonal = over-stated uncertainty; below-diagonal = under-stated uncertainty. Modern geostat reporting includes both LOO-CV statistics and the accuracy plot. §6.3 next: calibration of the kriging variance, what to do when the accuracy plot reveals miscalibration.

References

Goovaerts, P. (1997). Geostatistics for Natural Resources Evaluation. Oxford. (Accuracy plot introduced.)
Pyrcz, M.J., Deutsch, C.V. (2014). Geostatistical Reservoir Modeling, 2nd ed. Oxford.
Deutsch, C.V. (1997). "Direct assessment of local accuracy and precision." Geostatistics Wollongong '96. (Foundational paper.)
Manchuk, J.G., Deutsch, C.V. (2010). "Quality assurance of geostatistical models." Geostatistics Banff 2008, Springer.
Pyrcz, M.J. (2024). "Geostatistics's missing accuracy diagnostic." Substack post, gives a modern operational checklist for kriging-variance calibration in reservoir-engineering practice.