Jackknife and split-sample validation

Part 6 — Cross-validation and QC

Learning objectives

  • Define JACKKNIFE as the leave-one-out resampling estimator of bias and variance
  • Distinguish LOO-CV (each point once as test) from SPLIT-SAMPLE (random hold-out fraction)
  • Recognise the BIAS-VARIANCE trade-off between the two approaches
  • Apply k-FOLD CV as a compromise between LOO and split-sample for large datasets
  • Choose the appropriate validation scheme given dataset size and computational budget

Cross-validation has several variants for geostatistics. §6.1 introduced LOO-CV (leave-one-out). §6.4 develops the JACKKNIFE perspective and contrasts with SPLIT-SAMPLE validation. Each has trade-offs in bias, variance, and computational cost.

The jackknife (Quenouille 1956, Tukey 1958)

The jackknife is a precursor to the bootstrap. For an estimator $T(y)$:

  • For each i, compute $T_{(-i)} = T(y_{-i})$, the estimator evaluated on the data without point i.
  • The N leave-one-out replicates $T_{(-i)}$ provide bias and variance estimates.
  • Jackknife bias estimate: $\widehat{\text{Bias}} = (N - 1)(T_{(\cdot)} - T)$, where $T_{(\cdot)} = \frac{1}{N} \sum_{i=1}^{N} T_{(-i)}$ is the mean of the replicates.
  • Jackknife variance: $\hat{V}(T) = \frac{N - 1}{N} \sum_{i=1}^{N} (T_{(-i)} - T_{(\cdot)})^2$.
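The bias and variance formulas above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `jackknife` and the synthetic sample are assumptions made for the example. It applies the delete-one scheme to two estimators, the sample mean and the biased (divide-by-N) sample variance.

```python
import numpy as np

def jackknife(y, estimator):
    """Delete-one jackknife: return (bias_hat, var_hat) for estimator(y)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    t_full = estimator(y)
    # Leave-one-out replicates T_(-i): the estimator recomputed without point i.
    t_loo = np.array([estimator(np.delete(y, i)) for i in range(n)])
    t_dot = t_loo.mean()
    bias_hat = (n - 1) * (t_dot - t_full)
    var_hat = (n - 1) / n * np.sum((t_loo - t_dot) ** 2)
    return bias_hat, var_hat

# Synthetic sample (an assumption for illustration only).
rng = np.random.default_rng(0)
y = rng.normal(size=30)

bias_mean, var_mean = jackknife(y, np.mean)
bias_var, _ = jackknife(y, np.var)  # np.var uses ddof=0, i.e. the biased estimator
```

For the sample mean the jackknife bias estimate is zero and the variance estimate reduces to the familiar s²/N. For the divide-by-N variance, the bias estimate equals -s²/N, so subtracting it recovers the usual unbiased sample variance.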

For kriging, the jackknife IS leave-one-out cross-validation: the leave-one-out replicates are the kriging predictions $\hat{z}_{(-i)}$, and the residuals $z_i - \hat{z}_{(-i)}$ provide the basis for variance and bias estimates.
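As a rough sketch of how the leave-one-out residuals might be computed, the block below implements a bare-bones ordinary kriging predictor and loops over the data. Everything specific here is an assumption for illustration: the exponential covariance, its `sill` and `corr_range` values, the absence of a nugget, and the synthetic surface. In practice the covariance model would come from a fitted variogram.

```python
import numpy as np

def exp_cov(h, sill=1.0, corr_range=0.3):
    """Exponential covariance C(h) = sill * exp(-h / corr_range) (assumed model)."""
    return sill * np.exp(-h / corr_range)

def ordinary_kriging(xy_train, z_train, xy_new, cov=exp_cov):
    """Ordinary-kriging predictions at the rows of xy_new."""
    n = len(z_train)
    d_tt = np.linalg.norm(xy_train[:, None, :] - xy_train[None, :, :], axis=-1)
    d_tn = np.linalg.norm(xy_train[:, None, :] - xy_new[None, :, :], axis=-1)
    # Kriging system with a Lagrange row/column enforcing sum(weights) = 1.
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = cov(d_tt)
    A[n, n] = 0.0
    b = np.ones((n + 1, xy_new.shape[0]))
    b[:n, :] = cov(d_tn)
    weights = np.linalg.solve(A, b)[:n, :]
    return weights.T @ z_train

def loo_residuals(xy, z):
    """Residuals z_i - z_hat_(-i), each prediction made without point i."""
    n = len(z)
    res = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        res[i] = z[i] - ordinary_kriging(xy[keep], z[keep], xy[i:i + 1])[0]
    return res

# Synthetic data on the unit square (illustrative only).
rng = np.random.default_rng(1)
xy = rng.uniform(size=(40, 2))
z = np.sin(4 * xy[:, 0]) + np.cos(3 * xy[:, 1])

res = loo_residuals(xy, z)
mean_error = res.mean()    # near zero suggests little systematic bias
mse = np.mean(res ** 2)    # LOO estimate of prediction error
```

The loop solves N kriging systems of size N, which is the cost that motivates cheaper schemes for very large datasets.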

Split-sample validation

An alternative: randomly partition the data into training (say 70%) and test (30%). Fit on training, predict on test, compute MSE. Repeat with different random partitions and average.

Advantages: simpler to interpret; computationally cheap; each test point gets a fresh, unbiased prediction. Disadvantages: smaller training set (predictions slightly worse than LOO); MSE estimate is biased upward (training on fewer than N-1 points); high variance across splits unless many partitions are averaged.
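A repeated split-sample procedure could look like the sketch below. To keep the example short, an inverse-distance-weighting predictor (`idw_predict`) stands in for kriging; that substitution, the function names, and the synthetic data are all assumptions made for illustration. Any predictor with the same `(xy_train, z_train, xy_new)` signature could be passed instead.

```python
import numpy as np

def idw_predict(xy_train, z_train, xy_new, power=2.0):
    """Inverse-distance weighting, used here only as a stand-in for kriging."""
    d = np.linalg.norm(xy_new[:, None, :] - xy_train[None, :, :], axis=-1)
    w = 1.0 / np.maximum(d, 1e-12) ** power
    return (w @ z_train) / w.sum(axis=1)

def split_sample_mse(xy, z, predict, test_frac=0.3, n_splits=20, seed=0):
    """Mean and spread of the test MSE over repeated random hold-out splits."""
    rng = np.random.default_rng(seed)
    n = len(z)
    n_test = int(round(test_frac * n))
    mses = np.empty(n_splits)
    for s in range(n_splits):
        perm = rng.permutation(n)
        test, train = perm[:n_test], perm[n_test:]
        z_hat = predict(xy[train], z[train], xy[test])
        mses[s] = np.mean((z[test] - z_hat) ** 2)
    # The standard deviation across splits shows how noisy a single split is.
    return mses.mean(), mses.std(ddof=1)

# Synthetic data (illustrative only): N = 40, so 28 training and 12 test points.
rng = np.random.default_rng(2)
xy = rng.uniform(size=(40, 2))
z = np.sin(4 * xy[:, 0]) + np.cos(3 * xy[:, 1])

mse_mean, mse_sd = split_sample_mse(xy, z, idw_predict)
```

Comparing `mse_sd` with `mse_mean` gives a feel for why a single random split is rarely enough on small datasets.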

k-fold CV as a compromise

k-fold CV (typically k = 5 or 10): partition the data into k folds; for each fold, train on the other k-1 folds and test on it; average the k error estimates. Special cases: k=2 is split-sample with a 50% hold-out in which each half serves once as the test set; k=N is LOO-CV.

k=5 is a typical choice: balance between LOO (k=N, low bias high variance) and split-sample (k=2, higher bias lower variance). For VERY LARGE datasets, k=5 is computationally tractable while LOO is not.
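The fold bookkeeping for random k-fold CV can be sketched as follows. The helper names (`kfold_indices`, `kfold_mse`) and the nearest-neighbour predictor used as a cheap stand-in for kriging are assumptions for this example. Setting `k` equal to N reproduces the leave-one-out partition.

```python
import numpy as np

def kfold_indices(n, k, seed=0):
    """Yield (train_idx, test_idx) pairs for random k-fold CV."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), k)
    for j in range(k):
        train = np.concatenate([folds[m] for m in range(k) if m != j])
        yield train, folds[j]

def nearest_neighbour(xy_train, z_train, xy_new):
    """Nearest-neighbour predictor, a stand-in for kriging in this sketch."""
    d = np.linalg.norm(xy_new[:, None, :] - xy_train[None, :, :], axis=-1)
    return z_train[d.argmin(axis=1)]

def kfold_mse(xy, z, predict, k=5, seed=0):
    """Pooled squared error over all folds; each point is tested exactly once."""
    sq_err = np.empty(len(z))
    for train, test in kfold_indices(len(z), k, seed):
        sq_err[test] = (z[test] - predict(xy[train], z[train], xy[test])) ** 2
    return sq_err.mean()

# Synthetic data (illustrative only).
rng = np.random.default_rng(3)
xy = rng.uniform(size=(40, 2))
z = np.sin(4 * xy[:, 0]) + np.cos(3 * xy[:, 1])

mse_5 = kfold_mse(xy, z, nearest_neighbour, k=5)
mse_loo = kfold_mse(xy, z, nearest_neighbour, k=len(z))  # k = N is LOO-CV
```

Pooling squared errors over all points equals averaging per-fold MSEs when the folds have equal size, as they do here for N = 40 and k = 5.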

The bias-variance trade-off

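| Method | Training size | Bias | Variance | Cost |
|---|---|---|---|---|
| LOO-CV | N-1 | Minimal | High | N × kriging |
| k-fold (k=10) | ~0.9N | Small | Moderate | k × kriging |
| k-fold (k=5) | ~0.8N | Modest | Lower | k × kriging |
| Split-sample (30%) | 0.7N | Largest | Lowest | 1 kriging |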

For geostatistical datasets (typically N=50–500), LOO-CV is both feasible and optimal. For VERY LARGE datasets (>10⁵ points), k=5 or k=10 is more tractable.

Spatial cross-validation: a critical caveat

For SPATIAL data, random k-fold partitioning can put POINTS NEAR EACH OTHER in different folds. Spatially adjacent points are correlated, so each test point is predicted largely from near-duplicate training neighbours and the model effectively "cheats". This makes the CV MSE artificially OPTIMISTIC (too low) as a measure of accuracy away from the sampled locations.

Fix: SPATIAL k-fold CV. Partition the data into k spatially-CONTIGUOUS blocks. Each block forms a test fold; the training set is the rest. This forces predictions to be made at locations FAR from the training data, which gives a more honest assessment of model generalisation. Standard in modern spatial-stats literature; see Roberts et al. (2017) "Cross-validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure".
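One simple way to build spatially contiguous folds is to overlay a regular grid and use the grid cell as the fold label. The sketch below assumes 2-D coordinates and a 3 × 3 grid; the function names and the synthetic data are hypothetical, and real studies often use more careful blocking (for example, block sizes tied to the variogram range). It also compares how far test points sit from their nearest training point under spatial versus random fold assignment.

```python
import numpy as np

def spatial_block_folds(xy, n_blocks_x=3, n_blocks_y=3):
    """Label each point with the index of the rectangular grid block it falls in."""
    x, y = xy[:, 0], xy[:, 1]
    # Interior block edges; np.digitize maps each coordinate to a block index.
    x_edges = np.linspace(x.min(), x.max(), n_blocks_x + 1)[1:-1]
    y_edges = np.linspace(y.min(), y.max(), n_blocks_y + 1)[1:-1]
    ix = np.digitize(x, x_edges)
    iy = np.digitize(y, y_edges)
    return iy * n_blocks_x + ix

def mean_nn_distance(xy, labels):
    """Mean distance from each point to its nearest point in a different fold."""
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    same_fold = labels[:, None] == labels[None, :]
    d[same_fold] = np.inf  # only points in other folds count as training data
    return d.min(axis=1).mean()

# Synthetic locations (illustrative only).
rng = np.random.default_rng(4)
xy = rng.uniform(size=(200, 2))

spatial_labels = spatial_block_folds(xy)
random_labels = rng.permutation(spatial_labels)  # same fold sizes, no spatial structure

dist_spatial = mean_nn_distance(xy, spatial_labels)
dist_random = mean_nn_distance(xy, random_labels)

# Iterating over folds: each block is the test set once.
for block in np.unique(spatial_labels):
    test = spatial_labels == block
    train = ~test
    # fit on xy[train], predict at xy[test] with the model of choice
```

With blocked folds, test points are on average noticeably farther from the training data than with random folds, which is exactly the extrapolation setting that spatial CV is meant to probe.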

[Interactive figure: Jackknife Split]

Try it

  • Defaults: N=40, split=30%. LOO and split-sample give similar MSE estimates with comparable SEs.
  • Raise the split (hold-out) fraction to 50%. The split-sample MSE estimate increases (training on 20 points instead of 28). LOO is unaffected: it always uses N-1=39 training points.
  • Crank N to 100. LOO is deterministic (it involves no random partition); the split-sample MSE estimate stabilises (less variance across splits, more data in each split).
  • Re-sample multiple times. LOO MSE is more stable across resamples than split-sample (which averages 20 random splits).
  • The takeaway: for moderate N (50–500), LOO is unambiguously preferred for geostat applications. Split-sample is for cases where LOO is prohibitively expensive (e.g., N > 10⁵).

For a spatial-prediction model with N = 200 clustered samples, why might random 5-fold CV give an over-optimistic MSE estimate, and what spatial-CV alternative would you recommend?

What you now know

Jackknife = LOO-CV mathematically. Split-sample is a faster, biased alternative. k-fold CV interpolates between them. Spatial CV partitions into contiguous blocks to defeat the spatial-correlation cheat. Modern geostat best practice: LOO-CV for moderate N; spatial k-fold for spatial datasets where spatially-random CV would over-estimate model quality. §6.5 closes Part 6 with debiasing checks and conditional bias.

References

  • Quenouille, M.H. (1956). "Notes on bias in estimation." Biometrika 43, 353–360. (Original jackknife.)
  • Tukey, J.W. (1958). "Bias and confidence in not-quite large samples." Annals of Mathematical Statistics 29, 614. (Jackknife extension.)
  • Roberts, D.R., et al. (2017). "Cross-validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure." Ecography 40, 913–929.
  • Brenning, A. (2012). "Spatial cross-validation and bootstrap for the assessment of prediction rules in remote sensing." IEEE IGARSS.
  • Goovaerts, P. (1997). Geostatistics for Natural Resources Evaluation. Oxford.
