Indicator methods: strengths and known pitfalls

Part 8 — Indicator methods

Learning objectives

Summarise the strengths of indicator methods: distribution-free, cutoff-flexible, categorical-native
Identify FIVE canonical pitfalls: variogram-only validation, order-relation propagation, hidden non-stationarity, neighbourhood specification, two-point ceiling
Apply a multi-statistic VALIDATION CHECKLIST beyond the indicator variogram
Choose among SISIM, SGS, plurigaussian, object-based, and MPS based on the failure modes of each
State conditions under which indicator methods are inadequate and an MPS / generative approach is warranted (§9)

Indicator methods (§8.1–§8.4) are a powerful family for non-Gaussian and categorical spatial data. Their flexibility — one variogram per cutoff/facies, no distributional assumption, direct probability interpretation — comes at the cost of additional modelling decisions and structural ceilings. This section closes Part 8 with an honest accounting of the strengths and the known failure modes, plus a practitioner's validation checklist.

Strengths recap

Distribution-free: no Gaussian or lognormal assumption; the conditional CDF is built empirically from indicators.
Cutoff-flexible: different cutoffs can have different variograms, capturing scale-dependent connectivity.
Categorical-native: facies, ore vs waste, land-use classes — all natural for the indicator framework.
Probability outputs: indicator kriging gives $P(z > c \mid \text{data})$ directly — the natural deliverable for exceedance-probability, facies-probability, and risk-of-failure maps.
SISIM realisations: discrete worlds for downstream simulators (flow, transport, mining) — not just probabilities.
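A minimal sketch of the indicator coding behind these strengths, assuming the common convention that the indicator is 1 where the value is at or below the cutoff, so an exceedance probability is one minus the estimated CDF value. The function names and the sample grades are illustrative and not taken from any library.

```python
def indicator_transform(values, cutoffs):
    """Return one 0/1 indicator list per cutoff: 1 where value <= cutoff."""
    return {c: [1 if v <= c else 0 for v in values] for c in cutoffs}


def exceedance_from_cdf(cdf_value):
    """P(z > c) is the complement of the CDF value F(c) = P(z <= c)."""
    return 1.0 - cdf_value


# Hypothetical grades at five sample locations.
grades = [0.4, 1.2, 2.5, 0.9, 3.1]
indicators = indicator_transform(grades, cutoffs=[1.0, 2.0, 3.0])

# The mean of each indicator list is the global (unconditioned) CDF at that cutoff.
global_cdf = {c: sum(col) / len(col) for c, col in indicators.items()}
```

No distribution is assumed anywhere: the CDF is assembled cutoff by cutoff from the coded data, and each cutoff's indicator list can be given its own variogram.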

Five canonical pitfalls

1. Variogram-only validation is necessary but NOT sufficient

The most insidious failure mode: an SISIM realisation can match every indicator variogram model while still failing to reproduce the truth's geometry. Variograms are two-point statistics: the semivariogram is half the average squared difference between pairs at a given lag. A two-point statistic cannot pin down geometry. A stationary random field, a periodic field, and a linearly-trended field with the same proportions can each be fitted with a stationary variogram model that looks plausible, and SISIM will reproduce that model in every case. Yet the spatial structure of each is utterly different.
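As a small illustration of how much a two-point statistic can miss: a sequence and its mirror image have exactly the same experimental indicator variogram at every lag, because the same pairs are simply visited in the opposite order. The sketch below uses made-up 1D sequences (not one of the demo scenarios) and a hypothetical helper `indicator_variogram`.

```python
def indicator_variogram(seq, max_lag):
    """Experimental semivariogram of a 0/1 sequence for lags 1..max_lag."""
    gamma = []
    for h in range(1, max_lag + 1):
        pairs = [(seq[i], seq[i + h]) for i in range(len(seq) - h)]
        gamma.append(sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs)))
    return gamma


# Facies 1 concentrated on the left, thinning to the right.
left_rich = [1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
# Its mirror image: facies 1 concentrated on the right.
right_rich = left_rich[::-1]

gamma_left = indicator_variogram(left_rich, max_lag=5)
gamma_right = indicator_variogram(right_rich, max_lag=5)

# Fraction of cells where the two sequences disagree.
pointwise_mismatch = sum(a != b for a, b in zip(left_rich, right_rich)) / len(left_rich)
```

The two variogram lists are identical, yet the sequences disagree at every cell. A variogram check alone would pass either one as a match for the other.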

Fix: validate against MULTIPLE structural metrics — not just the variogram. Suggested metrics:

Indicator variogram (necessary baseline)
Transition probabilities: $P(\text{facies}_{x+h} = j \mid \text{facies}_x = i)$ — reveal directional structure
Connectivity functions: probability that two points are in the same connected facies body
Facies-body size distribution: histogram of contiguous facies-region areas/volumes
Visual inspection of realisations against analog images (especially for fluvial / channelised systems)
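Two of the metrics above can be sketched in a few lines. This is a simplified illustration, assuming a 1D facies sequence for transitions and a small 2D grid with 4-connectivity for body sizes. The function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(seq, lag=1):
    """Estimate P(facies at x+lag = j | facies at x = i) from a 1D facies sequence."""
    counts = defaultdict(Counter)
    for a, b in zip(seq, seq[lag:]):
        counts[a][b] += 1
    return {i: {j: n / sum(row.values()) for j, n in row.items()}
            for i, row in counts.items()}


def body_sizes(grid, facies):
    """Sizes (cell counts) of 4-connected bodies of `facies` in a 2D grid (list of rows)."""
    n_rows, n_cols = len(grid), len(grid[0])
    seen = set()
    sizes = []
    for r in range(n_rows):
        for c in range(n_cols):
            if grid[r][c] != facies or (r, c) in seen:
                continue
            # Flood fill from this unvisited cell.
            stack, size = [(r, c)], 0
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                size += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < n_rows and 0 <= nx < n_cols
                            and grid[ny][nx] == facies and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            sizes.append(size)
    return sorted(sizes, reverse=True)


well = ["sand", "sand", "shale", "shale", "shale", "sand"]
probs = transition_probabilities(well)

grid = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
sizes = body_sizes(grid, facies=1)
```

Running the same functions on the truth (or an analog) and on each realisation gives directly comparable transition tables and body-size lists. Unlike the variogram, transitions are direction-aware: reversing the sequence changes them whenever the facies ordering is asymmetric.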

2. Order-relation violations propagate through sequential conditioning

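In multi-cutoff IK / multi-facies SISIM, a single bad probability set at one cell propagates into the conditioning set for every downstream cell. The drawn (possibly inconsistent) value becomes hard data thereafter. The correction must therefore be applied at each cell before the value is drawn: clip the estimates to [0, 1], then renormalise facies probabilities to sum to 1 (or, for cutoffs, force the CDF to be non-decreasing). Post-realisation cleanup alone comes too late.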
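A minimal sketch of a per-cell correction, applied before the value is drawn. The categorical version clips and renormalises. The cutoff version averages an upward and a downward monotone pass, in the spirit of the GSLIB order-relation correction. Function names and the equal-probability fallback are assumptions of this sketch.

```python
def correct_facies_probabilities(probs):
    """Clip estimated facies probabilities to [0, 1] and rescale so they sum to 1."""
    clipped = [min(1.0, max(0.0, p)) for p in probs]
    total = sum(clipped)
    if total == 0.0:
        # Degenerate case: fall back to equal probabilities (an assumption of this sketch).
        return [1.0 / len(probs)] * len(probs)
    return [p / total for p in clipped]


def correct_cdf(cdf):
    """Make a multi-cutoff CDF estimate valid: values in [0, 1] and non-decreasing."""
    clipped = [min(1.0, max(0.0, f)) for f in cdf]
    # Upward pass: running maximum from the lowest cutoff.
    upward, running = [], 0.0
    for f in clipped:
        running = max(running, f)
        upward.append(running)
    # Downward pass: running minimum from the highest cutoff.
    downward, running = [], 1.0
    for f in reversed(clipped):
        running = min(running, f)
        downward.append(running)
    downward.reverse()
    # Both passes are non-decreasing, so their average is too.
    return [(u + d) / 2.0 for u, d in zip(upward, downward)]


facies_probs = correct_facies_probabilities([0.7, -0.1, 0.5])
cdf = correct_cdf([0.2, 0.5, 0.4, 1.1])
```

In a sequential loop these functions would sit between the kriging step and the random draw at each cell, so an inconsistent estimate never enters the conditioning set.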

3. Hidden non-stationarity is the most common cause of bad realisations

If the data have a proportion trend or anisotropy that the practitioner did not explicitly model, SISIM realisations will look wrong everywhere away from the data, even though the empirical variograms match perfectly. The fix is upstream: explicitly model the trend with a vertical proportion curve (VPC) or a proportion map, and run SISIM on residuals relative to that trend.
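A quick screening for a lateral proportion trend can be sketched as follows, assuming for simplicity a fully informed 2D grid. With sparse hard data the same idea would be applied to declustered moving-window proportions. The helper names and the toy grid are hypothetical.

```python
def column_proportions(grid, facies):
    """Proportion of `facies` in each column of a 2D grid (list of rows)."""
    n_rows = len(grid)
    return [sum(row[c] == facies for row in grid) / n_rows
            for c in range(len(grid[0]))]


def trend_slope(proportions):
    """Least-squares slope of proportion against column index."""
    n = len(proportions)
    x_mean = (n - 1) / 2.0
    p_mean = sum(proportions) / n
    num = sum((x - x_mean) * (p - p_mean) for x, p in enumerate(proportions))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den


# Toy truth: facies 1 is common on the left and rare on the right.
truth = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 1, 0],
]
proportions = column_proportions(truth, facies=1)
slope = trend_slope(proportions)
```

A slope clearly different from zero, relative to its scatter across realisations of a stationary model, is a sign that a proportion map should be modelled before simulation.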

4. Neighbourhood mis-specification is a quiet bug

Too small a search neighbourhood: realisations regress toward the prior away from data. Too large: cost explodes and numerical issues appear. Standard heuristic: for each simulated cell, use 16–32 conditioning points found within about 1.5 variogram ranges. A multiple-grid random path simulates a coarse grid first, which helps reproduce large-scale structure.
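The neighbourhood heuristic can be sketched as a brute-force search. This is an illustrative sketch only: production codes typically use a spatial index and often an octant or sector constraint, and the function name, tuple layout, and data below are assumptions.

```python
import math


def select_neighbours(target, data, variogram_range, max_points=16, radius_factor=1.5):
    """Return up to `max_points` nearest data within radius_factor * variogram_range.

    `data` is a list of (x, y, value) tuples; `target` is an (x, y) pair.
    Each result is a (distance, x, y, value) tuple, nearest first.
    """
    radius = radius_factor * variogram_range
    in_range = []
    for x, y, value in data:
        d = math.hypot(x - target[0], y - target[1])
        if d <= radius:
            in_range.append((d, x, y, value))
    in_range.sort(key=lambda item: item[0])
    return in_range[:max_points]


# Hypothetical conditioning data: (x, y, facies code).
data = [(0.0, 0.0, 1), (3.0, 4.0, 0), (1.0, 1.0, 1), (10.0, 0.0, 0)]
neighbours = select_neighbours(target=(0.0, 0.0), data=data,
                               variogram_range=4.0, max_points=2)
```

With a range of 4.0 the search radius is 6.0, so the point at distance 10 is excluded, and `max_points=2` then keeps only the two nearest. In sequential simulation the `data` list would also include previously simulated cells.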

5. The two-point ceiling: indicator methods cannot reproduce multipoint geometry

Fluvial channels with sinuosity, deltaic clinoforms, dendritic vein systems — these features have multipoint connectivity. NO pairwise variogram (continuous or indicator) can capture them. SISIM with indicator variograms will produce realisations whose variograms match perfectly but whose channels are absent or broken into disconnected fragments. For these structures, multipoint statistics (MPS — SNESIM §9.2, FILTERSIM §9.3) are the appropriate tool.

Validation checklist (post-SISIM)

Variogram check: empirical indicator variogram of each realisation matches the model.
Histogram of global proportions: across realisations, the global facies fractions are centred near the prior with realistic spread.
Transition statistics: 1-step Markov transitions $P(j \mid i)$ from realisations match data-derived transitions.
Connectivity audit: number and size distribution of connected facies bodies match analog expectation.
Visual analog comparison: realisations LOOK like the analog system — the human pattern-matching cortex catches structural mismatches the math misses.
Cross-validation accuracy: leave-one-out CV of facies probabilities matches expected accuracy plots (cf. §6.2).
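The global-proportion item of the checklist can be sketched as below, assuming realisations are stored as 2D lists of facies codes. The function names and the tiny realisations are hypothetical, and what counts as a "realistic" spread remains a judgement for the practitioner.

```python
from statistics import mean, stdev


def global_proportion(realisation, facies):
    """Fraction of cells equal to `facies` in a 2D realisation (list of rows)."""
    cells = [v for row in realisation for v in row]
    return sum(v == facies for v in cells) / len(cells)


def proportion_summary(realisations, facies, prior):
    """Mean, spread, and bias of the global proportion across realisations."""
    props = [global_proportion(r, facies) for r in realisations]
    return {
        "mean": mean(props),
        "stdev": stdev(props) if len(props) > 1 else 0.0,
        "bias": mean(props) - prior,
    }


# Three tiny made-up realisations of a binary facies model.
realisations = [
    [[1, 0], [0, 0]],
    [[1, 1], [0, 0]],
    [[1, 0], [1, 1]],
]
summary = proportion_summary(realisations, facies=1, prior=0.5)
```

A bias far from zero suggests the realisations have drifted from the prior proportions. A spread near zero suggests the ensemble understates uncertainty.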

Decision tree: when to use which method

Symptom in the data	Method
Gaussian-looking continuous variable after normal-score transform	SGS (§7.3)
Non-Gaussian continuous variable, want exceedance probabilities	Multi-cutoff IK + SISIM (§8.2–§8.3)
2–4 ordered facies, simple transitions	TGSIM or SISIM with proportion trend (§8.4)
Multiple unordered facies, complex 2D adjacency	Plurigaussian (§8.4)
Sparse data, distinctive shapes (channels, lobes)	Object-based or MPS (§8.4, §9)
Dense data, complex multipoint connectivity	MPS / SNESIM with quality training image (§9.2)
Complex shapes, no good training image available	Object-based or generative ML (§9.4)

Try it

Defaults: scenario = stationary, IK range = 4.0. The TOP canvas shows the truth (top row) and 4 SISIM realisations. The BOTTOM canvas shows the indicator-variogram comparison: green = stationary IK model, red = empirical from truth, blue dashed = average across SISIM realisations.
Switch to PERIODIC scenario. The truth becomes regularly alternating bands (3 cycles). The SISIM realisations still match the GREEN model variogram (blue dashed close to green) — but they do NOT reproduce the periodic banding. The truth empirical variogram (red) now shows oscillations the model cannot capture; the variogram-only diagnostic looks plausible while the structure is wrong.
Switch to TREND scenario. The truth has a linear proportion trend (high prob of 1 on the left, low on the right). The SISIM realisations look stationary (no trend) because the model is stationary. The TRANSITION COUNT and POINTWISE MISMATCH readouts diverge from the variogram metric — multi-metric validation catches the failure.
Increase the IK range to 10. The SISIM realisations become much smoother (long sand/shale runs). The variogram model is now wrong even in the stationary scenario; you can see the green model curve no longer matches the red truth curve.
Click Resample. The truth, hard data, and realisations all change, but the structural mismatch in PERIODIC and TREND scenarios persists — it is a model-class limitation, not a sampling fluke.
Use the readout to compare: in STATIONARY scenario the pointwise mismatch is ~0.3 (random variation); in PERIODIC and TREND it is ~0.4–0.5 (systematic bias). The variogram-mismatch metric stays similar (~0.01–0.02) across all three.

You are reviewing a junior consultant's SISIM workflow for a fluvial reservoir. They show you that the empirical indicator variograms of their 50 realisations match the model perfectly. What three additional diagnostics would you require before signing off on the model?

What you now know

Indicator methods are distribution-free, cutoff-flexible, and categorical-native: the right tool for non-Gaussian and discrete spatial data with simple to moderate connectivity. They fail predictably on multipoint geometry (channels, dendrites) and on hidden non-stationarity (unmodelled trends, anisotropy). Variogram-only validation is necessary but not sufficient; production workflows validate against variograms, transitions, connectivity, and visual analog comparison. When indicator methods are inadequate, the methods of §9 take over: multipoint simulation (SNESIM, FILTERSIM) and generative ML.

References

Strebelle, S. (2002). "Conditional simulation of complex geological structures using multiple-point statistics." Mathematical Geology 34(1), 1–21. (Multipoint motivation; the foundational SNESIM paper.)
Goovaerts, P. (1997). Geostatistics for Natural Resources Evaluation. Oxford. (Indicator-method pitfalls.)
Deutsch, C.V., Journel, A.G. (1998). GSLIB, 2nd ed. Oxford. (sisim source code with validation utilities.)
Caers, J. (2011). Modeling Uncertainty in the Earth Sciences. Wiley. (Multi-statistic validation; structural metrics.)
Pyrcz, M.J., Deutsch, C.V. (2014). Geostatistical Reservoir Modeling, 2nd ed. Oxford. (Practitioner's checklist for facies model QC.)