Conditioning on hard data

Part 7 — Sequential Gaussian simulation

Learning objectives

Recognise hard data initialisation as the mechanism that enforces data honouring in SGS
Distinguish HARD data (exact known values) from SOFT data (uncertain observations)
Apply hard-data conditioning to multiple realisations and validate honouring
Recognise the convergence of conditional simulation: ensemble mean ≈ kriging map; variance ≈ kriging variance
Document the conditioning approach in a deployment report

The defining feature of CONDITIONAL simulation (vs unconditional) is that REALISATIONS HONOUR THE HARD DATA EXACTLY. §7.4 develops the conditioning mechanism in SGS and shows how multiple realisations together quantify spatial uncertainty.

Hard data initialisation

In SGS, data conditioning works by INITIALISATION:

Before simulation begins, include all HARD DATA in the "known" set.
Hard data are NEVER simulated — they are CONSTRAINTS on unsampled values.
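Visit the unsampled nodes along a random path. At each node, krige from the k nearest knowns (nearby hard data plus previously simulated nodes), draw a value from the resulting Gaussian distribution, and add the node to the known set.
Kriging is an exact interpolator: at a hard-data location the kriging estimate equals the datum and the kriging variance is zero, so a draw there would contribute no randomness.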

Result: every realisation passes through every hard data point exactly.
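The initialisation steps above can be sketched in a few lines. This is a minimal 1D illustration, not a production SGS: it assumes simple kriging with a known mean of zero, an exponential covariance with unit sill, hard data that sit exactly on grid nodes, and values already in Gaussian (normal-score) units. The function names, data locations and data values are hypothetical.

```python
import numpy as np

def exp_cov(h, sill=1.0, rng_a=2.0):
    """Exponential covariance with practical range rng_a (assumed model)."""
    return sill * np.exp(-3.0 * np.abs(h) / rng_a)

def sgs_1d(grid_x, data_idx, data_val, rng_a=2.0, k=8, mean=0.0, seed=None):
    """One conditional realisation on a 1D grid via simple kriging."""
    rs = np.random.default_rng(seed)
    z = np.full(grid_x.size, np.nan)
    z[data_idx] = data_val                  # initialise the known set with hard data
    known = list(data_idx)
    unknown = np.setdiff1d(np.arange(grid_x.size), data_idx)
    for i in rs.permutation(unknown):       # random path over unsampled nodes only
        kn = np.array(known)
        near = kn[np.argsort(np.abs(grid_x[kn] - grid_x[i]))[:k]]  # k nearest knowns
        C = exp_cov(grid_x[near][:, None] - grid_x[near][None, :], rng_a=rng_a)
        c = exp_cov(grid_x[near] - grid_x[i], rng_a=rng_a)
        w = np.linalg.solve(C, c)           # simple-kriging weights
        mu = mean + w @ (z[near] - mean)    # kriging mean
        var = max(exp_cov(0.0, rng_a=rng_a) - w @ c, 0.0)  # kriging variance
        z[i] = rs.normal(mu, np.sqrt(var))  # draw the simulated value
        known.append(i)                     # simulated node joins the known set
    return z

# Hypothetical transect: 101 nodes, 6 hard data, 8 realisations
grid_x = np.linspace(0.0, 10.0, 101)
data_idx = np.array([5, 20, 42, 60, 77, 95])
data_val = np.array([0.3, -1.1, 0.8, 1.4, -0.2, 0.5])
real = np.stack([sgs_1d(grid_x, data_idx, data_val, seed=s) for s in range(8)])
```

Because the hard-data nodes are never on the random path, their values are copied into every realisation unchanged; only the nodes between them differ from seed to seed.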

Hard vs soft data

HARD data: exact known values at known locations. Wells, mining-drill assays, calibrated measurements. SGS conditions exactly on these.
SOFT data: noisy or uncertain observations. Seismic-derived properties, soft-classified maps. Integrated through additional machinery such as co-kriging or Bayesian updating; honoured PROBABILISTICALLY, not exactly.

Modern geostatistical workflows handle both: SGS conditions on the primary hard data, and soft data are integrated via co-kriging or Bayesian frameworks. Pyrcz and Deutsch (2014) give a comprehensive treatment.

Validating hard-data conditioning

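For each realisation, check that the simulated value at every hard-data location equals the hard-data value within numerical tolerance (for example, a relative deviation below 0.01%). Conditional-simulation implementations (for example R gstat or GSLIB sgsim) are designed to honour hard data, but do not rely on that alone: run the check explicitly, since data relocated to the nearest grid node or passed through a back-transform can pick up small deviations.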
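A check of this kind is short to write. The sketch below assumes the realisations are stored as a NumPy array of shape (B, n_nodes) and that the hard data sit on known node indices; the function name and the toy arrays are hypothetical, and the relative tolerance of 1e-4 corresponds to the 0.01% figure.

```python
import numpy as np

def check_honouring(realisations, data_idx, data_val, rtol=1e-4, atol=1e-8):
    """Return (ok, max_abs_dev) for an ensemble of shape (B, n_nodes)."""
    sim_at_data = realisations[:, data_idx]          # simulated values at data nodes
    dev = np.abs(sim_at_data - data_val)             # deviation per realisation and datum
    ok = np.allclose(sim_at_data, data_val, rtol=rtol, atol=atol)
    return bool(ok), float(dev.max())

# Toy ensemble: two realisations, hard data at nodes 1 and 4
data_idx = np.array([1, 4])
data_val = np.array([2.0, -0.5])
good = np.array([[0.1, 2.0, 0.3, 0.0, -0.5],
                 [0.9, 2.0, 1.1, 0.2, -0.5]])
bad = good.copy()
bad[1, 4] = -0.6                                     # one datum not honoured

good_result = check_honouring(good, data_idx, data_val)
bad_result = check_honouring(bad, data_idx, data_val)
print(good_result, bad_result)
```

Returning the maximum deviation alongside the pass/fail flag makes it easy to quote a number in a deployment report.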

Convergence of conditional simulation

As the number of realisations B grows, the ENSEMBLE MEAN at each location should converge to the kriging map and the ENSEMBLE VARIANCE to the kriging variance (for simple kriging, in the Gaussian units in which SGS operates). Specifically:

\text{mean}(z^{(b)}(x)) \to \hat{z}_K(x), \quad \text{Var}(z^{(b)}(x)) \to \sigma_K^2(x).

This convergence is the THEORETICAL FOUNDATION of using simulation realisations to characterise the conditional distribution. Under the multi-Gaussian model that SGS assumes, the simple-kriging map IS the conditional mean and the simple-kriging variance IS the conditional variance.
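For reference, the simple-kriging expressions behind this statement can be written out. The notation is an assumption introduced here, not taken from the text: m is the known global mean, z the vector of hard data, C the data-to-data covariance matrix, c(x) the data-to-target covariance vector, and C(0) the sill.

```latex
\hat{z}_K(x) = m + \mathbf{c}(x)^{\top}\mathbf{C}^{-1}(\mathbf{z} - m\mathbf{1}),
\qquad
\sigma_K^2(x) = C(0) - \mathbf{c}(x)^{\top}\mathbf{C}^{-1}\mathbf{c}(x).

% At a data location x = x_i, c(x_i) is the i-th column of C,
% so C^{-1} c(x_i) = e_i (the i-th unit vector) and
\hat{z}_K(x_i) = m + (z_i - m) = z_i,
\qquad
\sigma_K^2(x_i) = C(0) - C(0) = 0.
```

These are the mean and variance of a Gaussian vector conditioned on the data, and the second line is the algebraic reason every realisation passes through the hard data.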

Practical check: compute the ensemble mean from 100+ realisations and compare it to the kriging map. The two should match within Monte Carlo error, which for the mean is roughly σ_K(x)/√B at each location. Do the same for the variance.
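The practical check can be tried on a small synthetic case. The sketch below is an illustration under stated assumptions: a 1D grid, simple kriging with zero mean, an exponential covariance with unit sill and range 2.0, and an SGS that kriges from all knowns rather than the k nearest (so the sequential draws target the exact conditional distribution). All names and data values are hypothetical.

```python
import numpy as np

def cov(h, a=2.0):
    return np.exp(-3.0 * np.abs(h) / a)      # assumed exponential model, unit sill

def simple_krige(x_known, z_known, x0, a=2.0):
    """Simple-kriging mean and variance at x0 (zero global mean assumed)."""
    C = cov(x_known[:, None] - x_known[None, :], a)
    c = cov(x_known - x0, a)
    w = np.linalg.solve(C, c)
    return w @ z_known, max(1.0 - w @ c, 0.0)

def sgs(x, d_idx, d_val, rs, a=2.0):
    """One conditional realisation, kriging from every known node."""
    z = np.full(x.size, np.nan)
    z[d_idx] = d_val                          # hard-data initialisation
    known = list(d_idx)
    for i in rs.permutation(np.setdiff1d(np.arange(x.size), d_idx)):
        mu, var = simple_krige(x[known], z[known], x[i], a)
        z[i] = rs.normal(mu, np.sqrt(var))
        known.append(i)
    return z

x = np.linspace(0.0, 10.0, 41)
d_idx = np.array([2, 8, 17, 24, 31, 38])
d_val = np.array([0.3, -1.1, 0.8, 1.4, -0.2, 0.5])
B = 400
rs = np.random.default_rng(0)
ens = np.stack([sgs(x, d_idx, d_val, rs) for _ in range(B)])

# Kriging map and kriging variance from the hard data alone
sk = np.array([simple_krige(x[d_idx], d_val, x0) for x0 in x])
sk_mean, sk_var = sk[:, 0], sk[:, 1]

# Mean mismatch in units of its Monte Carlo standard error sqrt(sk_var / B)
z_score = (ens.mean(axis=0) - sk_mean) / np.sqrt(sk_var / B + 1e-12)
var_gap = np.abs(ens.var(axis=0, ddof=1) - sk_var)
print("max |z-score| of mean mismatch:", np.abs(z_score).max())
print("max |ensemble var - kriging var|:", var_gap.max())
```

Expressing the mean mismatch as a z-score gives a scale-free reading: values of a few units or less are what Monte Carlo noise alone would produce, while much larger values suggest a problem with the neighbourhood, the variogram, or the conditioning.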

Spatial uncertainty visualisation

The ensemble ±2 SD band around the ensemble mean is the natural visualisation of spatial uncertainty:

At hard data points: SD = 0 (all realisations pass through).
Near hard data: small SD (kriging variance is small there).
Far from data: large SD (high spatial uncertainty).

The band collapses at data points and widens between. Modern deployment reports include this visualisation as a complement to the kriging map.
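Computing the band from an ensemble takes only a few lines. The sketch assumes the realisations are stacked in an array of shape (B, n_nodes); the function name and the toy ensemble are hypothetical, with node 0 standing in for a hard-data location.

```python
import numpy as np

def uncertainty_band(realisations, n_sd=2.0):
    """Ensemble mean and mean +/- n_sd * SD for an array of shape (B, n_nodes)."""
    mean = realisations.mean(axis=0)
    sd = realisations.std(axis=0, ddof=1)    # sample SD across realisations
    return mean, mean - n_sd * sd, mean + n_sd * sd

# Toy ensemble: 4 realisations, 3 nodes; node 0 is a hard-data node
ens = np.array([[1.0, 0.2, -0.4],
                [1.0, 0.6,  0.3],
                [1.0, 0.1,  0.9],
                [1.0, 0.5, -0.2]])
mean, lower, upper = uncertainty_band(ens)
width = upper - lower
print(width)
```

With matplotlib, passing the node coordinates together with lower and upper to fill_between would draw the shaded band, and overlaying the individual realisations shows them pinching together at the data.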

Try it

Default range = 2.0. Eight SGS realisations are stacked on the same transect. ALL pass through the 6 hard data points (black dots).
The grey uncertainty band (ensemble ±2 SD) collapses at data points and widens between. This visualises spatial uncertainty in a way the kriging variance alone cannot.
Re-simulate. Same data + variogram, but new realisations. The data are honoured identically; the unsampled regions evolve.
Drop range to 0.5. Realisations are rougher; the uncertainty band is wider everywhere between the data points. With short correlation, each datum informs only its immediate surroundings.
Crank range to 5.0. Realisations are smoother. Uncertainty band is tighter — long correlation means data inform a wider area.
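The effect of the range on the band can be reproduced without simulating, because the kriging variance depends only on the data locations and the covariance model, not on the data values. The sketch below assumes simple kriging with an exponential covariance of unit sill; the data locations are hypothetical and are not those of the interactive example.

```python
import numpy as np

def kriging_sd(x, x_data, a):
    """Simple-kriging SD along x for an exponential covariance with practical range a."""
    cov = lambda h: np.exp(-3.0 * np.abs(h) / a)
    C = cov(x_data[:, None] - x_data[None, :])     # data-to-data covariances
    c = cov(x_data[:, None] - x[None, :])          # data-to-node, shape (n_data, n_nodes)
    w = np.linalg.solve(C, c)                      # weights for every node at once
    var = 1.0 - np.sum(w * c, axis=0)
    return np.sqrt(np.clip(var, 0.0, None))        # clip tiny negative round-off

x = np.linspace(0.0, 10.0, 201)
x_data = np.array([0.5, 2.0, 4.2, 6.0, 7.7, 9.5])  # hypothetical data locations
mean_sd = {a: kriging_sd(x, x_data, a).mean() for a in (0.5, 2.0, 5.0)}
for a, s in mean_sd.items():
    print(f"range {a}: mean kriging SD = {s:.3f}")
```

The average SD along the transect shrinks as the range grows, which is the tightening of the band seen in the exercise.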

A reservoir model has both hard well-log data AND soft seismic-derived porosity. Why can't SGS condition on both exactly, and what is the standard remedy?

What you now know

Hard-data conditioning is built into SGS initialisation. Each realisation honours data exactly. Multiple realisations together characterise spatial uncertainty. Ensemble mean → kriging map; ensemble variance → kriging variance. Soft data require additional machinery (co-kriging, Bayesian). §7.5 next: multiple realisations and uncertainty maps — how to post-process the ensemble for decision metrics.

References

Goovaerts, P. (1997). Geostatistics for Natural Resources Evaluation. Oxford.
Pyrcz, M.J., Deutsch, C.V. (2014). Geostatistical Reservoir Modeling, 2nd ed. Oxford.
Deutsch, C.V., Journel, A.G. (1998). GSLIB, 2nd ed. Oxford.
Journel, A.G. (1989). "Fundamentals of geostatistics in five lessons." AGU Short Course, Vol. 8.
Caers, J. (2011). Modeling Uncertainty in the Earth Sciences. Wiley-Blackwell. (Modern soft-data integration.)