Probabilistic facies classification: uncertainty at every voxel

Part 7, Reservoir Characterization & QI

Learning objectives

Explain why hard facies classification hides uncertainty and when that matters
Write Bayes’ rule for facies classification: P(class | elastic) ∝ P(elastic | class) · P(class)
Model per-class likelihoods as multivariate Gaussians in elastic space and compute posteriors
Use CONFIDENCE (max posterior) to identify voxels where classification is reliable
Turn probabilistic cubes into drilling outputs: P(gas sand) volumes, P(net pay) maps, risk-weighted STOIIP

In §7.4 you saw facies classification as a HARD argmax: each voxel is assigned to its single most-likely class. That's visually clean, but it encodes a LIE: a voxel whose elastic attributes put it 51% on gas sand and 49% on oil sand gets mapped exactly the same as a voxel at 99% gas sand. No asset team should make a drill decision treating those two as equal.

§7.5 upgrades facies classification to a PROBABILISTIC framework. Instead of “voxel (ix, iy, iz) is gas sand”, the output is a SET of cubes: P_shale(ix, iy, iz), P_oil(ix, iy, iz), P_gas(ix, iy, iz), P_brine(ix, iy, iz). Each voxel carries its full posterior distribution over classes. Hard classification becomes a DERIVED PRODUCT (argmax + confidence mask), and the underlying probabilities enable proper uncertainty propagation into STOIIP, P(commercial), and drilling-risk calculations.

Bayes’ rule for facies classification

The mathematical heart of probabilistic classification is Bayes’ theorem:

$$P(c_k \mid \mathbf{d}) = \dfrac{P(\mathbf{d} \mid c_k) \cdot P(c_k)}{\sum_j P(\mathbf{d} \mid c_j) \cdot P(c_j)}$$

where $c_k$ is the k-th facies class (e.g., gas sand), $\mathbf{d}$ is the observed elastic data at the voxel (typically the vector (Ip, Vp/Vs) from inversion), $P(\mathbf{d} \mid c_k)$ is the LIKELIHOOD of seeing data $\mathbf{d}$ given the voxel is in class $c_k$ (modeled from well-log training data), and $P(c_k)$ is the PRIOR probability of class $c_k$ before seeing any data.

The denominator is a normalizer that makes the posteriors sum to 1. In practice:

The LIKELIHOOD is the most important piece. It’s built from the calibration wells: for each class, gather all well-log samples identified as that class, compute their (Ip, Vp/Vs) distribution, and fit a multivariate Gaussian (or kernel density) to represent $P(\mathbf{d} \mid c_k)$.
The PRIOR encodes geological knowledge. If you know the reservoir is 80% shale and 20% sand, priors P_shale = 0.8 and P_sand = 0.2 reflect that. If you have no prior information, use UNIFORM priors (equal across classes). Many practical workflows use uniform priors and fold the geological prior into post-processing.
The POSTERIOR is what you store per voxel. From it you derive argmax (hard class), confidence (max posterior), entropy (overall ambiguity), and any derived-class-combination probabilities (e.g., P(hydrocarbon sand) = P(oil sand) + P(gas sand)).
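The three pieces above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production classifier: the likelihood values are made-up numbers for a single voxel, and the class order and the helper name `posterior` are assumptions introduced here.

```python
import numpy as np

def posterior(likelihoods, priors):
    """Bayes' rule: multiply likelihood by prior, then normalize over classes."""
    joint = np.asarray(likelihoods, dtype=float) * np.asarray(priors, dtype=float)
    return joint / joint.sum(axis=-1, keepdims=True)

# Assumed class order (hypothetical, for illustration only)
classes = ["shale", "brine sand", "oil sand", "gas sand"]

# Hypothetical likelihood values P(d | c_k) at one voxel (illustrative numbers)
lik = np.array([0.02, 0.30, 0.28, 0.01])
uniform = np.full(len(classes), 1.0 / len(classes))

post = posterior(lik, uniform)
hard_class = classes[int(post.argmax())]   # argmax: the hard classification
confidence = float(post.max())             # max posterior
p_hc_sand = float(post[2] + post[3])       # P(oil sand) + P(gas sand)

for name, p in zip(classes, post):
    print(f"P({name} | d) = {p:.3f}")
print("hard class:", hard_class, "| confidence:", round(confidence, 3))
```

With uniform priors the posterior is just the normalized likelihood, so the near-tie between brine sand and oil sand in the inputs shows up directly as a low confidence value.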

Exercise: drag the probe across the crossplot

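Open the widget in Class ellipses + probe mode. You see 4 colored ellipses: shale (grey), brine sand (blue), oil sand (orange), gas sand (pink). Each ellipse’s center is the CLASS MEAN on the (Ip, Vp/Vs) plane; the outer ring is the 2σ contour and the filled inner region is the 1σ contour. Note that the familiar ~68% / ~95% coverage figures apply to each axis separately; for a 2D Gaussian the 1σ and 2σ ellipses enclose about 39% and 86% of calibration samples.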
Drag the BLACK PROBE to the center of the shale ellipse (Ip ≈ 9100, Vp/Vs ≈ 2.00), or click the Shale preset. The bar chart shows P(shale) ≈ 100%, all others near zero. HIGH CONFIDENCE, no ambiguity.
Click the Gas sand preset. Probe moves to (6300, 1.60). Bar chart shows P(gas sand) at near-100%, because gas sand is ISOLATED from the other clusters in elastic space. This is why gas sand is the most confidently-classified facies.
Click the Ambiguous preset. Probe moves to (7100, 1.76), a location where the brine sand and oil sand ellipses OVERLAP. Now the bar chart shows two comparable probabilities (roughly 50% brine, 50% oil). This voxel cannot be confidently classified: it could reasonably be either facies. Hard argmax would pick one, hiding the truth.
Switch to Confidence heatmap mode. The whole crossplot is painted by the MAX POSTERIOR probability. GREEN cells (near class centers) are confident: argmax is reliable there. RED cells (at class-cluster boundaries) are UNCONFIDENT: argmax is a coin flip.
Drag the probe around the confidence map and watch the probabilities shift. Move along the line between brine sand and oil sand and see how probability transfers smoothly from one to the other as Vp/Vs rises. At the midpoint you get 50-50.
Interpretive rule of thumb: voxels with max posterior < 0.6 should be flagged LOW CONFIDENCE and either excluded from the hard-classification map or colored separately. §7.6 shows how this flag propagates into volumetric estimates.
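The probe behaviour in this exercise can be imitated offline. The sketch below walks a probe from the oil sand center to the brine sand center and applies the 0.6 rule of thumb. The shale and gas sand centers come from the presets above; the brine sand and oil sand centers and the shared per-axis standard deviations are assumptions chosen so that the Ambiguous preset (7100, 1.76) sits at the midpoint. The widget's actual parameters may differ.

```python
import numpy as np

# Class order and most numbers below are illustrative assumptions, not calibrated values.
classes = ["shale", "brine sand", "oil sand", "gas sand"]
means = np.array([[9100.0, 2.00],    # shale (preset from the exercise)
                  [7400.0, 1.82],    # brine sand (assumed)
                  [6800.0, 1.70],    # oil sand (assumed)
                  [6300.0, 1.60]])   # gas sand (preset from the exercise)
sigmas = np.array([200.0, 0.04])     # assumed per-axis std, shared by all classes

def posteriors(d):
    """Uniform-prior posteriors for points d of shape (n, 2), diagonal covariance."""
    z = (d[:, None, :] - means[None, :, :]) / sigmas   # standardized offsets, (n, K, 2)
    log_lik = -0.5 * (z ** 2).sum(axis=-1)             # constants shared by all classes cancel
    log_lik -= log_lik.max(axis=-1, keepdims=True)     # numerical stability
    lik = np.exp(log_lik)
    return lik / lik.sum(axis=-1, keepdims=True)

t = np.linspace(0.0, 1.0, 11)[:, None]
path = means[2] + t * (means[1] - means[2])            # oil sand -> brine sand
p = posteriors(path)
confidence = p.max(axis=-1)
low_conf = confidence < 0.6                            # rule-of-thumb flag

for ti, pi, flag in zip(t[:, 0], p, low_conf):
    print(f"t={ti:.1f}  P(brine)={pi[1]:.3f}  P(oil)={pi[2]:.3f}  low_conf={flag}")
```

Because the two assumed classes share the same covariance and prior, the hand-over is symmetric and the midpoint is an exact tie; with unequal covariances or priors the 50-50 point would shift.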

The Gaussian likelihood model

The most common choice for $P(\mathbf{d} \mid c_k)$ is a multivariate Gaussian:

$$P(\mathbf{d} \mid c_k) = \dfrac{1}{(2\pi)^{n/2} |\Sigma_k|^{1/2}} \exp\!\left(-\dfrac{1}{2}(\mathbf{d} - \mu_k)^T \Sigma_k^{-1} (\mathbf{d} - \mu_k)\right)$$

where $\mu_k$ is the class-k mean (a vector in elastic space) and $\Sigma_k$ is the class-k covariance matrix. For 2D (Ip, Vp/Vs) classification the Gaussian is fit from the calibration wells:

Mean: the average (Ip, Vp/Vs) of all well-log samples labeled as that class.
Covariance: the 2×2 matrix of variances and cross-covariance. It captures BOTH the spread of each attribute AND any tilt/correlation (e.g., within a sand cluster, lower-porosity samples tend to have both higher Ip AND higher Vp/Vs).

In 3D classification (Ip, Is, density) the same math applies with 3×3 matrices. In higher dimensions (adding attributes beyond the elastic) the Gaussian family is still usable but can over-fit with limited training data; that’s when you move to kernel density estimation or neural-net classifiers.

The widget uses a SIMPLIFIED case with diagonal covariances (no correlation between Ip and Vp/Vs axes within a class) for clarity. Real workflows use full covariance matrices fit from log data.
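As a rough sketch of the fitting step, the code below estimates a mean and a full 2×2 covariance per class from labeled samples and evaluates the Gaussian formula in log space. No real well data is used: the "training" samples are synthetic draws around assumed class centers, and the function names (`fit_gaussian`, `log_gaussian`, `class_posteriors`) are hypothetical helpers introduced for this example.

```python
import numpy as np

def fit_gaussian(samples):
    """Mean vector and full covariance of an (n_samples, n_features) array."""
    samples = np.asarray(samples, dtype=float)
    return samples.mean(axis=0), np.cov(samples, rowvar=False)

def log_gaussian(d, mean, cov):
    """Log of the multivariate Gaussian density at points d of shape (m, n)."""
    d = np.atleast_2d(np.asarray(d, dtype=float))
    diff = d - mean
    # Squared Mahalanobis distance, using a linear solve instead of an explicit inverse
    maha = np.einsum("ij,ij->i", diff, np.linalg.solve(cov, diff.T).T)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (mean.size * np.log(2.0 * np.pi) + logdet + maha)

def class_posteriors(d, params, priors):
    """Posterior P(c_k | d) for each point, shape (m, K)."""
    log_joint = np.stack(
        [log_gaussian(d, m, c) + np.log(p) for (m, c), p in zip(params, priors)],
        axis=-1,
    )
    log_joint -= log_joint.max(axis=-1, keepdims=True)  # avoid underflow in exp
    joint = np.exp(log_joint)
    return joint / joint.sum(axis=-1, keepdims=True)

# Synthetic stand-in for labeled well-log samples (all values are assumptions).
# Assumed class order: shale, brine sand, oil sand, gas sand.
rng = np.random.default_rng(0)
true_means = [(9100.0, 2.00), (7400.0, 1.82), (6800.0, 1.70), (6300.0, 1.60)]
true_std = np.array([200.0, 0.04])
training = [rng.normal(loc=m, scale=true_std, size=(2000, 2)) for m in true_means]

params = [fit_gaussian(s) for s in training]
priors = np.full(4, 0.25)                                # uniform priors
probes = np.array([[9100.0, 2.00], [6300.0, 1.60], [7100.0, 1.76]])
post = class_posteriors(probes, params, priors)
print(np.round(post, 3))
```

Working in log space matters in practice: far from a class center the density underflows to zero in floating point, and subtracting the per-point maximum before exponentiating keeps the normalization well defined.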

Derived products from the probabilistic cubes

Once you have per-voxel probabilities, many useful quantities are easy to compute:

Hard classification: argmax_k P(c_k | d). The “headline” map, but pair it with confidence.
Confidence: max_k P(c_k | d). A 0-1 map that flags ambiguous voxels.
Entropy: H = − Σ_k P(c_k | d) log P(c_k | d). A single number per voxel quantifying TOTAL ambiguity (uniform posterior = maximum entropy = log K; single-class posterior = 0). Useful for global quality metrics.
Combined-class probability: P(hydrocarbon) = P(oil sand) + P(gas sand). Most asset-team questions are about COMBINATIONS of classes, not single classes.
Expected volumetrics: φ_exp = Σ_k P(c_k | d) · φ_k, where φ_k is the representative porosity of class k. This is the expectation of porosity given the per-class uncertainty, and it is the honest input to STOIIP.
Risk-weighted net pay: each voxel counts as net pay WITH WEIGHT P(hydrocarbon) · P(φ > cut) · P(Vsh < cut). The product form treats the three conditions as independent. The result is a FRACTIONAL net pay map that preserves uncertainty all the way to the volume calculation.
Stochastic realizations: sample from the per-voxel posterior to produce multiple plausible facies cubes. Run reservoir simulation on each to generate a range of production forecasts.
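Most of the products listed above are one-liners once the posterior cube is an array with the class axis last. The sketch below uses a small random cube as a stand-in for real posteriors; the class order, the per-class porosities `phi_k`, and the cube size are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in posterior cube of shape (nx, ny, nz, K); real cubes come from the classifier.
# Assumed class order: shale, brine sand, oil sand, gas sand.
post = rng.dirichlet(alpha=[1.0, 1.0, 1.0, 1.0], size=(4, 4, 3))
K = post.shape[-1]

hard = post.argmax(axis=-1)                    # hard classification
confidence = post.max(axis=-1)                 # max posterior, in [1/K, 1]
low_conf = confidence < 0.6                    # flag from the rule of thumb
entropy = -(post * np.log(np.clip(post, 1e-12, 1.0))).sum(axis=-1)
p_hydrocarbon = post[..., 2] + post[..., 3]    # P(oil sand) + P(gas sand)

# Hypothetical representative porosity per class (illustrative values only)
phi_k = np.array([0.05, 0.22, 0.24, 0.25])
phi_exp = post @ phi_k                         # expected porosity per voxel

# One stochastic realization: inverse-CDF sampling, independently per voxel
cdf = post.cumsum(axis=-1)
u = rng.random(post.shape[:-1] + (1,))
realization = np.minimum((u > cdf).sum(axis=-1), K - 1)

print("fraction flagged low confidence:", low_conf.mean())
print("mean expected porosity:", phi_exp.mean())
```

The realization step samples each voxel on its own, which matches the per-voxel posterior described above; repeating it with fresh random numbers gives additional facies cubes.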

Where priors come from (and how to use them properly)

The class PRIOR P(c_k) encodes geological knowledge BEFORE seeing the seismic. Common sources:

Lithological fractions from nearby wells: if existing wells show 70% shale / 30% sand in the target interval, set the prior P_shale = 0.7 and split the remaining 0.3 among the sand subclasses.
Depositional-system knowledge (Part 4): in a channel axis, sand proportion is higher than on the levee. Lateral priors can vary with mapped facies geometry.
Sequence-stratigraphic position (§4.2): an LST fan and an HST shelfal setting carry different prior expectations for sand vs shale.
Uniform priors: when no prior information is trustworthy, use equal weights. This makes the posterior equal to the normalized likelihood, i.e., classification is purely data-driven.

WARNING: priors can DOMINATE the posterior when the likelihood is weak (ambiguous data). A very strong prior (e.g., P_shale = 0.95) will keep the posterior on shale even when the elastic data somewhat favors sand. That can be appropriate (strong prior knowledge) or a problem (suppressing real anomalies). Rule: always report posteriors with BOTH uniform priors and geological priors, and reconcile any large differences explicitly.
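A two-class toy calculation makes the warning concrete. The likelihood values below are invented so that the data favor sand 2:1; the 0.95 shale prior is the one quoted above. The 0.3 reporting threshold and the helper name `posterior` are assumptions for this sketch.

```python
import numpy as np

def posterior(likelihoods, priors):
    """Normalize likelihood * prior over the class axis."""
    joint = np.asarray(likelihoods, dtype=float) * np.asarray(priors, dtype=float)
    return joint / joint.sum(axis=-1, keepdims=True)

# Class order: [shale, sand]. Hypothetical likelihoods: data favor sand 2:1.
lik = np.array([0.10, 0.20])

post_uniform = posterior(lik, [0.5, 0.5])      # purely data-driven
post_geol = posterior(lik, [0.95, 0.05])       # strong shale prior

# Flag voxels where the two answers disagree strongly (threshold is an assumption)
shift = np.abs(post_uniform - post_geol).max()
needs_review = shift > 0.3

print("uniform prior   :", np.round(post_uniform, 3))
print("geological prior:", np.round(post_geol, 3))
print("max shift:", round(float(shift), 3), "| needs review:", bool(needs_review))
```

Here the uniform-prior posterior puts sand at about 0.67 while the geological prior pulls it down to about 0.10, which is exactly the kind of difference the rule asks you to report and reconcile.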

When probabilistic classification fails

Under-sampled training classes. A Gaussian likelihood with only 20 well-log samples has an unreliable covariance matrix. Posteriors near the edges of elastic space are unreliable. Mitigate: use Bayesian shrinkage (pool toward a common covariance), or switch to non-parametric (kernel density) likelihood.
Non-Gaussian clusters. Some facies (e.g., heterolithic thin beds) have bimodal or elongated distributions that a single Gaussian misrepresents. Fit a mixture model (Gaussian mixture per class) or kernel density.
Elastic-space overlap by design. If two classes genuinely occupy the same region of elastic space (e.g., two lithologically different sands with identical Ip and Vp/Vs), no amount of probability math will separate them. Accept that posteriors will be ambiguous; if available, add extra attributes (depth, mineralogy-sensitive Lamé parameters) to augment the feature space.
Miscalibrated priors. Using prior fractions from one basin in another basin produces systematic bias. Always calibrate priors per-basin and per-target interval.
Over-trusting the confidence map. High posterior confidence means the model THINKS it’s confident. If the model is wrong (bad likelihood, wrong priors), the confidence is meaningless. Calibrate confidence against blind wells: does P(gas sand) = 0.9 voxels actually contain gas sand 90% of the time? This is how you validate that confidence is meaningful.
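The blind-well check in the last item amounts to a reliability table: bin the predicted probabilities and compare each bin's mean prediction with the observed class frequency. The sketch below uses synthetic, perfectly calibrated data in place of real blind-well labels, and the function name `reliability_table` is a hypothetical helper.

```python
import numpy as np

def reliability_table(p_pred, is_class, n_bins=5):
    """Per bin: (lower edge, upper edge, mean predicted P, observed frequency, count)."""
    p_pred = np.asarray(p_pred, dtype=float)
    is_class = np.asarray(is_class, dtype=bool)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.digitize(p_pred, edges[1:-1])     # bin index in 0 .. n_bins - 1
    rows = []
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            rows.append((edges[b], edges[b + 1],
                         p_pred[mask].mean(), is_class[mask].mean(), int(mask.sum())))
    return rows

# Synthetic stand-in for blind-well samples: predicted P(gas sand) and the true label.
# Labels are drawn so the predictions are calibrated by construction.
rng = np.random.default_rng(1)
p_gas = rng.random(20000)
is_gas = rng.random(20000) < p_gas

rows = reliability_table(p_gas, is_gas)
for lo, hi, mean_p, freq, n in rows:
    print(f"[{lo:.1f}, {hi:.1f})  predicted={mean_p:.3f}  observed={freq:.3f}  n={n}")
```

On real data, bins where the observed frequency sits well below the mean prediction indicate over-confidence; with few blind-well samples per bin the observed frequencies are noisy, so wide bins are usually the safer choice.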

Connecting to stochastic inversion (§7.3)

A sophisticated QI workflow integrates probabilistic facies with the stochastic pre-stack inversion from §7.3:

Pre-stack inversion produces NOT a single (Ip, Vp/Vs) per voxel, but a full POSTERIOR over elastic attributes, typically sampled as N=50-500 realizations.
For each realization, run the facies classifier to get per-class probabilities at every voxel.
Average the probabilities across realizations to get the FINAL facies posterior that accounts for BOTH inversion uncertainty AND classification uncertainty.

The math: $P(c_k \mid \text{seismic}) = \int P(c_k \mid \mathbf{d}) \cdot P(\mathbf{d} \mid \text{seismic}) \, d\mathbf{d}$. The first term is the facies classifier; the second is the inversion posterior. Monte-Carlo over realizations approximates the integral.
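A minimal sketch of the Monte-Carlo approximation at a single voxel follows. Everything numeric is assumed: the class centers and shared diagonal standard deviations stand in for a calibrated classifier, and the inversion posterior is imitated by Gaussian draws around a point near the assumed oil sand center. A real workflow would read the N realizations from the stochastic inversion instead.

```python
import numpy as np

# Assumed classifier parameters (order: shale, brine sand, oil sand, gas sand)
means = np.array([[9100.0, 2.00], [7400.0, 1.82], [6800.0, 1.70], [6300.0, 1.60]])
sigmas = np.array([200.0, 0.04])               # shared diagonal std (assumption)
priors = np.full(4, 0.25)

def classify(d):
    """P(c_k | d) for d of shape (..., 2), diagonal-covariance Gaussian likelihoods."""
    z = (d[..., None, :] - means) / sigmas     # (..., K, 2)
    # Terms common to every class (2*pi factor, shared sigmas) cancel on normalization
    log_joint = -0.5 * (z ** 2).sum(axis=-1) + np.log(priors)
    log_joint -= log_joint.max(axis=-1, keepdims=True)
    joint = np.exp(log_joint)
    return joint / joint.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(7)
n_real = 200
# Hypothetical inversion realizations of (Ip, Vp/Vs) at one voxel
realizations = rng.normal(loc=[6800.0, 1.70], scale=[250.0, 0.05], size=(n_real, 2))

per_realization = classify(realizations)       # (n_real, K)
p_final = per_realization.mean(axis=0)         # Monte-Carlo estimate of the integral
p_single = classify(np.array([6800.0, 1.70]))  # classifier on the mean inversion only

print("single inversion result:", np.round(p_single, 3))
print("averaged over realizations:", np.round(p_final, 3))
```

Classifying only the mean inversion result gives a near-certain oil sand call, while averaging over realizations spreads probability onto the neighboring classes. That difference is the inversion uncertainty the integral is meant to carry through.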

This integrated approach is the gold standard for high-value reservoir studies. It’s what you deliver when drill decisions hinge on getting uncertainty quantification exactly right.

Probabilistic facies classification is the honest-science upgrade of hard classification. It preserves all the information from the inversion and transforms, stops hiding uncertainty, and gives downstream decision-makers the raw material for proper risk analysis. §7.6 closes Part 7 by wrapping the full QI output, the probabilistic rock-property and facies cubes, into a RESERVOIR MODEL integrated with geological framework, volumetrics, and dynamic simulation.
