Marmousi velocity inversion (the field-standard testbed)

Part 10 — Real-field capstones

Learning objectives

Scale Part 9 machinery from 3-param 2-layer to 7-param 4-layer
Race classical FWI vs PINN-augmented vs bootstrap-posterior on the same target
Visualise recovered velocity models as 2-D heatmaps (production output style)
Quantify per-parameter posterior σ for layer velocities and interface depths
Draw the line from this 7-parameter capstone to real Marmousi (10⁵-10⁶ params)

Part 10 opens with the FWI capstone: take everything from Part 9 — physics-informed regularisation (§9.2), bootstrap-posterior UQ (§9.6) — and apply it to a Marmousi-style benchmark. The original Marmousi (Versteeg 1994) is a 9.2 km × 3 km synthetic with 10⁵+ velocity unknowns and a complex anticlinal flank. The widget below scales the Part 9 machinery up to a 7-parameter 4-layer simplification — closer to production than the 3-parameter 2-layer toys, but still fast enough to converge in your browser tab.

Why Marmousi matters

Marmousi was introduced as a synthetic FWI benchmark precisely BECAUSE classical FWI struggles on it. The folded sediment flank produces multi-pathing and cycle-skipping artifacts; the velocity contrasts are sharp enough that a smooth starting model leaves you trapped in a local minimum at every iteration. The benchmark established that:

Without a good starting model, classical FWI diverges or converges to wrong velocities.
Frequency-continuation strategies (start at 2 Hz, march up to 30 Hz) help but are expensive.
Regularisation that encodes geological priors (depth-trend, smoothness, sparsity) reduces local-minimum trapping.
Modern hybrid PINN-FWI workflows combine all three: smart init + physics regulariser + UQ envelope.

The 7-parameter setup

Truth: 4 monotonically-increasing layers with

v_{\mathrm{truth}} = (1.5, 2.0, 2.7, 3.5)\ \text{km/s},\quad z_{\mathrm{truth}} = (0.25, 0.55, 0.80)\ \text{km}.

Domain 2 km × 1 km at 51 × 26 grid resolution. A single source at the top centre, 16 surface receivers spread along the bottom edge, picking noise $\sigma_{\mathrm{pick}} = 0.015$ s. The starting model used by every method is a "smart" 1-D layered guess that captures the gross trend but is BIASED — interfaces too shallow, layer velocities too low, mimicking the typical mistake a regional geologist makes when bootstrapping FWI.

Three inversions race head-to-head

Classical FWI (λ = 0): pure data misfit, 50 iterations of finite-difference gradient descent on the 7 parameters.
PINN-augmented FWI (λ = 2.5): same optimiser + a depth-trend prior $\mathcal{L}$ that penalises non-monotone velocity sequences AND pulls each adjacent contrast toward a FIXED expected gradient $\Delta v$ {\mathrm{prior}} = 0.7 $Δ v_{prior} = 0.7$ km/s (a typical regional well-log estimate, set independent of truth) AND softly anchors the deepest layer velocity to $v_{\mathrm{deep}} \approx 3.2$ km/s. The independence-from-truth matters: in production the prior comes from regional geology, not the answer.
Bootstrap-posterior FWI: 4 samples of the §9.6 recipe (init jitter $\sigma_{\mathrm{init}}$ on every parameter + re-sampled noise on $t_{\mathrm{obs}}$ ), each with 50 iterations of PINN-augmented inversion. Empirical posterior mean is the central estimate; per-parameter σ is the uncertainty.

Try it

Four panels (read row-by-row, top-left → bottom-right):

Truth velocity model. Yellow star = source, cyan dots = 16 surface receivers. Viridis colormap (purple → yellow) maps velocity from 1.4 to 3.6 km/s. Layer interfaces visible as colour transitions.
Classical FWI recovery. After 50 iters from the smart init. Note where layers are misplaced or velocities are off — these are the local-minimum signatures classical FWI cannot escape.
PINN-augmented FWI recovery. Same optimiser, plus the depth-trend prior. The prior pulls toward feasible monotone sequences and typically corrects most of the classical-FWI bias.
Bootstrap posterior mean. Average of 4 PINN-augmented inversions with init/data jitter. The summary box reports per-parameter σ — uncertainty bars for each layer velocity and interface depth.

Below the panels, a summary line reports parameter L² error vs truth for each method. Expected behaviour: PINN-augmented beats classical by 1.5-3× on most seeds; the posterior mean tracks PINN-augmented closely with calibrated σ that flags which layers the data constrains well (top sediments) vs less well (deeper layers, where ray density is lower).

From 7 to 10⁶ parameters

Real Marmousi has roughly $10^5$ - $10^6$ velocity unknowns (every grid cell is a free parameter, with smoothness regularisation tying them). The leap from 7 to $10^6$ parameters is large but the principles transfer 1-to-1:

Finite-difference gradient → adjoint-state. At $10^6$ params you can't afford 2 forward solves per param. Production FWI uses the adjoint-state method (compute gradient via one extra forward solve, regardless of param count). PINN extensions use automatic differentiation through the network.
Smart init → frequency-continuation + tomography starter. Pre-FWI traveltime tomography (often via PINN eikonal, §7) produces a smooth velocity model that becomes the FWI starting point. Frequency continuation marches from 2 Hz → 30 Hz, exploiting the broader convergence basin at low frequencies.
Hand-crafted prior → learned prior. The depth-trend prior used here generalises to learned regularisers (§9.3) and generative priors (§9.4): autoencoders or diffusion models trained on a corpus of plausible velocity models replace hand-coded constraints.
4-sample bootstrap → 100-1000 sample MCMC. Production UQ uses thousands of samples — embarrassingly parallel — and inflates intervals 1.5-3× to compensate for residual model-form uncertainty.

What still breaks at scale

The 7-parameter widget converges nicely. Real Marmousi at $10^6$ parameters has pathologies that don't appear at this scale:

Cycle skipping at high frequencies. If the starting model is more than half a wavelength off truth, every iteration moves AWAY from truth. Frequency continuation + envelope-based misfit functions (Bozdağ et al 2011) mitigate.
Memory blowup on adjoint-state. Storing forward wavefields for backpropagation is GB-scale. Checkpointing (Symes 2007) trades compute for memory.
Anisotropy and visco-elastic effects. Real Marmousi is acoustic isotropic. Real subsurface is anisotropic + attenuating. Adding these to FWI is an active research front; the Operto et al 2015 paper cited below covers production multi-parameter FWI.

What §10.2 will do

§10.2 pivots from active-source FWI to PASSIVE seismology — microseismic event monitoring during a hydraulic-fracturing stage in the Marcellus shale. Same physics-informed inversion machinery, applied to locating earthquakes inside an ongoing frac job.

References

Versteeg, R. (1994). The Marmousi experience: Velocity model determination on a synthetic complex data set. The Leading Edge 13(9), 927–936. Original Marmousi paper.
Brossier, R., Operto, S., Virieux, J. (2009). Seismic imaging of complex onshore structures by 2D elastic frequency-domain full-waveform inversion. Geophysics 74(6), WCC105–WCC118. Marmousi-II FWI on real-style geometry.
Operto, S., Miniussi, A., Brossier, R., Combe, L., Métivier, L., Monteiller, V., Ribodetti, A., Virieux, J. (2015). Efficient 3-D frequency-domain mono-parameter full-waveform inversion of ocean-bottom cable data. Geophys. J. Int. 202(2), 1362–1391. Production-scale FWI methodology.
Bozdağ, E., Trampert, J., Tromp, J. (2011). Misfit functions for full waveform inversion based on instantaneous phase and envelope measurements. GJI 185(2), 845–870. Cycle-skipping mitigation.
Symes, W.W. (2007). Reverse time migration with optimal checkpointing. Geophysics 72(5), SM213–SM221. Memory-vs-compute trade in adjoint-state FWI.