Full-waveform inversion: matching observed to modeled waveforms

Part 8 — Advanced QI Topics

Learning objectives

  • Explain what FWI solves: the full wave equation, iterative gradient updates
  • Recognize the role of the STARTING MODEL in FWI success vs failure
  • Understand CYCLE SKIPPING: the dominant FWI failure mode
  • Apply MULTI-SCALE strategy: low-frequency first, progressively refine
  • Know when FWI is worth running vs when simpler inversion methods (§7.3) suffice

Until §8.4, every inversion in this textbook (§7.3, §8.2) has relied on the CONVOLUTIONAL MODEL: seismic = reflectivity * wavelet. That model is a useful simplification but ignores much of the physics of real wave propagation — refraction, diffraction, wavefield complexity, attenuation, anisotropy. Full-waveform inversion (FWI) throws away the convolutional approximation and inverts the observed seismic waveforms against simulations of the FULL WAVE EQUATION. The payoff: SUPER-RESOLUTION velocity models that capture geological structure at wavelengths an order of magnitude shorter than conventional tomography.
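For reference, the convolutional model that FWI replaces can be sketched in a few lines. This is a minimal illustration only: the Ricker wavelet, the 2 ms sampling, and the three reflection coefficients are assumed values chosen for the example, not taken from the earlier chapters.

```python
import numpy as np

def ricker(f_peak, dt, n_samples):
    """Zero-phase Ricker wavelet with peak frequency f_peak (Hz), centered in its window."""
    t = (np.arange(n_samples) - n_samples // 2) * dt
    a = (np.pi * f_peak * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

dt = 0.002                                   # sampling interval in s (assumed)
reflectivity = np.zeros(500)
reflectivity[[100, 250, 400]] = [0.10, -0.05, 0.08]   # three illustrative interfaces

wavelet = ricker(25.0, dt, 101)              # 25 Hz peak frequency (assumed)

# Convolutional model: trace = reflectivity convolved with the wavelet.
# mode="same" keeps the trace aligned with the reflectivity series.
trace = np.convolve(reflectivity, wavelet, mode="same")
```

Every interface produces the same scaled wavelet, regardless of what lies above it. Refraction, diffraction, and attenuation have no place in this forward model, which is the gap FWI fills.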

FWI has become the standard in salt basins (Gulf of Mexico, Brazil pre-salt, West Africa), in complex structural provinces (foothills, sub-thrust), and increasingly as a routine component of major 3D processing projects. The cost is compute: modern FWI runs on cluster-scale hardware, and a single field-scale project can consume tens of thousands of GPU-hours. The reward is a velocity model showing features (narrow salt canyons, thin shale layers, fluid-filled fault zones) that conventional tomography cannot resolve.

What FWI solves

FWI is a LEAST-SQUARES OPTIMIZATION problem. Given:

  • Observed seismic data d_obs (all traces, all times, all offsets)
  • Source wavelet estimate
  • A starting velocity model V₀

FWI finds the velocity model V that MINIMIZES the misfit between observed and simulated data:

$$\mathcal{E}(V) = \tfrac{1}{2} \sum_{s,r,t} \left( d_\text{obs}(s,r,t) - d_\text{sim}(V; s,r,t) \right)^2$$

where d_sim(V) is computed by solving the wave equation (full finite-difference or finite-element simulation) for the current velocity model V. The sum is over all sources s, all receivers r, all time samples t.
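In code, the misfit is just half the sum of squared residuals over the whole data volume. The sketch below assumes the data are held as an array indexed (source, receiver, time sample). The function name `l2_misfit` and the tiny synthetic arrays are illustrative, and the wave-equation simulation that would produce d_sim is deliberately left out.

```python
import numpy as np

def l2_misfit(d_obs, d_sim):
    """Least-squares misfit: 0.5 * sum over sources, receivers, time of (d_obs - d_sim)^2."""
    residual = d_obs - d_sim
    return 0.5 * float(np.sum(residual ** 2))

# Tiny stand-in data volume: 4 sources x 16 receivers x 200 time samples.
rng = np.random.default_rng(0)
d_obs = rng.standard_normal((4, 16, 200))

# Placeholder for a simulation result: here simply the observed data plus a constant error.
d_sim = d_obs + 0.1

E = l2_misfit(d_obs, d_sim)
```

In a real project the expensive part is producing `d_sim`, which requires one wave-equation simulation per source at every iteration.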

The problem is massive: at field scale, you have thousands of sources, tens of thousands of receivers, tens of thousands of time samples. d_obs is a multi-terabyte dataset; forward modeling is expensive (each iteration simulates the wavefield for every source); the velocity model has millions of voxels. This is one of the most computationally demanding inversions in any field of science.

[Interactive figure: FWI convergence for good, bad, and multi-scale starting models]

Exercise — walk the iteration slider through three scenarios

  • Open in Good starting model mode, iteration = 0. You see: TRUE profile (thick yellow step-curve), STARTING profile (grey dashed, a smoothed version of truth), CURRENT iteration (magenta, currently identical to starting). The starting is close to truth but missing the sharp layer boundaries.
  • Slide iteration up to 4. The current profile (now trending toward green) is noticeably closer to the yellow truth. Misfit curve below shows the misfit dropping.
  • Iteration 10: current profile (bright green) almost matches truth. Layer boundaries are sharp. Misfit has dropped by ∼85%. This is SUCCESSFUL convergence.
  • Iteration 20: fully converged; current is indistinguishable from truth. Misfit is near zero. This is what a GOOD FWI project produces.
  • Switch to Bad starting model. Starting profile (grey dashed) is now a LINEAR GRADIENT — very different from the layered truth. Iteration 0: current matches starting. Iteration 4: the profile begins moving but imperfectly. Iteration 10: the profile has STALLED at a wrong intermediate state. Iteration 20: unchanged. This is CYCLE SKIPPING.
  • Look at the misfit curve for the bad mode: drops initially, then LEVELS OFF at a wrong, non-zero value. The inversion THINKS it has converged (misfit stopped decreasing) but to a WRONG answer. This is why FWI results must always be validated against wells.
  • Switch to Multi-scale FWI. Starting model is the same bad linear gradient. Iterations 0-5: the current profile moves toward a SMOOTH version of truth (low-frequency FWI recovers long-wavelength structure). Iterations 6+: the profile sharpens into the true layer boundaries (high-frequency FWI adds detail). By iteration 20, converged. This is how multi-scale RESCUES a bad starting model.
  • Key lesson from flipping between the Bad starting model and Multi-scale FWI modes: the only difference is the FREQUENCY STRATEGY. Both start from the same bad model, but multi-scale succeeds where single-scale fails. This is why modern broadband acquisition (down to 3-5 Hz) matters so much: it enables robust multi-scale FWI.

Cycle skipping: the FWI failure mode

FWI uses GRADIENT DESCENT or a closely related gradient-based scheme: at each iteration, compute the gradient of the misfit with respect to the velocity model and take a step DOWNHILL. Such local methods converge to a nearby local minimum, which is not necessarily the global (true) minimum.

When does the nearest local minimum coincide with the global minimum? When each simulated arrival lies WITHIN HALF A PERIOD of the corresponding observed arrival at the dominant frequency. This is the "capture zone" for gradient-based FWI. If your starting model produces arrivals shifted by MORE than half a period, you're OUTSIDE the capture zone: the inversion aligns the simulated waveform with the wrong cycle of the observed waveform and converges to a wrong local minimum. That's cycle skipping.
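The capture zone can be seen in a one-parameter toy problem. In the sketch below the only unknown is a time shift between an "observed" and a "simulated" wavelet. The Gaussian-windowed 10 Hz cosine, the 1 ms shift grid, and the greedy downhill walk (a stand-in for gradient descent) are all assumptions made for illustration, not part of any production FWI code.

```python
import numpy as np

F_DOM = 10.0     # dominant frequency in Hz (assumed)
SIGMA = 0.15     # width of the Gaussian window in s (assumed)

def wavelet(t):
    """Oscillatory test signal: a Gaussian-windowed cosine."""
    return np.exp(-(t / SIGMA) ** 2) * np.cos(2.0 * np.pi * F_DOM * t)

t = np.arange(-1000, 1000) * 0.001           # time axis, 1 ms sampling
d_obs = wavelet(t)                           # "observed" trace, true shift = 0

shifts = np.arange(-250, 251) * 0.001        # candidate time shifts, -0.25 s to +0.25 s
misfit = np.array([0.5 * np.sum((d_obs - wavelet(t - tau)) ** 2) for tau in shifts])

# Interior local minima of the misfit curve.
is_min = (misfit[1:-1] < misfit[:-2]) & (misfit[1:-1] < misfit[2:])
local_minima = shifts[1:-1][is_min]

def descend(start_shift):
    """Walk downhill on the sampled misfit curve until no neighbor is lower."""
    i = int(np.argmin(np.abs(shifts - start_shift)))
    while 0 < i < len(shifts) - 1:
        if misfit[i - 1] < misfit[i]:
            i -= 1
        elif misfit[i + 1] < misfit[i]:
            i += 1
        else:
            break
    return shifts[i]

inside = descend(0.03)    # starting error below half a period (0.05 s)
outside = descend(0.08)   # starting error above half a period
```

Starting 0.03 s from the truth, the walk reaches the global minimum at zero shift. Starting 0.08 s away, it settles about one full period (0.1 s) from the truth, in a local minimum where the waveforms are aligned on the wrong cycle.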

Practically: at f = 10 Hz the period is 0.1 s, so the tolerance is half a period, 0.05 s. Expressed as distance with Vp = 3000 m/s, the wavelength is 300 m and the half-wavelength is 150 m (150/3000 = 0.05 s). If the traveltime error in your starting model exceeds 1/20 second, FWI cycle-skips. You need a DECENT starting velocity model.
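The same arithmetic can be tabulated for other frequencies. The helper names below are illustrative, and the half-cycle rule is the rule of thumb used in this section rather than a guarantee of convergence.

```python
def max_traveltime_error(frequency_hz):
    """Half a period in seconds: the rule-of-thumb cycle-skipping tolerance."""
    return 0.5 / frequency_hz

def half_wavelength(frequency_hz, velocity_m_s):
    """The same tolerance expressed as a distance in meters."""
    return 0.5 * velocity_m_s / frequency_hz

VP = 3000.0  # P-wave velocity in m/s, as in the worked example

for f in (3.0, 5.0, 10.0, 20.0):
    tol_ms = 1000.0 * max_traveltime_error(f)
    print(f"{f:5.1f} Hz: traveltime tolerance {tol_ms:6.1f} ms, "
          f"half-wavelength {half_wavelength(f, VP):6.1f} m")
```

The tolerance scales as 1/f, so halving the frequency doubles the traveltime error the inversion can absorb.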

Sources of the starting model: (1) traditional NMO-based velocity analysis on CMP gathers (moderate quality, but fine for shallow sections); (2) reflection tomography (a smooth model with more detail than NMO velocities); (3) regional geologic models (long-wavelength constraints); (4) interpolated well velocities; (5) a previous cycle of FWI plus manual editing. Modern FWI workflows often spend 30-50% of the project time on preparing the starting model.

Multi-scale FWI: the industry standard defense

Cycle skipping is a FREQUENCY-DEPENDENT problem. At low frequencies (long wavelengths), the half-wavelength tolerance is large — starting models that would cycle-skip at 20 Hz are safely within the capture zone at 5 Hz. This insight is what makes MULTI-SCALE FWI possible:

  • Low-frequency stage: filter the data to the lowest usable frequencies (typically 3-5 Hz band). Run FWI. The LONG-WAVELENGTH velocity structure is recovered without cycle skipping even from a poor starting model.
  • Progressive refinement: expand the frequency band in stages (5-8 Hz, then 8-15 Hz, then 15-25 Hz, etc.). Each stage starts from the converged output of the previous — which is already close to truth at those frequencies.
  • High-frequency stage: the final stage runs at the full data bandwidth, adding the fine-scale detail.
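The staged strategy above can be sketched on a one-parameter toy problem in which the only unknown is an arrival time. Everything here is an illustrative assumption: the 15 Hz Ricker wavelet, the Gaussian low-pass filter, the cutoff schedule, and the greedy 1 ms downhill search standing in for gradient descent.

```python
import numpy as np

DT = 0.002
T = np.arange(2000) * DT - 1.0      # time axis from -1 s to 3 s
F_PEAK = 15.0                       # Ricker peak frequency in Hz (assumed)
TRUE_ARRIVAL = 0.30                 # "true model": arrival at 0.30 s
START_ARRIVAL = 0.38                # poor starting model: 0.08 s traveltime error

def ricker(t, f_peak):
    a = (np.pi * f_peak * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

def gaussian_lowpass(trace, f_cut):
    """Zero-phase Gaussian low-pass in the frequency domain; f_cut=None keeps the full band."""
    if f_cut is None:
        return trace
    freqs = np.fft.rfftfreq(trace.size, d=DT)
    return np.fft.irfft(np.fft.rfft(trace) * np.exp(-(freqs / f_cut) ** 2), n=trace.size)

def misfit(arrival, d_obs_filtered, f_cut):
    d_sim = gaussian_lowpass(ricker(T - arrival, F_PEAK), f_cut)
    return 0.5 * np.sum((d_obs_filtered - d_sim) ** 2)

def downhill(arrival, d_obs, f_cut, step=0.001, max_steps=1000):
    """Greedy local search: move by one step while the misfit keeps decreasing."""
    d_obs_filtered = gaussian_lowpass(d_obs, f_cut)
    current = misfit(arrival, d_obs_filtered, f_cut)
    for _ in range(max_steps):
        best, best_arrival = min(
            (misfit(arrival + s, d_obs_filtered, f_cut), arrival + s) for s in (-step, step)
        )
        if best >= current:
            break
        current, arrival = best, best_arrival
    return arrival

d_obs = ricker(T - TRUE_ARRIVAL, F_PEAK)

# Single-scale: full bandwidth from the start.
single_scale = downhill(START_ARRIVAL, d_obs, None)

# Multi-scale: each stage starts from the output of the previous one.
multi_scale = START_ARRIVAL
for f_cut in (3.0, 6.0, 12.0, None):
    multi_scale = downhill(multi_scale, d_obs, f_cut)
```

In this toy the full-bandwidth run stalls in a side lobe of the misfit, about 0.06 s from the true arrival time, while the staged run reaches it. With a single unknown the lowest-frequency stage already lands on the answer. In a real velocity model each stage only recovers the wavelengths its band can resolve, which is why the later stages are still needed.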

This is WHY modern marine 3D acquisition has pushed for LOWER-FREQUENCY CONTENT. Broadband acquisition technologies (BroadSeis, IsoMetrix, Broadband Plus) specifically target the 3-8 Hz band that the first stages of multi-scale FWI need. Land acquisition uses vibrators with low-frequency sweeps or high-output dynamite sources for similar reasons.

When FWI is worth running

  • Complex structural settings: salt basins, sub-salt targets, sub-thrust plays. Conventional tomography cannot resolve the complex velocity structure. FWI delivers detailed velocity models that feed into accurate depth migration.
  • Near-surface problems: unconsolidated weathering zones, karst, permafrost. FWI can image the shallow complexity that degrades deeper imaging.
  • High-value targets where imaging matters: billion-barrel subsalt prospects; CO2 storage pilot projects where precise velocity models enable quantitative monitoring.
  • Velocity-model refinement for QI: traditional tomography gives smooth velocity; FWI adds detail that improves pre-stack migration and downstream QI inversion.

When FWI may not be worth it: (1) simple basin geometry where tomography + Kirchhoff migration already gives good results; (2) very noisy data where FWI amplifies noise more than signal; (3) budget-constrained projects where the compute cost exceeds the imaging value. For most modern large projects, FWI is now the default: the question is how many iterations and what frequency bandwidth, not whether to run it.

FWI variants and extensions

  • Acoustic vs elastic FWI: acoustic assumes only P-waves (simpler, faster, used for velocity-model building). Elastic FWI models P and S waves plus density (more accurate, expensive, used for QI-grade outputs).
  • Anisotropic FWI: includes Thomsen ε, δ, γ in the forward model. Essential in basins with VTI shales or HTI fractured reservoirs (§8.3).
  • Viscoacoustic / viscoelastic FWI: includes attenuation (Q). Important in shallow gas-bearing sediments where attenuation is severe.
  • Envelope-based FWI: minimizes the misfit of the wavefield ENVELOPE rather than the waveform itself. Less cycle-skipping-prone; used as a starter for traditional FWI.
  • Optimal-transport FWI: uses optimal-transport distance between waveforms instead of L2 norm. Highly robust to cycle skipping but more expensive.
  • Time-domain vs frequency-domain: time-domain is more flexible for complex geology; frequency-domain is efficient for narrowband sequential inversion. Modern FWI uses time-domain with frequency selection.
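To illustrate why an envelope misfit is less prone to cycle skipping than the waveform misfit, the sketch below compares the two as a function of time shift for one oscillatory wavelet. The test signal, the shift grid, and the FFT-based analytic-signal envelope are assumptions made for this example. It is not a full envelope-FWI implementation.

```python
import numpy as np

def wavelet(t, f_dom=10.0, sigma=0.15):
    """Gaussian-windowed cosine (illustrative test signal)."""
    return np.exp(-(t / sigma) ** 2) * np.cos(2.0 * np.pi * f_dom * t)

def envelope(trace):
    """Magnitude of the analytic signal, built by zeroing negative frequencies."""
    n = trace.size
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(trace) * weights))

def count_local_minima(curve):
    interior = curve[1:-1]
    return int(np.sum((interior < curve[:-2]) & (interior < curve[2:])))

t = np.arange(-1000, 1000) * 0.001
shifts = np.arange(-250, 251) * 0.001
d_obs = wavelet(t)
env_obs = envelope(d_obs)

waveform_misfit = np.array(
    [0.5 * np.sum((d_obs - wavelet(t - tau)) ** 2) for tau in shifts]
)
envelope_misfit = np.array(
    [0.5 * np.sum((env_obs - envelope(wavelet(t - tau))) ** 2) for tau in shifts]
)

n_waveform_minima = count_local_minima(waveform_misfit)
n_envelope_minima = count_local_minima(envelope_misfit)
```

Over this range of shifts the waveform misfit has several local minima, one per cycle, while the envelope misfit has a single minimum at zero shift. The price is resolution: the envelope discards the phase information that gives waveform FWI its fine detail, which is why it serves as a starter rather than a replacement.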

FWI is the most computationally demanding, and also the most physically honest, inversion in reflection seismology. For velocity-model building, it has become indispensable in complex basins. For QI, it's an emerging but powerful tool that refines the elastic properties used in Part 7 workflows. §8.5 takes a different approach: MACHINE-LEARNING QI. Rather than explicit wave-equation inversion, neural networks learn the mapping from data to properties directly, a paradigm that's rapidly changing what's possible in quantitative seismic interpretation.

