ML for FWI gradient acceleration

Part 9 — Machine Learning in Processing

Learning objectives

  • Identify the three points where ML can accelerate FWI: initialisation, regularisation, surrogate forward
  • Explain why each kind of ML acceleration changes convergence but preserves physics
  • Quantify the speed-up achievable with ML-initialised and ML-regularised FWI
  • Recognise the production caveats: training-distribution match, transferability, QC

Part 9 closes with the current ML frontier in production seismic processing: accelerating FWI. Unlike the previous four sections, where ML was the main algorithm, FWI acceleration uses ML alongside physics. The forward and adjoint wave-equation simulations remain the ground truth; ML reduces either the number of times they must be run or the number of iterations the inversion needs.

1. Three entry points for ML

  • Learned initial models. A CNN maps tomographic velocity models (or legacy seismic volumes) to FWI-ready starting models. Training pairs: (tomography output, converged FWI output) from historical projects. The CNN "jumps" the model part-way toward the converged state, reducing the initial misfit.
  • Learned regularisers. Instead of a hand-tuned TV or smoothness penalty, train a network to recognise "geologically plausible" velocity models and apply it as a soft constraint in the FWI loss. Tends to produce cleaner results with fewer iterations.
  • Surrogate forward modelling. Train a network to mimic the forward simulator. Inference is 10–100× faster than a real finite-difference simulation, but accuracy degrades for model perturbations far from the training distribution. Used for rapid Monte Carlo uncertainty estimation; not (yet) a full substitute for real FWI.
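
One way to see how the three entry points sit around the physics is to write them against the standard least-squares FWI objective. The notation below is an assumption introduced for illustration (the symbols F, d_obs, G_φ, R_θ, F̂_ψ and λ are not taken from any specific implementation):

```latex
% Learned initial model: network G_\phi maps the tomography model to a starting model
m_0 = G_\phi(m_{\mathrm{tomo}})

% Learned regulariser: R_\theta replaces a hand-tuned TV or smoothness penalty
J(m) = \tfrac{1}{2}\,\lVert F(m) - d_{\mathrm{obs}} \rVert_2^2 + \lambda\, R_\theta(m)

% Surrogate forward modelling: network \hat{F}_\psi approximates the simulator
\hat{F}_\psi(m) \approx F(m)
```

In the first two cases the data term still uses the wave-equation operator F, so the adjoint gradient of the data term is unchanged; only the starting point or the penalty is learned.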

2. The widget — convergence comparison

[Interactive figure: ML FWI demo]

Three convergence curves on log-scale misfit vs iteration:

  • Pure FWI (red). Starting misfit 1.0 (normalised), exponential decay rate 3 % per iteration. Needs ~130 iterations to reach 2 % misfit.
  • ML-init FWI (yellow). Starting misfit 0.45 (ML starting model is partway to truth). Same decay rate. Reaches 2 % after roughly 100 iterations instead of ~130.
  • ML-init + ML-reg (green). Same low start + steeper decay (learned regulariser keeps each iteration's update pointed in a plausible direction). Fastest.

Sliders control ML starting-model quality (0–1) and ML regulariser strength (0–1). The info strip reports iterations-to-target for each and the total speedup factor.
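
The iterations-to-target figures can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming the curves follow misfit_n = start × (1 − rate)^n; the widget's exact formula is not given here, and the 5 % rate used for the regularised case is a made-up stand-in for "steeper decay":

```python
import math

def iterations_to_target(start_misfit, decay_rate, target=0.02):
    """Smallest n with start_misfit * (1 - decay_rate)**n <= target."""
    if start_misfit <= target:
        return 0
    return math.ceil(math.log(target / start_misfit) / math.log(1.0 - decay_rate))

# Values quoted in the text: start 1.0 or 0.45, 3 % decay, 2 % target.
pure = iterations_to_target(1.0, 0.03)
ml_init = iterations_to_target(0.45, 0.03)

# ASSUMPTION: 5 % per iteration stands in for the "steeper decay"; the
# widget's actual rate depends on the regulariser-strength slider.
ml_init_reg = iterations_to_target(0.45, 0.05)

for name, n in [("pure", pure), ("ML-init", ml_init), ("ML-init + ML-reg", ml_init_reg)]:
    print(f"{name}: {n} iterations, speed-up {pure / n:.2f}x")
```

Under this model a better starting point only shifts the curve down, saving a fixed number of iterations, whereas a steeper decay rate shortens every remaining decade of misfit reduction.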

3. What ML cannot do for FWI

  • Replace adjoint gradients. The physics-based adjoint gives the exact gradient of the misfit with respect to the model. An ML surrogate gradient is approximate; using it as the only gradient produces biased FWI. Production workflows use real adjoints for the final iterations even if ML accelerates early iterations.
  • Invent geology. A CNN trained on North Sea datasets cannot produce a correct starting model for a Gulf of Mexico sub-salt project. Domain shift is a hard limit.
  • Remove cycle-skipping risk. A bad ML initial model can still be cycle-skipped relative to the data. Multi-scale FWI (§6.2) is still required.
  • Guarantee amplitude fidelity. An ML regulariser may bias amplitudes toward training-distribution statistics; QI-grade FWI still needs post-hoc amplitude calibration.
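
The cycle-skipping point can be made concrete with the usual half-period rule of thumb: a starting model is at risk when the traveltime error between modelled and observed arrivals exceeds half a period at the frequency being inverted. The function name and the example numbers below are illustrative assumptions, not part of any FWI package:

```python
def is_cycle_skipped(traveltime_error_s, frequency_hz):
    """Half-period rule of thumb: flag errors larger than T/2 = 1 / (2 f)."""
    half_period_s = 0.5 / frequency_hz
    return abs(traveltime_error_s) > half_period_s

# Hypothetical 80 ms traveltime error left by an ML starting model.
error_s = 0.08
safe_at_5hz = not is_cycle_skipped(error_s, 5.0)   # half period 100 ms
skipped_at_10hz = is_cycle_skipped(error_s, 10.0)  # half period 50 ms
```

The same starting model passes at 5 Hz and fails at 10 Hz, which is why a low-to-high frequency schedule remains necessary even with an ML initialiser.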

4. Production deployment patterns

  • Initialiser only: ML generates the starting model; pure physics FWI from then on. Safe and widely deployed.
  • Initialiser + learned prior: ML starting model + learned regulariser (used in place of a TV-style penalty) for the first half of the iterations, then conventional, non-learned regularisation for the final iterations (to avoid amplitude bias). More aggressive; common in research projects.
  • Full ML FWI: surrogate forward + learned gradient + learned prior. Highly efficient but requires strong QC and limited to training-similar projects. Emerging in time-lapse monitoring where baseline FWI has already been done carefully.
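
The second pattern amounts to a schedule on the regulariser weights. A minimal sketch, assuming a hard switch at the halfway point; the function name, the unit weights, and the `switch_fraction` parameter are hypothetical:

```python
def regulariser_weights(iteration, n_iterations, switch_fraction=0.5):
    """Return (learned_weight, conventional_weight) for a hard-switch schedule.

    The learned prior is active for the first switch_fraction of the
    iterations; the conventional penalty takes over afterwards.
    """
    if iteration < switch_fraction * n_iterations:
        return 1.0, 0.0
    return 0.0, 1.0

# Example: 40 iterations, learned prior for iterations 0-19 only.
schedule = [regulariser_weights(i, 40) for i in range(40)]
```

A smooth ramp between the two penalties would be an equally plausible design; the hard switch is used here only because it matches the "first half, then final iterations" description most directly.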

5. Quantitative expectations

Typical production numbers (from field reports and vendor benchmarks, circa 2024–2025):

  • ML starting model alone: 1.5–3× fewer FWI iterations to converge.
  • ML starting + ML regulariser: 2–5× fewer iterations.
  • Surrogate forward model (research): 10–100× throughput on Monte-Carlo estimation tasks; not currently trusted for final FWI models.
  • 4D monitoring with ML-init FWI: convergence in days instead of weeks, enabling more frequent re-inversions.

6. QC for ML-accelerated FWI

  • Compare final model to pure-physics FWI. On a subset of projects, run both pure and ML-accelerated FWI to verify they converge to the same model. If they disagree, diagnose whether the ML is biasing the solution.
  • Well ties on final model. Compare ML-FWI output to wells; require the same tie quality as pure FWI.
  • Synthetic-vs-recorded forward tests (§6.5). The pure-physics criterion: if data forward-modelled from the final model match the observed data as well as they do for pure FWI, the ML acceleration has not compromised the data fit.
  • Cross-survey generalisation test. Re-run the ML-FWI on a survey not seen in training; verify no systematic bias appears.
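
The first check (pure versus ML-accelerated FWI) needs a number to compare against. A minimal sketch using a relative L2 difference between flattened velocity models; the 2 % tolerance and the sample velocities are placeholders, not recommended values:

```python
import math

def relative_l2_difference(model, reference):
    """||model - reference||_2 / ||reference||_2 for flattened velocity models."""
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(model, reference)))
    den = math.sqrt(sum(b ** 2 for b in reference))
    return num / den

def models_agree(model, reference, tolerance=0.02):
    # ASSUMPTION: the 2 % tolerance is a placeholder; a real project would
    # set it from well-tie and imaging requirements.
    return relative_l2_difference(model, reference) <= tolerance

# Hypothetical velocities in m/s at three model cells.
pure_fwi = [2000.0, 2500.0, 3000.0]
ml_fwi = [2020.0, 2490.0, 3010.0]
difference = relative_l2_difference(ml_fwi, pure_fwi)
```

A single global number can hide a localised bias, so in practice the difference would also be inspected as a map or section rather than only as a scalar.
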

The one sentence to remember

ML accelerates FWI by providing learned starting models, learned regularisers, and (experimentally) surrogate forward operators, cutting iteration counts by roughly 2–5× without replacing the physics-based adjoint gradients that give FWI its mathematical guarantees.

Part 9 closes here

You have the ML-in-seismic-processing toolkit: where ML fits (§9.1), CNN denoising (§9.2), trace interpolation (§9.3), first-break picking (§9.4), and FWI acceleration (§9.5). The theme across all five: ML is most valuable as a complement to physics, not a replacement — classical methods provide the guarantees, ML provides the speed and accuracy at pattern-recognition tasks. Part 10 brings all nine previous parts together in a series of end-to-end processing capstones that walk through complete projects from raw data to final deliverable.

