ML for FWI gradient acceleration
Learning objectives
- Identify the three points where ML can accelerate FWI: initialisation, regularisation, surrogate forward
- Explain why each kind of ML acceleration changes convergence but preserves physics
- Quantify the speed-up achievable with ML-initialised and ML-regularised FWI
- Recognise the production caveats: training-distribution match, transferability, QC
Part 9 closes with the current ML frontier in production seismic processing: accelerating FWI. Unlike the previous four sections, where ML was the main algorithm, FWI acceleration uses ML alongside physics. The forward and adjoint wave-equation simulations remain the ground truth; ML reduces either the number of times you have to run them or the number of FWI iterations needed to converge.
1. Three entry points for ML
- Learned initial models. A CNN maps tomographic velocity models (or legacy seismic volumes) to FWI-ready starting models. Training pairs: (tomography output, converged FWI output) from historical projects. The CNN "jumps" the model part-way toward the converged state, reducing the initial misfit.
- Learned regularisers. Instead of a hand-tuned TV or smoothness penalty, train a network to recognise "geologically plausible" velocity models and apply it as a soft constraint in the FWI loss. Tends to produce cleaner results with fewer iterations.
- Surrogate forward modelling. Train a network to mimic the forward simulator. Inference is 10–100× faster than a real finite-difference simulation, but accuracy degrades for model perturbations far from the training distribution. Used for rapid Monte Carlo uncertainty estimation; not (yet) a full substitute for real FWI.
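The learned-regulariser entry point can be written as an objective function. This is a minimal sketch in generic notation, not the formulation of any particular package. The symbols are assumptions introduced here for illustration: $F$ is the wave-equation forward operator, $d_{\text{obs}}$ the observed data, $R_\theta$ a trained network that scores a model $m$ (lower means more plausible), and $\lambda$ the regulariser weight.

```latex
J(m) = \tfrac{1}{2}\,\lVert F(m) - d_{\text{obs}} \rVert_2^2 + \lambda\, R_\theta(m),
\qquad
\nabla_m J
  = \underbrace{\nabla_m \tfrac{1}{2}\lVert F(m) - d_{\text{obs}} \rVert_2^2}_{\text{adjoint-state gradient}}
  + \lambda\, \nabla_m R_\theta(m)
```

In this form the data term and its gradient still come from the physics-based simulations; only the penalty term is learned.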
2. The widget — convergence comparison
Three convergence curves on log-scale misfit vs iteration:
- Pure FWI (red). Starting misfit 1.0 (normalised), exponential decay rate 3 % per iteration. Needs ~130 iterations to reach 2 % misfit.
- ML-init FWI (yellow). Starting misfit 0.45 (ML starting model is partway to truth). Same decay rate. Reaches 2 % in roughly 100 iterations, about 25 fewer than pure FWI.
- ML-init + ML-reg (green). Same low start + steeper decay (learned regulariser keeps each iteration's update pointed in a plausible direction). Fastest.
Sliders control ML starting-model quality (0–1) and ML regulariser strength (0–1). The info strip reports iterations-to-target for each and the total speedup factor.
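The iterations-to-target figure in the info strip follows from the exponential-decay model. The sketch below assumes the misfit after n iterations is start × (1 − rate)^n; the function name and the 5 % decay rate for the green curve are illustrative assumptions, not the widget's actual code or slider mapping.

```python
import math


def iterations_to_target(start_misfit, decay_rate, target=0.02):
    """Smallest n such that start_misfit * (1 - decay_rate)**n <= target."""
    if start_misfit <= target:
        return 0
    return math.ceil(math.log(target / start_misfit) / math.log(1.0 - decay_rate))


# Start misfit and decay rate for the red and yellow curves come from the text.
# The 5 % rate for the green curve is a made-up illustrative value.
curves = {
    "pure FWI": (1.0, 0.03),
    "ML-init FWI": (0.45, 0.03),
    "ML-init + ML-reg": (0.45, 0.05),
}

baseline = iterations_to_target(*curves["pure FWI"])
for name, (start, rate) in curves.items():
    n = iterations_to_target(start, rate)
    print(f"{name:18s} {n:4d} iterations  speed-up {baseline / n:.2f}x")
```

Under these assumptions a better starting model only shifts the curve down, so its benefit is logarithmic in the starting misfit, while a steeper decay rate shortens every remaining decade of misfit reduction.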
3. What ML cannot do for FWI
- Replace adjoint gradients. The physics-based adjoint gives the exact gradient of the misfit with respect to the model. An ML surrogate gradient is approximate; using it as the only gradient produces biased FWI. Production workflows use real adjoints for the final iterations even if ML accelerates early iterations.
- Invent geology. A CNN trained on North Sea datasets cannot produce a correct starting model for a Gulf of Mexico sub-salt project. Domain shift is a hard limit.
- Remove cycle-skipping risk. A bad ML initial model can still be cycle-skipped relative to the data. Multi-scale FWI (§6.2) is still required.
- Guarantee amplitude fidelity. An ML regulariser may bias amplitudes toward training-distribution statistics; QI-grade FWI still needs post-hoc amplitude calibration.
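The cycle-skipping point can be made concrete with the widely used half-period rule of thumb: local optimisation is at risk when the modelled arrival is more than half a period away from the observed one. The sketch below is a simplified check under that assumption; the function names are hypothetical and real QC would work per trace and per frequency band.

```python
def is_cycle_skipped(traveltime_error_s, frequency_hz):
    """True if the traveltime error exceeds half a period at this frequency."""
    half_period_s = 0.5 / frequency_hz
    return abs(traveltime_error_s) > half_period_s


def max_safe_frequency(traveltime_error_s):
    """Highest frequency (Hz) at which the error stays within half a period."""
    return 0.5 / abs(traveltime_error_s)


# Hypothetical example: the ML starting model mispredicts an arrival by 80 ms.
error_s = 0.080
print(is_cycle_skipped(error_s, 3.0))   # low-frequency band
print(is_cycle_skipped(error_s, 10.0))  # higher-frequency band
print(max_safe_frequency(error_s))
```

An 80 ms error is safe at 3 Hz but not at 10 Hz, which is why the inversion still has to start at low frequencies even with an ML starting model.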
4. Production deployment patterns
- Initialiser only: ML generates the starting model; pure physics FWI from then on. Safe and widely deployed.
- Initialiser + learned prior: ML starting model plus a learned regulariser, applied like a TV penalty, for the first half of the iterations, then physics-only updates with conventional regularisation for the final iterations (to avoid amplitude bias). More aggressive; common in research projects.
- Full ML FWI: surrogate forward + learned gradient + learned prior. Highly efficient, but it requires strong QC and is limited to training-similar projects. Emerging in time-lapse monitoring, where a careful baseline FWI already exists.
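The second pattern amounts to a per-iteration schedule. The sketch below only illustrates that switching logic; the function name, the labels, and the `switch_fraction` parameter are hypothetical, and a real workflow would plug the chosen regulariser into its own FWI loop.

```python
def regulariser_schedule(n_iterations, switch_fraction=0.5):
    """Per-iteration choice of regulariser: 'learned' early, 'classical' late."""
    if not 0.0 <= switch_fraction <= 1.0:
        raise ValueError("switch_fraction must be between 0 and 1")
    n_learned = int(n_iterations * switch_fraction)
    return ["learned"] * n_learned + ["classical"] * (n_iterations - n_learned)


# Hypothetical 10-iteration run: learned prior for the first half only.
for iteration, choice in enumerate(regulariser_schedule(10), start=1):
    print(iteration, choice)
```

Setting `switch_fraction=0.0` reduces this to the initialiser-only pattern, since the learned prior is then never applied.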
5. Quantitative expectations
Typical production numbers (from field reports and vendor benchmarks, circa 2024–2025):
- ML starting model alone: 1.5–3× fewer FWI iterations to converge.
- ML starting + ML regulariser: 2–5× fewer iterations.
- Surrogate forward model (research): 10–100× throughput on Monte-Carlo estimation tasks; not currently trusted for final FWI models.
- 4D monitoring with ML-init FWI: convergence in days instead of weeks, enabling more frequent re-inversions.
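To see how an iteration reduction turns into wall-clock time, the arithmetic can be sketched as below. It assumes the cost per iteration is unchanged by the ML components, which is only approximately true when a learned regulariser is evaluated each iteration. All numbers and names are hypothetical.

```python
def accelerated_cost(baseline_iterations, hours_per_iteration,
                     reduction_factor, ml_overhead_hours=0.0):
    """Estimated wall-clock hours when ML cuts the iteration count by reduction_factor."""
    iterations = baseline_iterations / reduction_factor
    return iterations * hours_per_iteration + ml_overhead_hours


# Hypothetical project: 120 iterations at 4 hours each.
pure_hours = accelerated_cost(120, 4.0, 1.0)
ml_hours = accelerated_cost(120, 4.0, 3.0, ml_overhead_hours=2.0)
print(f"pure FWI: {pure_hours / 24:.1f} days, ML-accelerated: {ml_hours / 24:.1f} days")
```

With these made-up inputs a 3× reduction moves the run from about three weeks to about one, which is the order of change described for 4D monitoring.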
6. QC for ML-accelerated FWI
- Compare final model to pure-physics FWI. On a subset of projects, run both pure and ML-accelerated FWI to verify they converge to the same model. If they disagree, diagnose whether the ML is biasing the solution.
- Well ties on final model. Compare ML-FWI output to wells; require the same tie quality as pure FWI.
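- Synthetic-vs-recorded forward tests (§6.5). The pure-physics criterion: forward-modelled data from the final model must match the observed data as well as it would after pure FWI. Because FWI is non-unique, a good data match is necessary but not sufficient, so read it together with the model comparison and well ties above.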
- Cross-survey generalisation test. Re-run the ML-FWI on a survey not seen in training; verify no systematic bias appears.
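The first check, comparing the ML-accelerated model with a pure-physics reference, needs a number to compare against. A minimal sketch using a relative L2 difference is shown below; the function names and the 1 % tolerance are assumptions for illustration, and a production QC would also look at where in the model the differences sit.

```python
import math


def relative_l2_difference(model_a, model_b):
    """||a - b||_2 / ||b||_2 for two velocity models given as flat sequences."""
    if len(model_a) != len(model_b):
        raise ValueError("models must have the same number of cells")
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(model_a, model_b)))
    ref = math.sqrt(sum(b ** 2 for b in model_b))
    return diff / ref


def models_agree(ml_model, physics_model, tolerance=0.01):
    """True if the ML-accelerated model is within `tolerance` of the reference."""
    return relative_l2_difference(ml_model, physics_model) <= tolerance


# Hypothetical three-cell velocity models in m/s.
reference = [1500.0, 2000.0, 3000.0]
print(models_agree([1500.0, 2010.0, 3000.0], reference))
print(models_agree([1500.0, 2500.0, 3000.0], reference))
```

A single global number can hide a localised bias, so it works best as a first screen before the well-tie and forward-modelling checks.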
ML accelerates FWI by providing learned starting models, learned regularisers, and (experimentally) surrogate forward operators. Learned starting models and regularisers together cut iteration counts by 2–5× without replacing the physics-based adjoint gradients that give FWI its mathematical guarantees.
Part 9 closes here
You have the ML-in-seismic-processing toolkit: where ML fits (§9.1), CNN denoising (§9.2), trace interpolation (§9.3), first-break picking (§9.4), and FWI acceleration (§9.5). The theme across all five: ML is most valuable as a complement to physics, not a replacement. Classical methods provide the guarantees; ML provides speed and accuracy on pattern-recognition tasks. Part 10 brings all nine previous parts together in a series of end-to-end processing capstones that walk through complete projects from raw data to final deliverable.