Defending an inversion run: convergence diagnostics

Part 6 — Velocity inversion with PINNs

Learning objectives

Recognise the four production-FWI convergence diagnostic signals
Tell healthy convergence, cycle-skipping, and step-size-limited runs apart from the joint shape of the four traces
Distinguish 'data fit improved' from 'model is correct'
Build the defence-of-inversion routine you need before publishing FWI results

Capstone of Part 6. The §6.2–§6.8 widgets all converged to truth on the toy 1D problem. In production FWI on real data, you do NOT have ground-truth velocity to compare against. You must judge whether the inversion succeeded from the available signals — and the data-fit residual reduction is necessary but not sufficient.
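To see why residual reduction alone cannot certify a model, here is a minimal sketch using a small linear stand-in for the forward problem. The operator `G`, the three-parameter model, and the values are all hypothetical and chosen only for illustration; this is not a wave solver. The point is that two different models can produce the same data whenever some model direction is invisible to the acquisition.

```python
import numpy as np

# Hypothetical linear forward operator: 2 data values, 3 model parameters,
# so at least one model direction cannot be seen by the data.
G = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
c_true = np.array([1.5, 2.0, 2.5])
d_obs = G @ c_true

# A direction in the null space of G: G @ null_dir is exactly zero.
null_dir = np.array([1.0, -1.0, 1.0])
c_wrong = c_true + 0.4 * null_dir

def misfit(c):
    """Least-squares data misfit J = 0.5 * ||G c - d_obs||^2."""
    r = G @ c - d_obs
    return 0.5 * float(r @ r)

J_wrong = misfit(c_wrong)
model_error = float(np.linalg.norm(c_wrong - c_true))
print(f"J(c_wrong) = {J_wrong:.2e}, |c_wrong - c_true| = {model_error:.3f}")
```

The misfit of `c_wrong` is zero to rounding error even though the model is visibly wrong, which is the gap the remaining diagnostics and the checklist are meant to close.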

The four production-FWI diagnostics

$J$ vs iteration. The misfit must DECREASE. In a well-behaved inversion, $J$ falls roughly exponentially in early iterations and plateaus near the optimum. A flat $J$ from iteration 1 means line search is failing (cycle-skipped or step too small). A growing $J$ means divergence (step too aggressive or sign error).
$|c - c_{\mathrm{init}}|$ vs iteration. The distance of the model from its starting point, which tracks the size of the updates. The curve should rise quickly in early iterations (large updates) and flatten in late iterations (updates taper). A curve that never leaves zero indicates a failed line search; a curve that never flattens indicates an inversion that is not converging.
$|\nabla J|$ vs iteration. The gradient norm. At a local minimum, $|\nabla J| = 0$. A non-zero plateau in $|\nabla J|$ means the inversion has stalled near a saddle or is step-size-limited; it is not at a minimum. A growing $|\nabla J|$ with shrinking $J$ usually means a parameter blow-up.
Line-search step size vs iteration. The actual step taken per iteration. Healthy convergence: the step grows in early iterations, plateaus, and may shrink near the optimum. Cycle-skipping: the step shrinks to zero immediately as the line search fails. Poorly chosen initial step (step₀): the step oscillates between two scales.
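The four traces are cheap to record inside any iteration loop. The sketch below is one possible way to log them, assuming plain steepest descent with a backtracking (Armijo) line search and a toy linear forward problem `d = A @ c` in place of a wave solver. The function name `run_with_diagnostics`, the matrix `A`, and the tolerances are assumptions made for illustration, not the §6.2 implementation.

```python
import numpy as np

def run_with_diagnostics(misfit, grad, c_init, n_iter=30, step0=1.0):
    """Steepest descent with backtracking line search, logging four traces."""
    c_init = np.asarray(c_init, dtype=float)
    c = c_init.copy()
    log = {"J": [], "dc": [], "gnorm": [], "step": []}
    step = step0
    for _ in range(n_iter):
        J, g = misfit(c), grad(c)
        gnorm = float(np.linalg.norm(g))
        # Backtracking (Armijo) search: start above the last accepted step,
        # then halve until the misfit shows a sufficient decrease.
        step *= 2.0
        while step > 1e-12 and misfit(c - step * g) > J - 1e-4 * step * gnorm**2:
            step *= 0.5
        failed = step <= 1e-12
        if failed:
            step = 0.0  # record a failed line search as a zero step
        log["J"].append(float(J))
        log["dc"].append(float(np.linalg.norm(c - c_init)))
        log["gnorm"].append(gnorm)
        log["step"].append(step)
        if failed:
            break
        c = c - step * g
    return c, log

# Toy stand-in for the forward problem (NOT a wave solver): d = A @ c.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
c_true = np.array([1.5, 2.0])
d_obs = A @ c_true

def misfit(c):
    r = A @ c - d_obs
    return 0.5 * float(r @ r)

def grad(c):
    return A.T @ (A @ c - d_obs)

c_final, log = run_with_diagnostics(misfit, grad, c_init=[1.8, 1.8])
for name, trace in log.items():
    print(f"{name:>5}: first={trace[0]:.3e}  last={trace[-1]:.3e}")
```

Each entry is logged at the start of an iteration, so `log["dc"][0]` is zero and `log["step"][k]` is the step actually taken from iterate `k`. On this well-conditioned toy problem the run is healthy; a real FWI misfit would replace `misfit` and `grad` with forward and adjoint wave simulations.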

Reading the four signals together

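The diagnostic is not in any single trace; it is in their JOINT shape. Here model-error means the distance to the true model, which is known only in synthetic tests such as the toy problem: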

Healthy convergence: J↓ exponentially, model-error↓, |∇J|↓, step plateaus or grows.
Cycle-skipped: J flat, model-error flat, |∇J| flat (gradient sign wrong, magnitude unchanged), step → 0 (line search fails). The §6.4 / §6.5 cures apply.
Stuck at saddle: J slightly down, model-error slightly down, |∇J| flat at non-zero, step shrinking. Need step-size-control adjustment or quasi-Newton (L-BFGS).
Found wrong minimum: J↓, but model-error UP. Data fit improved by going AWAY from truth. The §6.6 cross-talk valley case. Multi-shot illumination needed.
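The four patterns above can be turned into a simple rule-based verdict. The following is a minimal sketch, not the widget's actual classifier: the function name `classify_run`, the 5% flatness tolerance, and the step-collapse thresholds are all assumptions that would need tuning per problem. The `model_err` argument is optional because it exists only in synthetic tests.

```python
import numpy as np

def classify_run(J, gnorm, step, model_err=None, flat_tol=0.05):
    """Rule-based verdict from the joint shape of the convergence traces."""
    J, gnorm, step = (np.asarray(x, dtype=float) for x in (J, gnorm, step))
    J_drop = 1.0 - J[-1] / J[0]            # fractional misfit reduction
    g_drop = 1.0 - gnorm[-1] / gnorm[0]    # fractional gradient-norm reduction
    step_shrinking = step[-1] < 0.5 * step.max()
    step_dead = step[-1] <= 1e-3 * step.max()

    # Data fit improved while the model moved away from truth.
    if model_err is not None and J_drop > flat_tol and model_err[-1] > model_err[0]:
        return "wrong minimum"
    # Nothing moves and the line search has collapsed.
    if J_drop <= flat_tol and g_drop <= flat_tol and step_dead:
        return "cycle-skipped"
    # Gradient stuck at a non-zero level while the step keeps shrinking.
    if g_drop <= flat_tol and step_shrinking:
        return "stuck at saddle or step-size-limited"
    if J_drop > flat_tol and g_drop > flat_tol and not step_dead:
        return "healthy"
    return "inconclusive"

# Synthetic traces, made up to exercise each branch.
k = np.arange(10)
healthy = classify_run(np.exp(-0.5 * k), np.exp(-0.25 * k), np.full(10, 0.1))
skipped = classify_run(np.ones(10), np.full(10, 2.0),
                       [0.1, 0.01, 0.001] + [0.0] * 7)
saddle = classify_run(np.linspace(1.0, 0.8, 10), np.full(10, 2.0), 0.1 * 0.7 ** k)
wrong = classify_run(np.exp(-0.5 * k), np.exp(-0.25 * k), np.full(10, 0.1),
                     model_err=np.linspace(0.3, 0.5, 10))
print(healthy, skipped, saddle, wrong, sep=" | ")
```

The rules are checked from most to least alarming, so a run that fits the data better while drifting from truth is flagged before it can be labelled healthy.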

Try it

Pick a starting c₂ and watch all four signals together:

$c_2 = 1.8$ or 2.0: healthy. All four traces tell the same story.
$c_2 = 0.7$ or 2.3: cycle-skipped. J flat or noisy, |∇J| flat, step → 0.
$c_2 = 1.55$ (close to truth): healthy and fast. Three iterations, done.

The verdict text auto-classifies the inversion outcome from the joint signal shape. In production this kind of automated diagnostics IS the inversion-quality controller — automated codes flag suspect runs for human review.

The defence-of-inversion checklist

Before publishing or shipping an FWI result:

$J$ reduced by ≥ 1 order of magnitude (robust); ideally ≥ 2 orders for clean acoustic data.
Final |∇J| at least 1 order smaller than initial.
Model-update trajectory smooth (no oscillations or jumps).
Line-search step well-behaved (no shrink-to-zero cliffs).
Final velocity model passes geological sanity check (no negative velocities, no high-frequency noise above seismic resolution, no impossibly fast layers).
Synthetic shot records from the inverted model overlay the observed data well at all receivers.
Holdout test: leave one shot out, invert on the rest, predict the holdout shot from the inverted model. Mismatch should be at the data-noise level.
Resolution analysis: the model is only resolved at the spatial scales the data illuminates. State this.
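The quantitative items on this checklist can be checked automatically from the logged traces. The sketch below covers only the misfit reduction, gradient reduction, line-search behaviour, and velocity bounds; the overlay, holdout, and resolution items need forward modelling and are left out. The function name `defence_report`, the step-cliff threshold, and the default velocity bounds (in km/s) are hypothetical choices for illustration.

```python
import numpy as np

def defence_report(J, gnorm, step, c_final, c_bounds=(0.3, 8.0)):
    """Check the trace-based checklist items; returns a dict of results."""
    J, gnorm, step = (np.asarray(x, dtype=float) for x in (J, gnorm, step))
    c_final = np.asarray(c_final, dtype=float)
    tiny = np.finfo(float).tiny  # guard against log10 of a zero misfit
    in_bounds = np.all((c_final >= c_bounds[0]) & (c_final <= c_bounds[1]))
    return {
        # Orders of magnitude by which the misfit fell.
        "J_orders_reduced": float(np.log10(J[0] / max(J[-1], tiny))),
        "J_reduced_1_order": bool(J[-1] <= 0.1 * J[0]),
        "grad_reduced_1_order": bool(gnorm[-1] <= 0.1 * gnorm[0]),
        # Assumed heuristic: a "cliff" is a step below 0.1% of the largest step.
        "no_step_cliff": bool(step.min() > 1e-3 * step.max()),
        # Assumed plausible velocity range; set c_bounds for the survey.
        "velocity_in_bounds": bool(in_bounds),
    }

# Made-up traces for a run that should pass.
report = defence_report(
    J=10.0 ** (-np.arange(5)),
    gnorm=[2.0, 1.0, 0.5, 0.1, 0.05],
    step=[0.1, 0.2, 0.2, 0.2, 0.1],
    c_final=[1.5, 2.0],
)
# Same traces, but a final model containing a negative velocity.
bad_report = defence_report(
    J=10.0 ** (-np.arange(5)),
    gnorm=[2.0, 1.0, 0.5, 0.1, 0.05],
    step=[0.1, 0.2, 0.2, 0.2, 0.1],
    c_final=[-1.0, 2.0],
)
for item, value in report.items():
    print(f"{item:>22}: {value}")
```

A passing report covers only the trace-based items; the remaining checklist items still have to be done by re-modelling and inspection.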

Closing of Part 6

You now have the complete classical-FWI / PINN-FWI workflow:

§6.1 The adjoint-state gradient. The atom of FWI.
§6.2 The iteration loop. Plessix gradient + line search + Tikhonov, applied repeatedly.
§6.3 Marmousi as the reference benchmark; smooth starting model construction.
§6.4 Frequency continuation as the curriculum cure for cycle skipping.
§6.5 Alternative misfits (envelope, Wasserstein, AWI) as the misfit-side cure.
§6.6 Multi-parameter cross-talk and its remedies (multi-shot, joint inversion, impedance reparameterisation).
§6.7 Source encoding: 100× compute saving via stochastic super-shots.
§6.8 Loss-weight balance: the Tikhonov / NTK / SA-PINN auto-tuning trio.
§6.9 Convergence diagnostics: how to know your inversion succeeded.

Part 7 (next) extends to TRAVEL-TIME and SURFACE-WAVE inversion: simpler PDE (eikonal), different physics, tooling complementary to FWI. Part 8 covers OPERATOR LEARNING (DeepONet, FNO): neural networks that learn the FORWARD wave-equation operator and provide GPU-friendly surrogates. Part 9 hybridises PINN-FWI with classical FWI and adds Bayesian uncertainty quantification.

Expertise checkpoint — end of Part 6

You should now be able to:

Derive the adjoint-state gradient $\partial J / \partial c(x) = -(2/c^3) \int u_{tt} \lambda dt$ from scratch.
Implement a classical FWI iteration loop in 100 lines of code (you have, in §6.2).
Read a Marmousi-class velocity model and recognise the structural ingredients that defeat naive FWI.
Apply frequency continuation, envelope misfit, source encoding, and loss-weight tuning to specific failure modes you can diagnose from the convergence traces.
Defend an FWI inversion to a sceptical audience using the four-diagnostic recipe.
Read a PINN-FWI paper (Sun-Alkhalifah, Rasht-Behesht, Song-Alkhalifah, Yang et al.) and place its contribution in the §6.1–§6.9 taxonomy.

References

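Plessix, R.-E. (2006). A review of the adjoint-state method for computing the gradient of a functional with geophysical applications. Geophysical Journal International 167(2), 495–503. Convergence indicators discussed throughout.
Métivier, L., Brossier, R. (2016). The SEISCOPE optimization toolbox: A large-scale nonlinear optimization library based on reverse communication. Geophysics 81(2), F1–F15. Reference convergence-criterion implementations.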
Hanke, M. (1995). Conjugate Gradient Type Methods for Ill-Posed Problems. Pitman Research Notes. Discrepancy-principle theory.
Brossier, R. (2009). Imagerie sismique à deux dimensions des milieux viscoélastiques par inversion des formes d'onde. PhD thesis. Production-FWI workflow with diagnostics.