PINN as regulariser, initialiser, or solver

Part 9 — Hybrid PINN + classical, with uncertainty

Learning objectives

Distinguish the three production roles a PINN can play in inverse-problem workflows
Identify which role suits a given data-availability / problem-conditioning regime
Race the three roles head-to-head on a 2-layer velocity inversion
Recognise that a smart starting model often beats a smart loss term
Set up §9.2 PINN-augmented FWI, §9.3 learned regularisers, §9.5 Bayesian PINNs

The first eight Parts treated PINNs as STAND-ALONE solvers for forward and inverse problems. Part 9 takes a different view: in production seismic-imaging pipelines, a PINN is rarely the whole story. It sits inside a larger workflow that also includes classical FWI, hand-crafted regularisers, ray tracing, conventional optimisation, and perhaps a final QC step. Within that workflow, a PINN can play THREE roles. Picking the right role for the right regime is the central design question of §9.x.

Role 1 — Solver

The PINN replaces the classical solver entirely. Examples: the §7.2 EikoNet PINN solves the eikonal equation in place of FSM; the §7.3 factored EikoNet does the same with several-fold better accuracy (0.9% versus 5% relative error). The PINN OUTPUT is the answer; classical methods are not used.

When this role wins: when the problem space is well-defined and the PINN reaches production accuracy. The factored eikonal at 0.9% relative error in §7.3 is the canonical example. Inference is a single forward pass — fast and differentiable.

When it loses: when the PINN cannot match classical-solver accuracy across the operating envelope. Vanilla EikoNet (§7.2) at 5% relative error is borderline; for 1% production tolerance, classical FSM is safer.

Role 2 — Initialiser

The PINN provides a STARTING MODEL for a downstream classical solver. The PINN output is fed in as a warm start; classical FWI then refines. The PINN does the heavy lifting of escaping bad initial conditions; the classical solver handles the precision endgame.

When this role wins: when classical FWI suffers from local-minimum / cycle-skipping pathologies (§6.5) and a half-decent starting model would unlock convergence. The PINN provides exactly that: a smooth, low-frequency initial guess that captures the gross subsurface structure. Classical FWI then sharpens it. This is the dominant production pattern for hybrid workflows in 2024-2026 industry codes.

When it loses: when the PINN initial guess is itself outside the convergence basin of the classical solver, or when classical FWI is fast and well-conditioned enough that any reasonable starting model would converge.
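The convergence-basin argument can be illustrated with a deliberately tiny toy problem. The sketch below is not taken from the course code: the single unknown time shift `tau`, the Gaussian-windowed 5 Hz wavelet, the step size, and the two starting values are all hypothetical choices. The "warm start" stands in for a PINN-provided initial model.

```python
import math

F, SIGMA, TAU_TRUE = 5.0, 0.3, 1.0            # hypothetical wavelet and true shift
DT = 0.005
TIMES = [i * DT for i in range(501)]           # 0.0 .. 2.5 s

def wavelet(t):
    """Gaussian-windowed cosine; its oscillations create cycle-skipped minima."""
    return math.exp(-(t / SIGMA) ** 2) * math.cos(2.0 * math.pi * F * t)

OBSERVED = [wavelet(t - TAU_TRUE) for t in TIMES]

def misfit(tau):
    """Least-squares waveform misfit for a trial time shift."""
    return DT * sum((wavelet(t - tau) - d) ** 2 for t, d in zip(TIMES, OBSERVED))

def refine(tau, n_iter=200, lr=0.002, h=1e-4):
    """Fixed-step gradient descent with a central-difference gradient."""
    for _ in range(n_iter):
        grad = (misfit(tau + h) - misfit(tau - h)) / (2.0 * h)
        tau -= lr * grad
    return tau

cold = refine(1.45)   # more than two periods (0.2 s each) away from the truth
warm = refine(1.05)   # within half a period, as a good initial model would be
print(f"cold start -> tau = {cold:.3f}, misfit = {misfit(cold):.4f}")
print(f"warm start -> tau = {warm:.3f}, misfit = {misfit(warm):.2e}")
```

In this toy the same "classical" refinement is applied to both starts; only the starting point differs. The cold start settles in a cycle-skipped local minimum, while the warm start reaches the true shift.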

Role 3 — Regulariser

The PINN-residual is added as a PENALTY TERM to the classical FWI loss:

$$\mathcal{L}_{\mathrm{hybrid}}(\theta) = \underbrace{\sum_{s, r} (T_{\mathrm{pred}} - t_{\mathrm{obs}})^2}_{\text{classical data misfit}} + \lambda \, \underbrace{\frac{1}{N_c}\sum_k (|\nabla T_{\mathrm{NN}}|^2 - 1/c^2)^2}_{\text{PINN-residual penalty}} .$$

The PINN output is NOT used directly; only the physics constraint encoded in its residual is. This converts the PINN into a SOFT prior: configurations that violate the eikonal cost more than configurations that obey it.
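As a minimal sketch of how the two terms of the hybrid loss combine, the snippet below evaluates them for a travel-time field `T(x, z)` and a velocity `c(x, z)` given as plain Python callables. The function names, the finite-difference gradient (a real PINN would use automatic differentiation), the homogeneous test medium, and the receiver and collocation layouts are all assumptions made for illustration.

```python
import math

def eikonal_penalty(T, c, points, h=1e-4):
    """Mean of (|grad T|^2 - 1/c^2)^2 over the collocation points."""
    total = 0.0
    for x, z in points:
        dT_dx = (T(x + h, z) - T(x - h, z)) / (2.0 * h)   # central differences
        dT_dz = (T(x, z + h) - T(x, z - h)) / (2.0 * h)
        total += (dT_dx ** 2 + dT_dz ** 2 - 1.0 / c(x, z) ** 2) ** 2
    return total / len(points)

def hybrid_loss(T, c, receivers, t_obs, points, lam):
    """Data misfit at the receivers plus lam times the eikonal penalty."""
    data = sum((T(x, z) - t) ** 2 for (x, z), t in zip(receivers, t_obs))
    return data + lam * eikonal_penalty(T, c, points)

def velocity(x, z):
    return 2.0                                # hypothetical homogeneous medium

def T_consistent(x, z):
    return math.hypot(x, z) / 2.0             # satisfies the eikonal for c = 2

def T_inconsistent(x, z):
    return math.hypot(x, z) / 1.5             # travel times of a slower medium

receivers = [(0.25 * i, 1.0) for i in range(1, 9)]
t_obs = [T_consistent(x, z) for x, z in receivers]
points = [(0.1 * i, 0.1 * j) for i in range(2, 11) for j in range(2, 11)]

pen_good = eikonal_penalty(T_consistent, velocity, points)
pen_bad = eikonal_penalty(T_inconsistent, velocity, points)
loss_bad = hybrid_loss(T_inconsistent, velocity, receivers, t_obs, points, lam=1.0)
print(f"penalty, consistent field:   {pen_good:.2e}")
print(f"penalty, inconsistent field: {pen_bad:.4f}")
print(f"hybrid loss, inconsistent field (lam = 1): {loss_bad:.4f}")
```

The consistent field incurs essentially no penalty, while the inconsistent one is charged $(1/1.5^2 - 1/2^2)^2 \approx 0.038$ per collocation point on top of its data misfit. That is the soft-prior behaviour described above.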

When this role wins: when training data is sparse and the inverse problem is ill-conditioned. The PINN-residual provides extra physical constraints that pull the solution toward feasible velocity models even where data does not. Especially valuable in early FWI cycles before phase coherence is established.

When it loses: when the data-misfit term is well-constrained on its own and the PINN-residual just slows convergence. Also when $\lambda$ is hard to choose — too small means no effect, too large biases away from data fit.

Try it: race the three roles

The widget runs three 50-iteration gradient-descent inversions on a 2-layer model (true: $z_{\mathrm{int}} = 0.5$, $v_1 = 1.5$, $v_2 = 3.0$). Each method tackles the same observed data (8 receivers at the bottom, 1 source at the top) but starts from a different place or uses a different loss:

Classical (red). Bad initial guess $(0.3, 2.0, 2.0)$. Pure data-misfit gradient. The "do nothing fancy" baseline.
Regularised (blue). Same bad initial guess. Loss = data misfit + a Tikhonov penalty on $(v_2 - v_1)^2$, a soft "smoothness" prior. Stand-in for a PINN-residual regulariser.
Initialised (green). Smart initial guess $(0.45, 1.6, 2.8)$, close to truth, as if a pretrained PINN had pre-cooked it. Pure data misfit from there.

Three panels: convergence trace (data misfit vs iteration, log y), final recovered parameters bar chart (vs truth in yellow), and recovered velocity profile $c(z)$ overlaid on truth.
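The race can be approximated offline with the minimal sketch below. Everything beyond the numbers quoted in the text is an assumption: the receiver offsets, the two-segment Fermat (minimum-time) ray used as forward model, the finite-difference gradients, the parameter clamps, and the backtracking step rule that keeps the descent stable. The widget's own implementation is not shown in this section and may differ.

```python
import math

TRUE = (0.5, 1.5, 3.0)                        # (z_int, v1, v2) from the text
OFFSETS = [0.1 * (i + 1) for i in range(8)]   # assumed receiver offsets at depth z = 1

def travel_time(params, x_r):
    """Minimum-time two-segment path from a source at (0, 0) to (x_r, 1)."""
    z, v1, v2 = params
    def t(x_c):                               # time if the ray crosses the interface at x_c
        return math.hypot(x_c, z) / v1 + math.hypot(x_r - x_c, 1.0 - z) / v2
    lo, hi = 0.0, x_r
    for _ in range(50):                       # ternary search; t is convex in x_c
        a, b = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if t(a) < t(b):
            hi = b
        else:
            lo = a
    return t(0.5 * (lo + hi))

T_OBS = [travel_time(TRUE, x) for x in OFFSETS]

def misfit(p):
    return sum((travel_time(p, x) - t) ** 2 for x, t in zip(OFFSETS, T_OBS))

def regularised(p, lam=0.05):                 # lam = 0.05 is the weak Tikhonov weight quoted in the text
    return misfit(p) + lam * (p[2] - p[1]) ** 2

def clamp(p):                                 # assumed bounds keeping the model physical
    return [min(max(p[0], 0.05), 0.95), max(p[1], 0.5), max(p[2], 0.5)]

def gradient(loss, p, h=1e-5):
    g = []
    for i in range(3):
        up, dn = list(p), list(p)
        up[i] += h
        dn[i] -= h
        g.append((loss(up) - loss(dn)) / (2.0 * h))
    return g

def descend(start, loss, n_iter=50):
    """Gradient descent; a step is accepted only if it lowers the loss."""
    p, history = list(start), [misfit(start)]
    for _ in range(n_iter):
        g, current, step = gradient(loss, p), loss(p), 1.0
        while step > 1e-8:
            trial = clamp([pi - step * gi for pi, gi in zip(p, g)])
            if loss(trial) < current:
                p = trial
                break
            step *= 0.5
        history.append(misfit(p))             # always track the pure data misfit
    return p, history

def param_error(p):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, TRUE)))

RUNS = {"classical": ((0.3, 2.0, 2.0), misfit),
        "regularised": ((0.3, 2.0, 2.0), regularised),
        "initialised": ((0.45, 1.6, 2.8), misfit)}
results = {}
for name, (start, loss) in RUNS.items():
    p, hist = descend(start, loss)
    results[name] = (p, hist)
    print(f"{name:12s} start error {param_error(start):.3f}  "
          f"final error {param_error(p):.3f}  final misfit {hist[-1]:.2e}")
```

The printed numbers depend on the assumed geometry and step rule, so they need not reproduce the widget's traces. The comparison of interest is each run's final parameter error against its starting error, alongside the data misfit it reaches.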

Expected outcome: Initialised (green) wins decisively — typically 4-5× lower parameter L² error than the bad-init runs. Classical (red) and Regularised (blue, weak Tikhonov λ=0.05) both stall at parameter error ~1.1, fitting the data well at the wrong velocities — the data misfit alone cannot distinguish the local-minimum trap from truth without a stronger prior or a better starting model. This is the central pedagogical point: a good initial model unlocks convergence in ways a weak regulariser cannot. §9.2 will show that a STRONGER hand-crafted prior (λ=1, depth-trend penalty) DOES rescue the bad init — but only after we choose the right prior shape and weight.
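A quick arithmetic check puts the quoted figures in context. It assumes that "parameter L² error" means the unnormalised Euclidean distance between $(z_{\mathrm{int}}, v_1, v_2)$ and the truth, which the text does not state explicitly.

```python
import math

truth = (0.5, 1.5, 3.0)
bad_init = (0.3, 2.0, 2.0)
smart_init = (0.45, 1.6, 2.8)

def l2_error(p, ref=truth):
    """Euclidean distance between a parameter triple and the reference."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, ref)))

err_bad = l2_error(bad_init)        # sqrt(0.04 + 0.25 + 1.00)
err_smart = l2_error(smart_init)    # sqrt(0.0025 + 0.01 + 0.04)
print(round(err_bad, 3), round(err_smart, 3), round(err_bad / err_smart, 2))
# prints: 1.136 0.229 4.96
```

Under that assumption the bad starting point already sits at error 1.136, which matches the quoted stall level of ~1.1. This suggests the bad-init runs end close to where they began in parameter space. The smart starting point is about 5× closer to the truth, of the same order as the quoted 4-5× gap in final errors.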

The decision matrix

Which role should a given problem use? The answer depends on three orthogonal questions:

Q1: How accurate does the answer need to be? If 5-10% relative error is acceptable, PINN-as-solver is often fine. For sub-1% tolerances, you generally need classical refinement after a PINN warm start.
Q2: How well-conditioned is the inverse problem? If the data constrain the solution well, classical FWI alone may suffice. If the data are sparse or the problem is ill-conditioned (early FWI cycles, low signal-to-noise), regularisation is essential, and the PINN residual is one of the most physically grounded regularisers available.
Q3: How expensive is each forward solve? If forward solves cost minutes (3D FWI on Marmousi-class), every iteration is precious; a PINN initialiser that saves 50 iterations is hugely valuable. If forward solves cost milliseconds (1D toy problems), the initialiser saves nothing meaningful.
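One possible way to encode the three questions is the small helper below. It is a hypothetical illustration, not a rule from the text: the function name, the 60-second threshold for an "expensive" forward solve, and the order in which the questions are checked are all assumptions, and real projects would weigh these factors less mechanically.

```python
def recommend_roles(pinn_rel_error, tolerance, well_conditioned,
                    forward_cost_s, expensive_s=60.0):
    """Suggest PINN roles from Q1-Q3; thresholds are illustrative only."""
    roles = []
    if pinn_rel_error <= tolerance:        # Q1: the PINN alone meets the tolerance
        roles.append("solver")
    elif forward_cost_s >= expensive_s:    # Q1 + Q3: a warm start saves costly iterations
        roles.append("initialiser")
    if not well_conditioned:               # Q2: sparse or noisy data call for a prior
        roles.append("regulariser")
    return roles or ["classical only"]

print(recommend_roles(0.009, 0.01, True, 0.001))   # ['solver']
print(recommend_roles(0.05, 0.01, False, 300.0))   # ['initialiser', 'regulariser']
print(recommend_roles(0.05, 0.01, True, 0.001))    # ['classical only']
```

The first call mirrors the factored-eikonal case (0.9% error against a 1% tolerance), the second an expensive and ill-conditioned FWI run, and the third a cheap, well-posed toy where a PINN adds little.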

For most production 2-D / 3-D FWI workflows on real data, the answer to all three questions points to HYBRID approaches that combine roles: train a PINN to provide an initial model AND add its residual as a regulariser AND use classical FWI for the precision endgame. §9.2 walks through this pattern in concrete detail.

What the rest of Part 9 will do

§9.2 PINN-augmented classical FWI. Concrete recipe: classical FWI gradient + PINN-residual regulariser, with explicit weight scheduling and convergence monitoring. The "regulariser" role made operational.
§9.3 Learned regularisers and ML priors. Generalises beyond PINN-residual to LEARNED priors (e.g., autoencoder reconstruction loss, denoising-diffusion guidance). The "regulariser" role becomes data-driven rather than physics-driven.
§9.4 Generative priors for velocity models. The state-of-the-art: train a generative model (VAE, diffusion) on a corpus of plausible velocity models, then use its likelihood as the regulariser. Powerful but data-hungry.
§9.5 Bayesian PINNs and ensemble PINNs. Quantify uncertainty in the PINN itself: weight-space Bayesian inference, dropout-as-Bayes, ensemble disagreement. Foundation for §9.6.
§9.6 Uncertainty-aware velocity inversion. Capstone: produce a velocity model AND an uncertainty map. The output a production seismic-interpretation team actually needs.

References

Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L. (2021). Physics-informed machine learning. Nat. Rev. Phys. 3, 422–440. The review that established the PINN-in-production framing.
Cuomo, S., Di Cola, V.S., Giampaolo, F., Rozza, G., Raissi, M., Piccialli, F. (2022). Scientific machine learning through physics-informed neural networks: where we are and what's next. J. Sci. Comput. 92(3), 88. Comprehensive review with explicit role taxonomy.
Sun, J., Innanen, K.A., Huang, C. (2021). Physics-guided deep learning for seismic inversion with hybrid training and uncertainty analysis. Geophysics 86(3), R303–R317. PINN-augmented FWI applied to real data.
Smith, J.D., Ross, Z.E., Azizzadenesheli, K., Muir, J.B. (2022). HypoSVI: Hypocentre inversion with Stein variational inference and physics informed neural networks. Geophys. J. Int. 228, 698–710. Production hybrid workflow with UQ.