Learned propagators for fast forward modelling

Part 8 — Operator learning for seismology

Learning objectives

  • Recognise the time-stepping operator as the natural target for FNO in seismology
  • Train a single-step propagator on the 1-D wave equation
  • Roll the learned propagator out for many timesteps and observe error growth
  • Distinguish stable from drifting learned propagators
  • Map the 1-D demo onto 2D/3D production wavefield surrogates

The wave-propagation problem in seismology is a TIME-STEPPING problem. FDTD takes a wavefield at time tt and applies a fixed update stencil to get the wavefield at t+Δtt + \Delta t. Run this update for NN timesteps to get a full simulation of duration NΔtN \Delta t. For 2D production-grade FDTD on Marmousi-class velocity models with thousands of shots, this is the dominant compute cost in any seismic-imaging or FWI pipeline.

Operator learning offers an alternative: TRAIN a network PNN:unun+1\mathcal{P}_{\mathrm{NN}}: u^n \mapsto u^{n+1} that learns the time-stepping operator from FDTD-generated examples. Inference is one forward pass per timestep — typically 101000×10\text{–}1000\times faster than FDTD per step on GPU. The amortised cost is dominated by training; once trained, the propagator can be applied to any new initial condition or velocity model within the training distribution at near-zero per-step cost.

Architecture choice: FNO is natural

For wavefield time-stepping, FNO (§8.3) is the natural architecture:

  • Wavefields live on REGULAR GRIDS — exactly FNO's natural domain.
  • The wave-propagation operator is approximately TRANSLATION-INVARIANT (in homogeneous medium it is exactly translation-invariant; in heterogeneous medium the local stencil is the same shape but with locally-varying coefficients).
  • Resolution-invariance means train at one grid spacing, evaluate at another — useful when survey grids change between projects.
  • The time-stepping operator naturally has a SECOND-ORDER recursion: un+1u^{n+1} depends on unu^n AND un1u^{n-1} (because the wave equation is second-order in time). FNO's multi-channel architecture handles this trivially.

For DeepONet, time-stepping is awkward — the trunk would need to take (x,n)(x, n) inputs and the basis dimension would need to be huge to span all time evolution. FNO is structurally a better fit.

The 1-D wave-equation case (analytic ground truth)

For the homogeneous-medium 1-D wave equation 2u/t2=c22u/x2\partial^2 u / \partial t^2 = c^2 \partial^2 u / \partial x^2 with Dirichlet BCs on [0,1][0, 1], the EXACT propagator is diagonal in the sine basis:

u^k(t+Δt)=2cos(ckπΔt)u^k(t)u^k(tΔt).\hat{u}_k(t + \Delta t) = 2 \cos(c k \pi \Delta t) \cdot \hat{u}_k(t) - \hat{u}_k(t - \Delta t) .

So a 1-layer FNO with two-channel input (un,un1)(u^n, u^{n-1}) and learnable coefficients αk,βk\alpha_k, \beta_k per Fourier mode should converge to

αk2cos(ckπΔt),βk1.\alpha_k \to 2 \cos(c k \pi \Delta t), \quad \beta_k \to -1 .

This is the cleanest possible test of "can the propagator architecture express what the physics requires". If the FNO can match the analytic propagator, we have a proper time-stepping surrogate; if not, we have a problem.

Try it: train + roll out 200 steps

Learned PropagatorInteractive figure — enable JavaScript to interact.

The widget:

  • Trains the 32 propagator coefficients (αk,βk\alpha_k, \beta_k for k=1..16k = 1..16) on random 5-mode initial conditions × ~25 timesteps each, with the analytic snapshot at each time as the training target. ~1-2 s on a modern laptop.
  • Rolls out the learned propagator from a fixed test IC for 200 timesteps, comparing against the analytic solution at each step.
  • Reports the rollout RMS error trajectory — the diagnostic for stability under composition.

Four panels:

  • Initial wavefield u(x,0)u(x, 0).
  • Wavefield at t=200Δtt = 200 \Delta t: exact (cyan dashed) vs learned-propagator rollout (orange).
  • Learned coefficients αk\alpha_k (orange) and βk\beta_k (purple) compared with exact values (cyan dots overlay).
  • Per-step rollout RMS error on log-y. A horizontal line means STABLE; a rising line means DRIFT.

Expected behaviour: because the 1-D wave operator is exactly diagonal in sine basis, the FNO converges to the exact coefficients to machine precision. The rollout error stays at ~1e-6 (float32 noise) for the entire 200 steps — completely stable, no drift. This is the IDEAL CASE for learned propagators.

Why drift happens — and why this case avoids it

The general worry with iterative neural propagators is that small per-step errors COMPOUND. If the learned operator PNN\mathcal{P}_{\mathrm{NN}} has spectral radius slightly greater than the true operator P\mathcal{P}, then after NN steps the error grows like

PNNNu0PNu0NPNNPu0(linear regime)\|\mathcal{P}_{\mathrm{NN}}^N u_0 - \mathcal{P}^N u_0\| \sim N \cdot \|\mathcal{P}_{\mathrm{NN}} - \mathcal{P}\| \cdot \|u_0\| \quad \text{(linear regime)}

or worse exponentially in NN if the spectral radius of PNN\mathcal{P}_{\mathrm{NN}} exceeds 1 even slightly. Production learned propagators (Lehmann et al 2024 F-FNO for 3D elastic waves; Geneva-Zabaras 2022 implicit neural operators) deal with this via three techniques:

  • Multi-step rollout training. Train on 5-50-step rollouts rather than single steps, so the loss penalises compounding error directly. Costly per epoch but dramatically improves long-horizon stability.
  • Spectral-radius regularisation. Add a penalty λmaxkK^k\lambda , \max_k |\hat{K}_k| to keep the learned operator non-amplifying. Soft analogue of the CFL stability condition.
  • Implicit time stepping. Rather than learning un+1=PNNunu^{n+1} = \mathcal{P}{\mathrm{NN}} u^n, learn an INCREMENT un+1=un+ΔtFNN(un)u^{n+1} = u^n + \Delta t \cdot \mathcal{F}{\mathrm{NN}}(u^n) where FNN\mathcal{F}{\mathrm{NN}} is small. Gradient with respect to unu^n is dominated by the identity, so even bad FNN\mathcal{F}{\mathrm{NN}} gives stable rollouts. Equivalent to neural-ODE perspectives.

Our 1-D demo dodges drift because the analytic operator IS exactly representable in the chosen architecture. For 2D heterogeneous-medium operators with multi-layer FNO, drift is real and these techniques matter.

Mapping to 2D/3D production

Industry deployment of learned propagators (Caltech, NVIDIA, Saudi Aramco) uses 2D/3D FNOs with multiple layers (~4-8) and channels (~32-128). Training data: thousands of FDTD simulations on diverse velocity models. Compute budget: 1-7 days on multi-GPU rigs. Once trained:

  • 2D wavefield evaluation: 10\sim 10 ms per timestep on a single GPU (vs FDTD's 100\sim 100 ms-1 s)
  • Backward AD through propagator: cheap, useful for FWI where wave-equation propagator is the inner loop
  • Out-of-distribution warning: if the test velocity model has structure unlike training data, prediction fails silently. Always cross-check with FDTD on at least one test case.

Lehmann et al 2024 demonstrated 3D elastic wave propagation with the F-FNO (Factorised FNO) running ~50× faster than spectral-element FDTD with <5%<5% error in body-wave amplitudes. This level of speedup makes Bayesian FWI workflows tractable that were previously cost-prohibitive.

What §8.5 will do

§8.5 wraps the operator-learning machinery into an INTERACTIVE PARAMETER EXPLORER. Users adjust velocity-model parameters, source positions, or boundary geometry on sliders and watch the operator network produce wavefields/travel-times in real time — at 60 fps. The amortisation argument from §8.1 made tactile.

References

  • Lehmann, F., Gatti, F., Bertin, M., Clouteau, D. (2024). 3D elastic wave propagation with a Factorized Fourier Neural Operator (F-FNO). Comput. Methods Appl. Mech. Eng. 420, 116718. Production 3D elastic-wave learned propagator.
  • Geneva, N., Zabaras, N. (2022). Transformers for modeling physical systems. Neural Networks 146, 272–289. Implicit time-stepping with neural operators.
  • Alkhalifah, T., Song, C., bin Waheed, U., Hao, Q. (2021). Wavefield solutions from machine learned functions constrained by the Helmholtz equation. Artificial Intelligence in Geosciences 2, 11–19. PINN-based Helmholtz wavefield modelling that informed learned-propagator practice.
  • Li, Z., Liu-Schiaffini, M., Kovachki, N., Liu, B., Azizzadenesheli, K., Bhattacharya, K., Stuart, A., Anandkumar, A. (2022). Learning chaotic dynamics in dissipative systems. NeurIPS 2022. Markovian neural operators for stable long-horizon rollouts.

This page is prerendered for SEO and accessibility. The interactive widgets above hydrate on JavaScript load.