Parametric PDE explorers in real time

Part 8 — Operator learning for seismology

Learning objectives

See the amortisation argument from §8.1 made tactile
Train a parametric DeepONet covering input-function AND operator parameters
Explore a 7-dimensional parameter space at 60 fps via sliders
Recognise where parametric operator surrogates make new workflows possible
Identify the limitations: training data coverage and parameter-space boundaries

The §8.1 amortisation argument said: pretrain ONCE on a parametric family of PDE solutions, then evaluate at any new parameter value inside the trained range at negligible cost. §8.5 puts that argument into practice. We build a single DeepONet covering the FULL 7-dimensional family $(a_1, a_2, a_3, a_4, a_5, \alpha, T)$ for the heat equation and attach one slider to each dimension. Every slider move triggers a single forward pass that takes milliseconds, so the corresponding heat-equation solution appears with no perceptible delay.

The parametric extension to DeepONet

Standard DeepONet (§8.2) takes one input function and one query coordinate. For a parametric family the standard trick is to CONCATENATE the parameters into the branch input:

$$\mathcal{G}_{\mathrm{NN}}[u_0; \theta_{\mathrm{op}}](x) = \sum_{k=1}^{K} B_k\bigl([u_0(\xi_1), \ldots, u_0(\xi_m), \theta_{\mathrm{op}}^{(1)}, \ldots, \theta_{\mathrm{op}}^{(p)}]\bigr) \cdot T_k(x),$$

where $\theta_{\mathrm{op}}$ is the vector of operator parameters (here $\alpha$ and $T$, so $p = 2$), and $B_k$ and $T_k$ are the $k$-th branch and trunk outputs. The subscripted trunk output $T_k$ is unrelated to the final time $T$. The branch network treats the parameters as additional input dimensions. With the parameters normalised (we map $\alpha \in [\alpha_{\min}, \alpha_{\max}]$ and $T \in [T_{\min}, T_{\max}]$ linearly to $[-1, 1]$), the branch handles the joint family without architectural change.

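For our 5-mode + 2-parameter family, the input function enters the branch through its five mode coefficients $a_1, \ldots, a_5$ in place of sensor values $u_0(\xi_i)$, so the branch input dimension is $5 + 2 = 7$. The trunk is unchanged from §8.2 (input = query coordinate $x$). The basis dimension $K = 16$ is larger than §8.2's $K = 8$ to accommodate the higher-dimensional input space.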
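The concatenation trick is easy to see in code. What follows is a minimal NumPy sketch of the forward pass only, with random (untrained) weights. The hidden width of 64, the tanh activation, and the helper names `to_unit_interval`, `init_mlp`, `mlp` and `deeponet` are assumptions made for illustration; the widget's actual architecture is not specified here.

```python
import numpy as np

ALPHA_RANGE = (0.01, 0.20)   # slider range for the diffusion coefficient
T_RANGE = (0.05, 0.80)       # slider range for the final time
K = 16                       # number of branch/trunk basis functions

def to_unit_interval(value, lo, hi):
    """Map value in [lo, hi] linearly onto [-1, 1]."""
    return 2.0 * (value - lo) / (hi - lo) - 1.0

def init_mlp(rng, sizes):
    """Random (untrained) weights for a fully connected network."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """tanh MLP with a linear output layer (assumed architecture)."""
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

def deeponet(branch, trunk, a, alpha, T, x):
    """Evaluate sum_k B_k([a, alpha, T]) * T_k(x) at the query points x."""
    theta = np.array([to_unit_interval(alpha, *ALPHA_RANGE),
                      to_unit_interval(T, *T_RANGE)])
    branch_in = np.concatenate([a, theta])   # shape (7,): 5 modes + 2 parameters
    b = mlp(branch, branch_in)               # shape (K,)
    t = mlp(trunk, x[:, None])               # shape (n_query, K)
    return t @ b                             # shape (n_query,)

rng = np.random.default_rng(0)
branch = init_mlp(rng, [7, 64, 64, K])       # hidden width 64 is an assumption
trunk = init_mlp(rng, [1, 64, 64, K])
x = np.linspace(0.0, 1.0, 101)
u_pred = deeponet(branch, trunk, np.array([1.0, 0.0, -0.5, 0.0, 0.2]), 0.05, 0.3, x)
print(u_pred.shape)
```

Nothing in the network distinguishes the two operator parameters from the five mode coefficients; they are simply two more entries in the branch input, which is why no architectural change is needed.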

Try it: 7-D heat-operator playground

The widget pretrains a DeepONet on the 7-D family for 2500 epochs (~5-8 s in browser). After training:

Five sliders for the input function (a₁ through a₅) — each ranges in [-1, 1].
One slider for diffusion $\alpha$ ∈ [0.01, 0.20] — small $\alpha$ keeps high-frequency content; large $\alpha$ smooths everything quickly.
One slider for final time $T$ ∈ [0.05, 0.80] — like rolling out a learned propagator (§8.4) but with a continuous time control.

Move ANY slider. The output panel updates within milliseconds, with the DeepONet prediction (orange) overlaid on the exact analytic answer (cyan dashed). The two should track each other across the whole slider range, not just at the parameter values sampled during training.
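The exact reference curve is cheap to compute when the input family is a sine series. The sketch below assumes $u_0(x) = \sum_{k=1}^{5} a_k \sin(k\pi x)$ on $[0, 1]$ with zero Dirichlet boundary values, which gives $u(x, T) = \sum_k a_k e^{-\alpha (k\pi)^2 T} \sin(k\pi x)$. The basis and the boundary conditions are assumptions (the section does not state them), and `heat_exact` is a hypothetical helper name.

```python
import numpy as np

def heat_exact(a, alpha, T, x):
    """Exact solution of u_t = alpha * u_xx on [0, 1] with u = 0 at both ends,
    for the initial condition u0(x) = sum_k a[k-1] * sin(k * pi * x)."""
    k = np.arange(1, len(a) + 1)                    # mode numbers 1..5
    decay = np.exp(-alpha * (k * np.pi) ** 2 * T)   # per-mode damping factor
    modes = np.sin(np.pi * np.outer(x, k))          # shape (n_x, n_modes)
    return modes @ (np.asarray(a) * decay)

x = np.linspace(0.0, 1.0, 201)
a = np.array([1.0, 0.0, 0.0, 0.0, 1.0])             # mode 1 plus mode 5

u_small = heat_exact(a, alpha=0.01, T=0.05, x=x)    # both sliders at their minimum
u_large = heat_exact(a, alpha=0.20, T=0.80, x=x)    # both sliders at their maximum

k = np.arange(1, 6)
print(np.exp(-0.01 * (k * np.pi) ** 2 * 0.05))      # every mode keeps most of its amplitude
print(np.exp(-0.20 * (k * np.pi) ** 2 * 0.80))      # mode 1 survives, mode 5 is damped to ~0
```

Because the damping exponent grows with $k^2$, the high modes disappear first as $\alpha$ or $T$ increases, which is the smoothing behaviour the $\alpha$ slider shows.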

What this enables (and what it cannot)

Real-time exploration of the 7-D space is the headline application of operator learning. With a conventional PDE solver, exploring a 7-D parameter space means running the solver thousands of times: minutes per snapshot, days for the full sweep. With a pretrained operator network, the same exploration runs at 60 fps. Workflows that were previously impractical become routine:

Bayesian inference on PDE parameters. MCMC for posterior sampling needs ~100,000 forward evaluations. With FDTD that's a week of compute; with an operator surrogate, minutes. The bottleneck moves from forward solving to MCMC mixing.
Real-time monitoring. Time-lapse seismic surveys re-image the subsurface monthly. Each new survey runs through the pretrained operator in milliseconds, with no FDTD rerun and no PINN retraining.
Interactive design. Geophysicists can adjust a velocity model or source geometry on screen and see the corresponding wavefield in real time, supporting interactive case-study analysis.
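Live UQ visualisation. Sample 1000 velocity models from the prior and propagate each through the operator network; render the ensemble in real time. The ensemble variance shows where the prediction is most uncertain under the prior (and, if the samples come from a posterior, where the data is uninformative); the ensemble mean gives the expected prediction.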
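The ensemble workflow in the last item can be sketched on the 7-D heat family. In the snippet below, `surrogate_batch` is a hypothetical stand-in for a trained operator network: it evaluates the closed-form sine-series solution (an assumed form of the family) so that the example runs without a trained model. The uniform prior over the slider ranges is also an assumption.

```python
import numpy as np

def surrogate_batch(a, alpha, T, x):
    """Stand-in for a trained operator network, evaluated for a whole batch.
    a: (n, 5), alpha: (n,), T: (n,), x: (n_x,)  ->  (n, n_x)."""
    k = np.arange(1, a.shape[1] + 1)
    decay = np.exp(-alpha[:, None] * (k * np.pi) ** 2 * T[:, None])   # (n, 5)
    modes = np.sin(np.pi * np.outer(k, x))                            # (5, n_x)
    return (a * decay) @ modes

rng = np.random.default_rng(0)
n = 1000                                    # ensemble size from the text
x = np.linspace(0.0, 1.0, 101)

# Assumed prior: uniform over the slider ranges of the playground.
a = rng.uniform(-1.0, 1.0, size=(n, 5))
alpha = rng.uniform(0.01, 0.20, size=n)
T = rng.uniform(0.05, 0.80, size=n)

ensemble = surrogate_batch(a, alpha, T, x)  # one batched call, shape (1000, 101)
mean = ensemble.mean(axis=0)                # pointwise ensemble mean
var = ensemble.var(axis=0)                  # pointwise ensemble variance
print(ensemble.shape, float(var.max()))
```

The whole ensemble is one batched evaluation, which is what makes re-rendering it on every slider move feasible; the same batching is what an MCMC loop with a surrogate forward model would exploit.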

Limitations are real, however:

Out-of-distribution inputs fail silently. If the user pushes a slider past the training range (here $\alpha > 0.2$ or $T > 0.8$), the network extrapolates and the answer is unreliable, with no warning. Production systems should clamp slider ranges or display an OOD warning.
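Training-data coverage matters. A tensor grid with only 10 points per axis would already need $10^7$ samples in 7-D, so any practical training set covers the space sparsely, and sparse coverage means high error in under-sampled corners. Latin-hypercube sampling spreads a fixed sample budget more evenly than independent uniform draws.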
Accuracy is non-uniform inside the training range. Even within the nominal range, points near the corners and edges of the 7-cube have fewer neighbouring training samples than points near the centre, so errors tend to be larger there. Diagnostic: retrain with more samples and check where the slider error stops changing.
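Two of the mitigations above, Latin-hypercube sampling of the training parameters and a range check before evaluation, fit in a few lines. This is a minimal sketch: the function names `latin_hypercube`, `out_of_range` and `clamp` are hypothetical, the sampler is a plain NumPy implementation and not a library routine, and the sample count of 2000 is an arbitrary choice.

```python
import numpy as np

# Slider ranges from the playground, in the order a1..a5, alpha, T.
LOWER = np.array([-1.0] * 5 + [0.01, 0.05])
UPPER = np.array([1.0] * 5 + [0.20, 0.80])
NAMES = ["a1", "a2", "a3", "a4", "a5", "alpha", "T"]

def latin_hypercube(n_samples, lower, upper, rng):
    """One sample per stratum in every dimension, strata paired at random."""
    dim = len(lower)
    # Row i starts in stratum i of every dimension, jittered inside the stratum.
    unit = (np.arange(n_samples)[:, None] + rng.random((n_samples, dim))) / n_samples
    for j in range(dim):
        unit[:, j] = rng.permutation(unit[:, j])   # decouple the dimensions
    return lower + unit * (upper - lower)

def out_of_range(theta, lower=LOWER, upper=UPPER):
    """Names of the parameters outside the training box (empty list if none)."""
    return [name for name, v, lo, hi in zip(NAMES, theta, lower, upper)
            if v < lo or v > hi]

def clamp(theta, lower=LOWER, upper=UPPER):
    """Project a parameter vector back onto the training box."""
    return np.clip(theta, lower, upper)

rng = np.random.default_rng(0)
train_params = latin_hypercube(2000, LOWER, UPPER, rng)   # shape (2000, 7)

query = np.array([0.3, 0.0, 0.0, 0.0, 0.0, 0.25, 0.5])    # alpha = 0.25 exceeds 0.20
print(out_of_range(query))   # ['alpha']
print(clamp(query)[5])       # alpha pulled back to the upper bound
```

A user interface could call `out_of_range` on every slider event and either show a warning or evaluate the network at `clamp(query)`; which of the two is preferable depends on whether silently altered inputs are acceptable in the application.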

Mapping to seismic applications

Real seismic operators take input functions of much higher dimensionality:

2-D velocity model $c(x, z)$ : 256 × 128 = 32k input pixels, plus source position (2 scalars), plus optional density and Q parameters per pixel
3-D velocity model: millions of pixels

The DeepONet branch becomes correspondingly large. To keep training tractable, production codes use:

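POD-DeepONet (Lu et al. 2022): compress a high-dimensional function space onto its leading POD (principal-component) modes, so the network only has to learn low-rank coefficients. The published variant builds the trunk from POD modes of the output functions; the same projection can be applied to the input function before it enters the branch.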
FNO instead of DeepONet (§8.3): the spectral-convolution architecture handles full-grid inputs naturally without flattening to a giant branch input.
Geometry-Informed Neural Operator (GINO; Li et al. 2023): generalises FNO to operate on irregular meshes.
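The POD pre-projection of input functions mentioned in the first item can be illustrated with a plain SVD. The sketch below uses synthetic smooth 1-D profiles as a stand-in for velocity models; the data, the rank of 8, and the helper names `fit_pod`, `encode` and `decode` are assumptions chosen so the example is exactly low-rank and easy to check.

```python
import numpy as np

def fit_pod(snapshots, rank):
    """snapshots: (n_samples, n_grid). Returns the mean, the leading POD modes
    (rows, orthonormal) and all singular values."""
    mean = snapshots.mean(axis=0)
    # Rows of vt are orthonormal directions, ordered by singular value.
    _, s, vt = np.linalg.svd(snapshots - mean, full_matrices=False)
    return mean, vt[:rank], s

def encode(fields, mean, modes):
    """Project fields onto the POD modes: (n, n_grid) -> (n, rank)."""
    return (fields - mean) @ modes.T

def decode(coeffs, mean, modes):
    """Rebuild fields from POD coefficients: (n, rank) -> (n, n_grid)."""
    return coeffs @ modes + mean

# Synthetic stand-in for a set of input functions: smooth random 1-D profiles.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 512)
k = np.arange(1, 9)
basis = np.sin(np.pi * np.outer(k, x))            # 8 smooth shapes, (8, 512)
snapshots = rng.normal(size=(300, 8)) @ basis     # (300, 512), rank at most 8

mean, modes, s = fit_pod(snapshots, rank=8)
coeffs = encode(snapshots, mean, modes)           # (300, 8): reduced branch input
error = np.abs(decode(coeffs, mean, modes) - snapshots).max()
print(coeffs.shape, error)
```

Here the branch input shrinks from 512 grid values to 8 coefficients with a reconstruction error at round-off level, because the synthetic data is exactly rank 8. Real velocity models are only approximately low-rank, so the retained rank trades compression against reconstruction error, and the decay of the singular values `s` is the usual guide for choosing it.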

The 7-D playground above is the toy model. Production parametric explorers work with hundreds of input dimensions, typically after compression, and evaluation after training is still a single forward pass, so they can stay interactive.

What §8.6 will do

§8.6 closes Part 8 with the HONEST cost-benefit analysis: when does operator learning beat per-instance training, and when does it lose? We compute the crossover N* explicitly, discuss training-data scaling, identify out-of-distribution failure modes, and look at hybrid approaches that combine the strengths of operators and PINNs.

References

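Lu, L., Jin, P., Pang, G., Zhang, Z., Karniadakis, G.E. (2021). Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nat. Mach. Intell. 3, 218-229. The original DeepONet paper.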
Wang, S., Wang, H., Perdikaris, P. (2023). Long-time integration of parametric evolution equations with physics-informed DeepONets. J. Comput. Phys. 475, 111855. Long-horizon parametric evolution.
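Lu, L., Meng, X., Cai, S., et al. (2022). A comprehensive and fair comparison of two neural operators (with practical extensions) based on FAIR data. Comput. Methods Appl. Mech. Eng. 393, 114778. Introduces POD-DeepONet.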
Brandstetter, J., Worrall, D., Welling, M. (2022). Message Passing Neural PDE Solvers. ICLR 2022. Time-stepped graph neural operators with parametric PDE coverage.