Multi-scale failure modes in pure forward PINN

Part 5 — Forward modelling and where PINNs fall short

Learning objectives

  • Sweep the spatial frequency k of the source/IC and watch PINN's error grow
  • Confirm that FDTD's error grows only mildly with k (as long as Δx resolves the wavelength)
  • Connect this to the §0.9, §2.2, §3.1 spectral-bias diagnosis
  • Recognise that architectural fixes (Fourier features, SIREN) help but don't close the gap

§5.3 showed FDTD beating PINN on a fixed problem. This section asks: how does the gap scale with the difficulty of the problem? The answer: it widens dramatically as the wavefield gets multi-scale.

The setup

Same 1D wave problem, but with eigenmode IC $u(x, 0) = \sin(k \pi x)$ for $k \in \{1, 2, 3, 4, 6\}$. Analytic solution: $u(x, t) = \sin(k \pi x) \cos(k \pi t)$. Both temporal and spatial frequency scale linearly with $k$.

  • FDTD at $N_x = 400$ (well above Nyquist for all $k \le 6$, where Nyquist requires $N_x \geq 2 \cdot 2k = 4k$).
  • PINN at a fixed 1500 epochs with a vanilla 2-32-32-1 Tanh MLP (no Fourier features).
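
The FDTD half of this sweep can be sketched in a few lines of NumPy. This is a minimal sketch, not the code behind the figure: it assumes a unit domain $x \in [0, 1]$ with fixed (zero) ends, wave speed $c = 1$, zero initial velocity, a Courant number of 0.5, a final time of 1, and a relative L² error accumulated over all space-time grid points. The function name `fdtd_eigenmode_error` and its parameters are hypothetical.

```python
import numpy as np

def fdtd_eigenmode_error(k, nx=400, courant=0.5, t_end=1.0):
    """Space-time relative L2 error of leapfrog FDTD for u(x, 0) = sin(k*pi*x).

    Assumptions (not from the source): x in [0, 1], c = 1, zero initial
    velocity, u = 0 at both ends.
    """
    x = np.linspace(0.0, 1.0, nx + 1)          # nx cells, nx + 1 nodes
    dx = x[1] - x[0]
    n_steps = int(np.ceil(t_end / (courant * dx)))
    dt = t_end / n_steps                       # land exactly on t_end
    r2 = (dt / dx) ** 2                        # squared Courant number (c = 1)
    mode = np.sin(k * np.pi * x)

    u_prev = mode.copy()                       # field at t = 0
    u = u_prev.copy()                          # field at t = dt (Taylor start)
    u[1:-1] += 0.5 * r2 * (u_prev[2:] - 2.0 * u_prev[1:-1] + u_prev[:-2])

    err_sq = 0.0                               # error is zero at t = 0
    ref_sq = np.sum(mode ** 2)
    for n in range(1, n_steps + 1):
        exact = mode * np.cos(k * np.pi * n * dt)
        err_sq += np.sum((u - exact) ** 2)
        ref_sq += np.sum(exact ** 2)
        # Second-order central differences in time and space.
        u_next = np.zeros_like(u)              # boundary nodes stay at 0
        u_next[1:-1] = (2.0 * u[1:-1] - u_prev[1:-1]
                        + r2 * (u[2:] - 2.0 * u[1:-1] + u[:-2]))
        u_prev, u = u, u_next
    return np.sqrt(err_sq / ref_sq)

errors = {k: fdtd_eigenmode_error(k) for k in (1, 2, 3, 4, 6)}
for k, err in errors.items():
    print(f"k = {k}: relative L2 error = {err:.2e}")
```

The absolute numbers depend on the assumed Courant number and final time, so they need not match the interactive figure exactly; the trend with $k$ is the point.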

Try it

[Interactive figure: Multi-Scale Failure]

What you should observe

  • FDTD relative-L² grows mildly with $k$ (numerical dispersion is $O(k^2 \Delta x^2)$ for 2nd-order central differences). At $N_x = 400$ this means a few $\times 10^{-5}$ at $k = 1$, rising to ~$10^{-3}$ to $10^{-2}$ at $k = 6$. Still at or below roughly 1%.
  • PINN relative-L² climbs much more steeply: ~10–20% at $k = 1$, saturating near 100% (random-output level) for $k \ge 2$. The vanilla MLP at 1500 epochs simply cannot fit the high-frequency target: spectral bias.
  • The PINN/FDTD ratio is dominated by the PINN failure: 100× to 1000+× across the range. The spectral-bias problem is the dominant scaling story; FDTD's $O(k^2 \Delta x^2)$ dispersion is a footnote by comparison.
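
For reference, the $O(k^2 \Delta x^2)$ dispersion claim can be worked out explicitly. The sketch below assumes the standard leapfrog scheme (second-order central differences in both time and space) with Courant number $C = c\,\Delta t / \Delta x$; the source only specifies 2nd-order central differences, so the time discretisation is an assumption, and $\omega_h$ (the numerical angular frequency) and $\kappa = k\pi$ (the spatial wavenumber of the eigenmode) are notation introduced here.

```latex
% Discrete dispersion relation of the leapfrog scheme
\sin\!\left(\frac{\omega_h \Delta t}{2}\right)
  = C \,\sin\!\left(\frac{\kappa \Delta x}{2}\right),
\qquad \kappa = k\pi, \quad C = \frac{c\,\Delta t}{\Delta x}

% Taylor-expanding both sides for small \kappa \Delta x
\frac{\omega_h}{c\,\kappa}
  = 1 - \frac{1 - C^2}{24}\,(\kappa \Delta x)^2
    + O\!\left((\kappa \Delta x)^4\right)
```

On a unit domain with $N_x = 400$ (so $\Delta x = 1/400$, again an assumption about the domain length), $(\kappa \Delta x)^2 / 24$ is about $2.6 \times 10^{-6}$ at $k = 1$ and about $9 \times 10^{-5}$ at $k = 6$. The relative phase-speed error therefore stays tiny across the sweep, although the accumulated phase error over a fixed time picks up one more factor of $k$.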

The diagnosis

This is pathology #2 from §3.1 (spectral bias) reasserting itself in seismic forward-modelling clothes. The Tancik 2020 NTK theory predicts that the convergence time of mode $k$ scales like the inverse of the NTK eigenvalue at frequency $k$, and that eigenvalue decays rapidly as $k$ grows. So vanilla MLPs simply cannot resolve high-frequency content within any practical training budget.
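
The inverse-eigenvalue scaling can be made explicit with the standard linearised (kernel-regime) training dynamics. This is a sketch for plain least-squares regression under gradient flow with learning rate $\eta$. The symbols $K$ (the NTK Gram matrix on the training points), $\lambda_i$ and $q_i$ (its eigenvalues and eigenvectors), $r$ (the training residual) and $\varepsilon$ (a target reduction factor) are notation introduced here. A PINN loss adds PDE-residual blocks to the kernel, as analysed by Wang et al. (2022), but the mode-by-mode picture is the same.

```latex
\frac{d r}{d t} = -\eta\, K\, r(t),
\qquad
K = \sum_i \lambda_i\, q_i q_i^{\top}
\;\Longrightarrow\;
q_i^{\top} r(t) = e^{-\eta \lambda_i t}\, q_i^{\top} r(0)

% Time needed to shrink component i of the residual by a factor \varepsilon
t_i(\varepsilon) = \frac{\ln(1/\varepsilon)}{\eta\, \lambda_i}
```

A mode whose eigenvalue is 1000× smaller needs 1000× more training time to reach the same tolerance, which is why a fixed 1500-epoch budget fits $k = 1$ partially and higher $k$ barely at all.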

Architectural fixes from Part 2 partially help:

  • Fourier features (§2.2): $\gamma(x) = [\sin(2\pi B x), \cos(2\pi B x)]$ with random $B \sim \mathcal{N}(0, \sigma^2)$. Choose $\sigma$ to cover the $k$ band. Flattens the convergence curve substantially. Still slower than FDTD by ~100×.
  • SIREN (§2.3): $\sin(\omega_0 x)$ activations with carefully tuned $\omega_0$. Similar effect.
  • Multi-scale Fourier (§2.5): combine multiple $\sigma$ values. Best for genuinely multi-scale targets.
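
The Fourier-feature mapping and its multi-scale variant can be sketched as an input encoding that sits in front of the MLP. This is a minimal NumPy sketch under stated assumptions: the helper names `sample_frequencies` and `fourier_features`, the number of frequencies per scale, and the example $\sigma$ values are all hypothetical, and only the encoding is shown, not the network or its training. Because the features are $\sin(2\pi B x)$, the eigenmode $\sin(k\pi x)$ corresponds to $B = k/2$, so covering $k \le 6$ suggests a $\sigma$ of order 3.

```python
import numpy as np

def sample_frequencies(d, m, sigmas, seed=0):
    """Draw m random frequency vectors per scale, entries ~ N(0, sigma^2).

    Passing one sigma gives plain Fourier features; passing several
    concatenates the banks into a multi-scale encoding.
    Returns an array of shape (d, m * len(sigmas)).
    """
    rng = np.random.default_rng(seed)
    banks = [rng.normal(0.0, s, size=(d, m)) for s in sigmas]
    return np.concatenate(banks, axis=1)

def fourier_features(x, B):
    """gamma(x) = [sin(2*pi*x@B), cos(2*pi*x@B)].

    x has shape (n, d), B has shape (d, m_total); output is (n, 2 * m_total).
    """
    proj = 2.0 * np.pi * (x @ B)
    return np.concatenate([np.sin(proj), np.cos(proj)], axis=1)

# Example: encode (x, t) collocation points with two scales (assumed values).
points = np.random.default_rng(1).uniform(size=(5, 2))   # columns: x, t
B = sample_frequencies(d=2, m=8, sigmas=(1.0, 3.0))
encoded = fourier_features(points, B)
print(encoded.shape)
```

The encoded array replaces the raw $(x, t)$ input, so the first MLP layer would take `2 * m * len(sigmas)` inputs instead of 2. `B` is drawn once and kept fixed during training in the usual random-feature setup.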

None of these close the cost gap with FDTD for forward modelling. They do, however, open the door to inverse problems where the network represents the unknown velocity field, not the wavefield (Part 6).

References

  • Tancik, M., et al. (2020). Fourier features let networks learn high frequency functions in low dimensional domains. NeurIPS.
  • Wang, S., Yu, X., Perdikaris, P. (2022). When and why PINNs fail to train: A neural tangent kernel perspective. JCP 449. The NTK analysis of spectral bias in PINN.
