Multi-scale failure modes in pure forward PINN
Learning objectives
- Sweep the spatial frequency k of the source/IC and watch PINN's error grow
- Confirm that FDTD's accuracy is nearly k-independent (as long as Δx resolves the wavelength)
- Connect this to the §0.9, §2.2, §3.1 spectral-bias diagnosis
- Recognise that architectural fixes (Fourier features, SIREN) help but don't close the gap
§5.3 showed FDTD beating PINN on a fixed problem. This section asks: how does the gap scale with the difficulty of the problem? The answer: it widens dramatically as the wavefield gets multi-scale.
The setup
Same 1D wave problem, but with an eigenmode initial condition of spatial frequency k, for which the solution is known analytically. Both the temporal and the spatial frequency scale linearly with k.
- FDTD on a fixed grid (well above Nyquist for all k, where Nyquist requires at least two grid points per wavelength).
- PINN trained for a fixed 1500 epochs with a vanilla 2-32-32-1 tanh MLP (no Fourier features).
Try it
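A minimal sketch of the FDTD half of the sweep, using only the Python standard library. The concrete choices here are assumptions for illustration, not values taken from this section: unit domain with fixed ends, wave speed c = 1, initial condition sin(kπx) with zero initial velocity, 200 grid cells, Courant number 0.5, final time 1, and the sweep k = 1, 2, 4, 8. The error is the relative L² norm accumulated over the whole space-time grid. The PINN half needs an autodiff framework and is not sketched here.

```python
import math


def fdtd_eigenmode_error(k, nx=200, c=1.0, t_end=1.0, courant=0.5):
    """Space-time relative L2 error of leapfrog FDTD for the mode sin(k*pi*x)."""
    dx = 1.0 / nx
    # Step count chosen so that c*dt/dx <= courant and the last step lands on t_end.
    n_steps = math.ceil(t_end * c / (courant * dx) - 1e-9)
    dt = t_end / n_steps
    r2 = (c * dt / dx) ** 2  # squared Courant number
    kappa = k * math.pi      # wavenumber of the eigenmode (assumed unit domain)

    shape = [math.sin(kappa * i * dx) for i in range(nx + 1)]
    shape[0] = shape[nx] = 0.0  # fixed (Dirichlet) ends, exactly zero

    def squared_norms(u, t):
        """Return (||u - u_exact||^2, ||u_exact||^2) at time t."""
        amp = math.cos(kappa * c * t)
        err = sum((ui - amp * si) ** 2 for ui, si in zip(u, shape))
        ref = sum((amp * si) ** 2 for si in shape)
        return err, ref

    u_prev = shape[:]          # u at t = 0
    u = [0.0] * (nx + 1)       # u at t = dt, from a Taylor step with zero velocity
    for i in range(1, nx):
        lap = u_prev[i + 1] - 2.0 * u_prev[i] + u_prev[i - 1]
        u[i] = u_prev[i] + 0.5 * r2 * lap

    err2, ref2 = squared_norms(u_prev, 0.0)
    e, r = squared_norms(u, dt)
    err2, ref2 = err2 + e, ref2 + r

    for n in range(2, n_steps + 1):
        u_next = [0.0] * (nx + 1)
        for i in range(1, nx):
            lap = u[i + 1] - 2.0 * u[i] + u[i - 1]
            u_next[i] = 2.0 * u[i] - u_prev[i] + r2 * lap
        u_prev, u = u, u_next
        e, r = squared_norms(u, n * dt)
        err2, ref2 = err2 + e, ref2 + r

    return math.sqrt(err2 / ref2)


if __name__ == "__main__":
    for k in (1, 2, 4, 8):  # hypothetical sweep values
        print(f"k={k}: FDTD relative L2 error = {fdtd_eigenmode_error(k):.2e}")
```

Under these assumptions the printed error should rise with k while remaining small, because the grid still resolves every wavelength in the sweep. Shrinking `nx` toward the Nyquist limit is a quick way to see where that stops being true.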
What you should observe
- FDTD relative-L² grows mildly with k (numerical dispersion is O((kΔx)²) for 2nd-order central differences) but stays well under 1% across the sweep.
- PINN relative-L² climbs much more steeply: ~10–20% at low k, saturating near 100% (random-output level) at high k. The vanilla MLP at 1500 epochs simply cannot fit the high-frequency target: this is spectral bias.
- The PINN/FDTD ratio is dominated by the PINN failure: 100× to 1000+× across the range. The spectral-bias problem is the dominant scaling story; FDTD's dispersion is a footnote by comparison.
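The mild FDTD growth can be made concrete with the standard dispersion relation of the 2nd-order leapfrog scheme. The symbols below are assumptions introduced for this sketch: κ is the wavenumber of the mode (proportional to k), C = cΔt/Δx is the Courant number, and ω_num is the frequency the scheme actually propagates.

$$
\sin\!\left(\frac{\omega_{\text{num}}\,\Delta t}{2}\right) = C\,\sin\!\left(\frac{\kappa\,\Delta x}{2}\right)
\quad\Longrightarrow\quad
\frac{\omega_{\text{num}}}{c\,\kappa} = 1 - \frac{(\kappa\,\Delta x)^2\,(1 - C^2)}{24} + O\!\left((\kappa\,\Delta x)^4\right)
$$

The relative phase-speed error is therefore quadratic in κΔx. Doubling k on a fixed grid roughly quadruples it, and the error stays small as long as κΔx is well below 1.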
The diagnosis
This is pathology #2 from §3.1 (spectral bias) reasserting itself in seismic forward-modelling clothes. The NTK analysis of Tancik et al. (2020) predicts that the time needed to fit mode k scales like the inverse of the NTK eigenvalue at frequency k, and that eigenvalue decays rapidly as k grows. So vanilla MLPs simply cannot resolve high-frequency content within any practical training budget.
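As a worked form of that statement, consider the linearised (lazy-training) regime with a squared loss, where the NTK Gram matrix K is treated as constant during training. This is a sketch under those assumptions. The symbols are introduced here for illustration: λ_i and v_i are the eigenvalues and eigenvectors of K, r(t) is the training residual, and η is the learning rate.

$$
r(t) = \sum_i e^{-\eta \lambda_i t}\,\bigl(v_i^{\top} r(0)\bigr)\,v_i,
\qquad
t_i(\varepsilon) = \frac{\ln(1/\varepsilon)}{\eta\,\lambda_i}
$$

Here t_i(ε) is the time needed to shrink the i-th residual component by a factor ε. A component with a 1000× smaller eigenvalue needs 1000× more training time, which is why a fixed epoch budget that suffices at low k fails at high k.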
Architectural fixes from Part 2 partially help:
- Fourier features (§2.2): γ(x) = [sin(2πBx), cos(2πBx)] with a random frequency matrix B. Choose the scale of B to cover the band. Flattens the convergence curve substantially. Still slower than FDTD by ~100×.
- SIREN (§2.3): sine activations with a carefully tuned frequency scale ω₀. Similar effect.
- Multi-scale Fourier (§2.5): combine multiple frequency scales. Best for genuinely multi-scale targets.
None of these close the cost gap with FDTD for forward modelling. They open the door to inverse problems where the network represents the unknown velocity field, not the wavefield (Part 6).
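The Fourier-feature mapping γ from the list above can be sketched in a few lines of standard-library Python. This sketch assumes the entries of B are drawn from a zero-mean Gaussian with standard deviation `sigma`, as in Tancik et al. (2020). The function name, the `sigma` value, and the feature count are hypothetical choices for illustration, not settings from this section.

```python
import math
import random


def make_fourier_features(in_dim, num_features, sigma, seed=0):
    """Build gamma(p) = [sin(2*pi*B p), cos(2*pi*B p)] with a fixed random B.

    B has shape (num_features, in_dim); its entries are sampled once from
    N(0, sigma^2) (an assumed sampling choice) and then frozen.
    """
    rng = random.Random(seed)
    B = [[rng.gauss(0.0, sigma) for _ in range(in_dim)]
         for _ in range(num_features)]

    def gamma(point):
        # One phase per row of B: 2*pi * <row, point>.
        phases = [2.0 * math.pi * sum(b * p for b, p in zip(row, point))
                  for row in B]
        return [math.sin(ph) for ph in phases] + [math.cos(ph) for ph in phases]

    return gamma


# Example: embed an (x, t) input before it enters the MLP.
gamma = make_fourier_features(in_dim=2, num_features=16, sigma=4.0)
features = gamma((0.3, 0.7))
```

The network's first layer would then take `2 * num_features` inputs instead of 2. A larger `sigma` places more of the random frequencies at high values, so it has to be matched to the highest k in the target. A multi-scale variant would concatenate the outputs of several such maps built with different `sigma` values.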
References
- Tancik, M., et al. (2020). Fourier features let networks learn high frequency functions in low dimensional domains. NeurIPS.
- Wang, S., Yu, X., Perdikaris, P. (2022). When and why PINNs fail to train: A neural tangent kernel perspective. Journal of Computational Physics 449. The NTK analysis of spectral bias in PINNs.