Multi-scale and multi-resolution networks
Learning objectives
- Recognise that real-world targets often span multiple spatial scales which no single Fourier σ or SIREN ω₀ resolves cleanly
- Construct a multi-scale Fourier-feature network by concatenating embeddings at several σ values
- Pick the scale set (typically 3–5 log-spaced σ) for a problem with a known frequency band
- Recognise hash-grid encodings (Instant-NGP) as a learnable generalisation of the multi-scale Fourier idea
§2.2 and §2.3 each delivered an architectural fix to spectral bias (Fourier-feature MLPs and SIREN, respectively), and each comes with a single hyperparameter (σ or ω₀) that sets the highest representable spatial frequency. That works well when the target has a single dominant scale. It works much less well when the target has features on several scales simultaneously: a slow envelope with a fast oscillation, or a multi-resolution wavefield with slowly-varying velocity and rapidly-varying amplitude.
The multi-scale problem
Consider a target with two scales on its domain: one cycle of a slow envelope and twenty cycles of a fast oscillation, a factor of 20 between them. A Fourier-feature MLP with σ = 2 captures the slow envelope but misses the fast oscillation entirely. A Fourier-feature MLP with σ = 20 does the opposite: it captures the fast wiggle but misrepresents the slow envelope, because nearly all of its features oscillate far too rapidly at the envelope's scale.
You cannot fix this with a single σ. The Fourier embedding behaves like a band-pass filter at the scale set by σ: it lets the network see one spatial frequency band cleanly, and outside that band the features are either too low-frequency (nearly linear over the feature of interest, so spectral bias returns) or too high-frequency (they look like uncorrelated noise at that scale).
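The mismatch can be put in numbers with a short calculation. This is a minimal sketch that assumes the embedding frequencies are drawn as B ~ N(0, σ²) and measured in cycles per domain (the usual random-Fourier-feature convention, not confirmed by this section); the helper names `frac_below` and `frac_above` are hypothetical.

```python
import math

def frac_below(f, sigma):
    """P(|B| <= f) for B ~ N(0, sigma^2): share of features at or below frequency f."""
    return math.erf(f / (sigma * math.sqrt(2.0)))

def frac_above(f, sigma):
    """P(|B| >= f) for B ~ N(0, sigma^2): share of features at or above frequency f."""
    return math.erfc(f / (sigma * math.sqrt(2.0)))

# sigma = 2: how many features reach the 20-cycle oscillation?
p_fast_given_small_sigma = frac_above(20.0, sigma=2.0)
# sigma = 20: how many features are slow enough for a 1-2 cycle envelope?
p_slow_given_large_sigma = frac_below(2.0, sigma=20.0)

print(f"sigma=2 : P(|B| >= 20) = {p_fast_given_small_sigma:.1e}")
print(f"sigma=20: P(|B| <= 2)  = {p_slow_given_large_sigma:.3f}")
```

Under this assumption, roughly 8% of the σ = 20 features sit at or below two cycles, and effectively none of the σ = 2 features reach twenty cycles, which matches the qualitative picture above.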
The multi-scale fix
Compute several Fourier embeddings at different scales and concatenate them before feeding to the MLP:
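$$\gamma(x) = \big[\gamma_{\sigma_1}(x),\ \gamma_{\sigma_2}(x),\ \dots,\ \gamma_{\sigma_K}(x)\big],$$

where each $\gamma_{\sigma_k}(x) = [\cos(2\pi B_k x),\ \sin(2\pi B_k x)]$ with $B_k \sim \mathcal{N}(0, \sigma_k^2)$. The MLP backbone takes the concatenated vector as input. Now the network can compose features from multiple scales at every layer; spectral bias is defeated across the entire band spanned by the chosen σ values.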
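The concatenation described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not a reference implementation: it assumes fixed (non-trainable) Gaussian frequencies with a 2π factor in the phase, and the function names `make_frequencies` and `multiscale_embedding` are hypothetical.

```python
import numpy as np

def make_frequencies(sigmas, n_freq, in_dim=1, seed=0):
    """Draw one frequency matrix B_k ~ N(0, sigma_k^2) per scale, fixed after sampling."""
    rng = np.random.default_rng(seed)
    return [sigma * rng.standard_normal((n_freq, in_dim)) for sigma in sigmas]

def multiscale_embedding(x, freqs):
    """Concatenate [cos(2*pi*B_k x), sin(2*pi*B_k x)] over all scales k.

    x: array of shape (n_points, in_dim).
    Returns an array of shape (n_points, 2 * n_freq * n_scales).
    """
    blocks = []
    for B in freqs:
        phase = 2.0 * np.pi * x @ B.T   # shape (n_points, n_freq)
        blocks.append(np.cos(phase))
        blocks.append(np.sin(phase))
    return np.concatenate(blocks, axis=-1)

x = np.linspace(0.0, 1.0, 5).reshape(-1, 1)
freqs = make_frequencies(sigmas=(2.0, 20.0), n_freq=8)
features = multiscale_embedding(x, freqs)
print(features.shape)  # 2 scales x 8 frequencies x (cos, sin) = 32 columns
```

The resulting feature matrix is what the MLP backbone would receive in place of the raw coordinate. Eight frequencies per scale is one choice that yields 16 features per σ.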
Try it
Three networks race on a two-scale target. The MLP backbones are identical (1-input embedding → 64 → 64 → 1, Tanh). The only difference is the embedding:
- σ = 2 only (16 features): captures the slow envelope, the fast wiggle is missing.
- σ = 20 only (16 features): captures the fast wiggle, the slow envelope is misrepresented.
- σ = {2, 20} (32 features, concatenated): captures both. The MLP first layer is twice as wide on its input side, which is the only architectural cost.
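The architectural cost mentioned in the last bullet can be checked by counting parameters. The sketch below assumes plain fully connected layers with biases and a fixed embedding that contributes no trainable parameters; `mlp_param_count` is a hypothetical helper.

```python
def mlp_param_count(layer_sizes):
    """Weights plus biases for a fully connected network with the given layer sizes."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    )

single = mlp_param_count([16, 64, 64, 1])  # one sigma, 16 input features
multi = mlp_param_count([32, 64, 64, 1])   # two sigmas, 32 input features

print(single, multi, multi - single)
```

Under these assumptions the only difference is the 16 × 64 extra weights in the first layer; every later layer is unchanged.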
How to pick the scale set
For a problem with a known frequency band $[f_{\min}, f_{\max}]$:
- Pick 3–5 σ values, log-spaced, spanning that band. Three is usually enough; five is generous.
- The lowest σ should be ≈ $f_{\min}$ × (domain length).
- The highest σ should be ≈ $f_{\max}$ × (domain length).
- Use 8–16 random Fourier features per scale (so the total embedding dimension stays manageable).
For a problem with unknown frequency content (e.g., an inverse problem where the wavefield frequency depends on the velocity model being recovered), use a wider scale spread — 5–7 log-spaced σ values from 1 to 100 in normalised units — and let the MLP figure out which features matter.
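Both recipes reduce to geometric spacing between two end points. The sketch below reads the lower and upper edges of the band as spatial frequencies, so that frequency × domain length gives cycles per domain; the names `f_min`, `f_max`, `log_spaced_sigmas` and `sigmas_for_band` are hypothetical, and the example band of 1 to 20 is only an illustration.

```python
def log_spaced_sigmas(sigma_min, sigma_max, n_scales):
    """Return n_scales values geometrically spaced from sigma_min to sigma_max."""
    if n_scales < 2:
        raise ValueError("need at least two scales to span a band")
    ratio = (sigma_max / sigma_min) ** (1.0 / (n_scales - 1))
    return [sigma_min * ratio**k for k in range(n_scales)]

def sigmas_for_band(f_min, f_max, domain_length, n_scales=3):
    """Known band: convert the band edges to cycles per domain, then log-space."""
    return log_spaced_sigmas(f_min * domain_length, f_max * domain_length, n_scales)

# Known band (illustrative numbers): three scales between 1 and 20 cycles per domain.
known = sigmas_for_band(f_min=1.0, f_max=20.0, domain_length=1.0, n_scales=3)
# Unknown content: five scales from 1 to 100 in normalised units.
unknown = log_spaced_sigmas(1.0, 100.0, n_scales=5)

print([round(s, 2) for s in known])
print([round(s, 2) for s in unknown])
```

Geometric spacing keeps the ratio between neighbouring scales constant, so no octave of the band is left much further from a σ than any other.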
Beyond Fourier: hash-grid encodings
Müller, Evans, Schied and Keller (2022) introduced "Instant Neural Graphics Primitives" (Instant-NGP), a multi-resolution hash encoding that further generalises the multi-scale Fourier idea. Instead of using a fixed random Fourier embedding, the method overlays grids at several resolutions on the input domain. At each level, the vertices of the grid cell containing the input are hashed into a fixed-size table of small learnable feature vectors, and the looked-up vectors are interpolated to the input position. The MLP backbone reads a concatenation of the interpolated features from all resolution levels. Hash-grid encodings dramatically reduce training time on complex high-frequency targets and are now widely used in 3D scene reconstruction (NeRF) and emerging in seismic-PINN work. We will see them again in Part 4 (where they help with multi-scale wavefield representation) and Part 8 (where related ideas appear in operator-learning).
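To make the lookup concrete, here is a minimal 2D sketch of a multi-resolution hash encoding in NumPy. It is a simplified illustration rather than the Instant-NGP implementation: it hashes at every level (the paper indexes coarse grids directly when they fit in the table), the tables hold random initial values instead of trained ones, and the names `hash_vertices` and `hash_grid_encode` plus all sizes are hypothetical.

```python
import numpy as np

PRIMES = (1, 2654435761)  # per-dimension hashing constants, as in the Instant-NGP paper

def hash_vertices(ix, iy, table_size):
    """Spatial hash: XOR the prime-scaled integer coordinates, then wrap to the table."""
    hx = ix.astype(np.uint64) * np.uint64(PRIMES[0])
    hy = iy.astype(np.uint64) * np.uint64(PRIMES[1])
    return ((hx ^ hy) % np.uint64(table_size)).astype(np.int64)

def hash_grid_encode(xy, tables, resolutions):
    """Encode points in [0, 1]^2 by bilinear interpolation of hashed features per level.

    xy: (n_points, 2); tables: one (table_size, n_feat) array per level.
    Returns an array of shape (n_points, n_levels * n_feat).
    """
    out = []
    for table, res in zip(tables, resolutions):
        pos = xy * res
        i0 = np.minimum(np.floor(pos).astype(np.int64), res - 1)  # lower-left vertex
        w = pos - i0                                              # weights in [0, 1]
        level = 0.0
        for dx in (0, 1):
            for dy in (0, 1):
                idx = hash_vertices(i0[:, 0] + dx, i0[:, 1] + dy, len(table))
                wx = w[:, 0] if dx else 1.0 - w[:, 0]
                wy = w[:, 1] if dy else 1.0 - w[:, 1]
                level = level + (wx * wy)[:, None] * table[idx]
        out.append(level)
    return np.concatenate(out, axis=-1)

rng = np.random.default_rng(0)
resolutions = [4, 8, 16, 32]   # geometric growth, coarse to fine
table_size, n_feat = 64, 2
# Trainable in a real model; here just small random initial values.
tables = [1e-4 * rng.uniform(-1.0, 1.0, (table_size, n_feat)) for _ in resolutions]
xy = rng.uniform(0.0, 1.0, (5, 2))
codes = hash_grid_encode(xy, tables, resolutions)
print(codes.shape)  # 4 levels x 2 features = 8 columns
```

In a trained model the table entries would be optimised jointly with the MLP weights, which is what makes the multi-scale features learnable rather than fixed.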
What you now know
For multi-scale targets, single-scale architectures (vanilla, Fourier with one σ, SIREN with one ω₀) are not enough. Multi-scale Fourier features defeat the trade-off by combining several scales in one network. Hash-grid encodings extend the idea by making the multi-scale features learnable. With these tools in hand, the architectural side of PINN engineering is largely solved; the remaining work is in training (Part 3) and applying the architectures to specific problems (Parts 4–9).
Pause-and-check. (1) Switch to the "Three-scale signal" target. Does the σ = {2, 20} multi-scale network handle the medium-frequency component (6 cycles)? Why or why not, and what would you change about the σ set to fix it? (2) Why does the multi-scale network use the same MLP backbone width as the single-scale baselines, but a wider input? (3) For a 2D wavefield with frequencies between 10 Hz and 50 Hz on a 4 km × 4 km domain, propose a sensible σ set for a multi-scale Fourier embedding.
References
- Wang, S., Wang, H., Perdikaris, P. (2021). On the eigenvector bias of Fourier feature networks: From regression to solving multi-scale PDEs with physics-informed neural networks. Comput. Methods Appl. Mech. Engrg. 384, 113938.
- Müller, T., Evans, A., Schied, C., Keller, A. (2022). Instant neural graphics primitives with a multiresolution hash encoding (Instant-NGP). ACM Trans. Graph. 41(4), 102.
- Moseley, B., Markham, A., Nissen-Meyer, T. (2023). Finite basis physics-informed neural networks (FBPINNs). Adv. Comput. Math. 49, 62.
- Liu, Z., Cai, W., Xu, Z.-Q.J. (2020). Multi-scale deep neural network (MscaleDNN) for solving PDEs. Commun. Comput. Phys. 28(5), 1970–2001.