Multi-scale and multi-resolution networks

Part 2 — Architectures for PINNs

Learning objectives

Recognise that real-world targets often span multiple spatial scales which no single Fourier σ or SIREN ω₀ resolves cleanly
Construct a multi-scale Fourier-feature network by concatenating embeddings at several σ values
Pick the scale set (typically 3–5 log-spaced σ) for a problem with a known frequency band
Recognise hash-grid encodings (Instant-NGP) as a learnable generalisation of the multi-scale Fourier idea

Sections §2.2 and §2.3 each delivered an architectural fix to spectral bias — Fourier-feature MLPs and SIREN — and each comes with a single hyperparameter (σ or ω₀) that sets the highest representable spatial frequency. That works beautifully when the target has a single dominant scale. It works much less well when the target has features on several scales simultaneously: a slow envelope with a fast oscillation, or a multi-resolution wavefield with slowly-varying velocity and rapidly-varying amplitude.

The multi-scale problem

Consider the target $u(x) = 0.5\,\sin(\pi x) + 0.5\,\sin(20 \pi x)$ on $x \in [-1, 1]$. Two scales: one cycle of the slow envelope and twenty cycles of the fast oscillation across the domain, a factor of 20 between them. A Fourier-feature MLP with σ = 2 captures the slow envelope but misses the fast oscillation entirely. A Fourier-feature MLP with σ = 20 has the opposite problem: it captures the fast wiggle but misrepresents the slow envelope, because nearly all of its features oscillate rapidly at the scale of the envelope.

You cannot fix this with a single σ. The Fourier embedding acts like a band-pass filter whose pass band is set by σ: it lets the network see one band of spatial frequencies cleanly. Relative to content outside that band, the embedding is either too low-frequency (near-linear behaviour, so spectral bias returns) or too high-frequency (the features look like uncorrelated noise).

The multi-scale fix

Compute several Fourier embeddings at different scales and concatenate them before feeding to the MLP:

$$\gamma_{\mathrm{multi}}(\mathbf{x}) \;=\; \big[\,\gamma_{\sigma_1}(\mathbf{x}),\ \gamma_{\sigma_2}(\mathbf{x}),\ \ldots,\ \gamma_{\sigma_K}(\mathbf{x})\,\big]$$

where each $\gamma_{\sigma_k}(\mathbf{x}) = [\sin(2\pi B_k \mathbf{x}), \cos(2\pi B_k \mathbf{x})]$ and the entries of $B_k$ are drawn i.i.d. from $\mathcal{N}(0, \sigma_k^2)$. The MLP backbone takes the concatenated vector as input. The network can now combine features from every scale, starting in its first layer, so spectral bias is mitigated across the band spanned by the chosen σ values.
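The concatenation can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the function names, the seed, and the choice of 8 frequencies per scale (which yields 16 sin/cos features per scale) are all made up for the example.

```python
import numpy as np

def sample_frequency_matrices(sigmas, in_dim, freqs_per_scale, seed=0):
    """Draw one matrix B_k per scale, entries i.i.d. N(0, sigma_k^2).

    The matrices are sampled once and then kept fixed (not trained).
    """
    rng = np.random.default_rng(seed)
    return [rng.normal(0.0, sigma, size=(freqs_per_scale, in_dim))
            for sigma in sigmas]

def multiscale_fourier_embedding(x, freq_matrices):
    """Concatenate [sin(2*pi*B_k x), cos(2*pi*B_k x)] over all scales.

    x has shape (n_points, in_dim); the result has shape
    (n_points, 2 * freqs_per_scale * n_scales).
    """
    blocks = []
    for B in freq_matrices:
        phase = 2.0 * np.pi * x @ B.T  # (n_points, freqs_per_scale)
        blocks.append(np.sin(phase))
        blocks.append(np.cos(phase))
    return np.concatenate(blocks, axis=1)

# Hypothetical usage: 200 points on [-1, 1], two scales, 8 frequencies each.
x = np.linspace(-1.0, 1.0, 200).reshape(-1, 1)
B_list = sample_frequency_matrices(sigmas=[2.0, 20.0], in_dim=1, freqs_per_scale=8)
features = multiscale_fourier_embedding(x, B_list)
print(features.shape)  # (200, 32)
```

The resulting array is what the MLP backbone would receive in place of the raw coordinate. Adding a scale only means appending another σ to the list; the backbone itself is unchanged apart from the width of its first layer.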

Try it

Three networks race on a two-scale target. The MLP backbones are identical (1-input embedding → 64 → 64 → 1, Tanh). The only difference is the embedding:

σ = 2 only (16 features): captures the slow envelope, the fast wiggle is missing.
σ = 20 only (16 features): captures the fast wiggle, the slow envelope is misrepresented.
σ = {2, 20} (32 features, concatenated): captures both. The MLP first layer is twice as wide on its input side, which is the only architectural cost.
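To put a number on that architectural cost, the parameter counts of the two backbones can be compared directly. The sketch below assumes fully connected layers with a bias in every layer (the text does not say whether biases are used), and the helper name is hypothetical.

```python
def mlp_param_count(layer_sizes):
    """Weights plus biases of a fully connected network with these layer sizes."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

single_scale = mlp_param_count([16, 64, 64, 1])  # one sigma, 16 features
multi_scale = mlp_param_count([32, 64, 64, 1])   # two sigmas, 32 features
print(single_scale, multi_scale, multi_scale - single_scale)  # 5313 6337 1024
```

The extra 1024 parameters are the 16 additional input columns of the first 64-unit layer; every other layer is identical in the two networks.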

How to pick the scale set

For a problem with a known frequency band $[f_{\min}, f_{\max}]$ :

Pick 3–5 σ values, log-spaced, spanning that band. Three is usually enough; five is generous.
The lowest σ should be ≈ $f_{\min}$ × (domain length).
The highest σ should be ≈ $f_{\max}$ × (domain length).
Use 8–16 random Fourier features per scale (so the total embedding dimension stays manageable).

For a problem with unknown frequency content (e.g., an inverse problem where the wavefield frequency depends on the velocity model being recovered), use a wider scale spread — 5–7 log-spaced σ values from 1 to 100 in normalised units — and let the MLP figure out which features matter.
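The rules of thumb above amount to geometric spacing between two endpoints. A minimal sketch follows, with hypothetical helper names and with frequencies read as spatial frequencies in cycles per unit length:

```python
import math

def log_spaced_sigmas(sigma_min, sigma_max, n_scales):
    """Return n_scales values geometrically spaced from sigma_min to sigma_max."""
    if n_scales < 2:
        raise ValueError("need at least two scales to span a band")
    ratio = (sigma_max / sigma_min) ** (1.0 / (n_scales - 1))
    return [sigma_min * ratio ** k for k in range(n_scales)]

def sigmas_for_band(f_min, f_max, domain_length, n_scales=3):
    """Apply sigma ~ frequency * domain length at both ends of the band."""
    return log_spaced_sigmas(f_min * domain_length,
                             f_max * domain_length, n_scales)

# Known band: the two-scale target has 0.5 and 10 cycles per unit length
# on a domain of length 2, so the endpoints are about 1 and 20.
known = sigmas_for_band(0.5, 10.0, 2.0, n_scales=3)

# Unknown content: a wide spread from 1 to 100 in normalised units.
unknown = log_spaced_sigmas(1.0, 100.0, 5)

print([round(s, 2) for s in known])
print([round(s, 2) for s in unknown])
```

For the two-scale target this gives endpoints of about 1 and 20 with a middle scale near 4.5, in the same range as the σ = {2, 20} pair used in the demo. The wide spread comes out as roughly 1, 3.2, 10, 32 and 100.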

Beyond Fourier: hash-grid encodings

Müller, Evans, Schied and Keller (2022) introduced "Instant Neural Graphics Primitives" (Instant-NGP), built around a multi-resolution hash encoding that further generalises the multi-scale Fourier idea. Instead of using a fixed random Fourier embedding, the input is located on grids at several resolutions; the vertices of each grid are hashed into a table of small learnable feature vectors, and the features at the vertices surrounding the input are interpolated. The MLP backbone reads a concatenation of the interpolated features from all resolution levels. Hash-grid encodings dramatically reduce training time on complex high-frequency targets and are now widely used in 3D scene reconstruction (NeRF) and emerging in seismic-PINN work. We will see them again in Part 4 (where they help with multi-scale wavefield representation) and Part 8 (where related ideas appear in operator-learning).
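The mechanics are easiest to see in one dimension. The sketch below is a deliberately simplified, forward-only illustration and not the Instant-NGP implementation: the class name, default sizes and the modulo standing in for the spatial hash are assumptions, and no training step is shown.

```python
import math
import random

class HashGrid1D:
    """Simplified 1D multi-resolution hash encoding (illustrative only)."""

    def __init__(self, n_levels=4, base_resolution=4, growth=2.0,
                 table_size=64, n_features=2, seed=0):
        rng = random.Random(seed)
        # Grid resolution grows geometrically from level to level.
        self.resolutions = [int(base_resolution * growth ** level)
                            for level in range(n_levels)]
        self.table_size = table_size
        # One table of small feature vectors per level. In a real model
        # these entries would be the trainable parameters.
        self.tables = [
            [[rng.uniform(-1e-4, 1e-4) for _ in range(n_features)]
             for _ in range(table_size)]
            for _ in range(n_levels)
        ]

    def _slot(self, vertex):
        # Stand-in for the spatial hash: a plain modulo. Vertices collide
        # once a level has more vertices than table_size.
        return vertex % self.table_size

    def encode(self, x):
        """Encode x in [0, 1] as concatenated, linearly interpolated features."""
        out = []
        for res, table in zip(self.resolutions, self.tables):
            pos = x * res
            left = min(int(math.floor(pos)), res - 1)
            w = pos - left  # interpolation weight toward the right vertex
            f_left = table[self._slot(left)]
            f_right = table[self._slot(left + 1)]
            out.extend((1.0 - w) * a + w * b for a, b in zip(f_left, f_right))
        return out

enc = HashGrid1D()
code = enc.encode(0.3)
print(len(code))  # 8 = 4 levels x 2 features
```

The parallel with the multi-scale Fourier embedding is the final concatenation across levels; the difference is that the per-level features are table entries to be learned rather than fixed sinusoids.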

What you now know

For multi-scale targets, single-scale architectures (vanilla, Fourier with one σ, SIREN with one ω₀) are not enough. Multi-scale Fourier features defeat the trade-off by combining several scales in one network. Hash-grid encodings extend the idea by making the multi-scale features learnable. With these tools in hand, the architectural side of PINN engineering is largely solved; the remaining work is in training (Part 3) and applying the architectures to specific problems (Parts 4–9).

Pause-and-check. (1) Switch to the "Three-scale signal" target. Does the σ = {2, 20} multi-scale network handle the medium-frequency component (6 cycles)? Why or why not, and what would you change about the σ set to fix it? (2) Why does the multi-scale network use the same MLP backbone width as the single-scale baselines, but a wider input? (3) For a 2D wavefield with frequencies between 10 Hz and 50 Hz on a 4 km × 4 km domain, propose a sensible σ set for a multi-scale Fourier embedding.

References

Wang, S., Wang, H., Perdikaris, P. (2021). On the eigenvector bias of Fourier feature networks: From regression to solving multi-scale PDEs with physics-informed neural networks. Comput. Methods Appl. Mech. Eng. 384, 113938.
Müller, T., Evans, A., Schied, C., Keller, A. (2022). Instant neural graphics primitives with a multiresolution hash encoding (Instant-NGP). ACM Trans. Graph. 41(4), 102.
Moseley, B., Markham, A., Nissen-Meyer, T. (2023). Finite basis physics-informed neural networks (FBPINNs). Adv. Comput. Math. 49, 62.
Liu, Z., Cai, W., Xu, Z.-Q.J. (2020). Multi-scale deep neural network (MscaleDNN) for solving Poisson-Boltzmann equation in complex domains. Commun. Comput. Phys. 28(5), 1970–2001.