Adaptive collocation point sampling (RAR/RAD)

Part 3, Training pathologies and remedies

Learning objectives

Recognise the sampling-bias pathology: uniform collocation points waste capacity on easy regions
Implement RAR (Lu et al. 2021): periodically add points where residual is largest
Implement RAD (Wu et al. 2023): resample collocation points proportional to residual
Choose between adding points (RAR) and resampling points (RAD) for a given problem

The PDE residual is enforced at collocation points. Uniform sampling spreads them evenly across the domain, a sensible default until you remember that the residual itself is rarely uniform. Most PINN problems have a few regions of difficulty (a shock, a wavefront, a localised source) and large regions where the network already fits. Uniform collocation points spend the network's capacity on these already-easy regions. This is pathology #5 from §3.1.

Two adaptive remedies

(1) RAR, Residual-based Adaptive Refinement (Lu, Meng, Mao & Karniadakis, DeepXDE 2021). Every $K$ epochs, evaluate the residual on a dense grid and add the $N$ grid points with the largest $|r(x)|$ to the collocation set. Points are never removed, so the collocation set grows over training.

(2) RAD, Residual-based Adaptive Distribution (Wu, Zhu, Tan, Kartha & Lu 2023). Every $K$ epochs, resample the entire collocation set from a dense grid with probability proportional to $|r(x)|^p$. The number of points stays fixed at $N_c$; only their locations adapt.

Both schemes do the same thing in spirit: they focus collocation points where the residual is large. RAR is simpler and never decreases sample density anywhere; RAD keeps the set size fixed, so its compute cost per epoch stays bounded.
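The two update rules above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the DeepXDE implementation: the function names `rar_refine` and `rad_resample`, the callable `residual_fn`, and the toy residual standing in for a trained network's PDE residual are all hypothetical.

```python
import numpy as np

def rar_refine(x_colloc, x_dense, residual_fn, n_add):
    """RAR step: append the n_add dense-grid points with the largest |residual|.

    Assumes n_add >= 1. Existing points are kept, so the set only grows.
    """
    r = np.abs(residual_fn(x_dense))
    top = np.argsort(r)[-n_add:]  # indices of the n_add largest residuals
    return np.concatenate([x_colloc, x_dense[top]])

def rad_resample(x_dense, residual_fn, n_c, p=2.0, rng=None):
    """RAD step: draw a fresh set of n_c points with probability
    proportional to |residual|**p. The set size never changes."""
    rng = np.random.default_rng() if rng is None else rng
    w = np.abs(residual_fn(x_dense)) ** p
    prob = w / w.sum()
    idx = rng.choice(x_dense.size, size=n_c, replace=False, p=prob)
    return x_dense[idx]

# Toy residual (an assumption for illustration): a bump centred at x = 0.3.
def toy_residual(x):
    return np.exp(-50.0 * (x - 0.3) ** 2)

x_dense = np.linspace(-1.0, 1.0, 200)
x_uniform = np.linspace(-1.0, 1.0, 20)

x_rar = rar_refine(x_uniform, x_dense, toy_residual, n_add=2)
x_rad = rad_resample(x_dense, toy_residual, n_c=20,
                     rng=np.random.default_rng(0))

print(x_rar.size, x_rad.size)
```

In a real training loop, `residual_fn` would evaluate the network's PDE residual, and either function would be called once every $K$ epochs.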

Try it: 1D Poisson with a sharp source

The widget solves $u''(x) = 200 \exp(-200 (x - 0.3)^2)$ on $x \in [-1, 1]$ with $u(\pm 1) = 0$. The right-hand side is a narrow Gaussian bump centred at $x = 0.3$, so $u''$ has a sharp peak there. Three strategies race: uniform (20 points, deliberately too few to resolve the bump cleanly); RAR (start with 20 points, add 2 every 100 epochs from a 200-point dense grid); RAD (every 100 epochs, keep 14 uniform points and resample 6 with probability $\propto r^2$). Both adaptive schemes mix uniform and residual-weighted samples, the de facto best practice from Wu et al. 2023, which prevents catastrophic over-concentration.
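To measure errors on this problem you need a reference solution. The widget's own reference is not shown here, so the following is one possible sketch: a second-order finite-difference solve of the same boundary-value problem. The names `source`, `reference_solution`, and `relative_l2`, and the grid size `n=1001`, are assumptions chosen for illustration.

```python
import numpy as np

def source(x):
    """Right-hand side of u'' = f: a narrow Gaussian bump at x = 0.3."""
    return 200.0 * np.exp(-200.0 * (x - 0.3) ** 2)

def reference_solution(n=1001):
    """Solve u'' = source(x) on [-1, 1] with u(-1) = u(1) = 0
    using the standard three-point finite-difference stencil."""
    x = np.linspace(-1.0, 1.0, n)
    h = x[1] - x[0]
    m = n - 2  # number of interior unknowns
    # Tridiagonal second-difference matrix (dense here for simplicity).
    A = (np.diag(-2.0 * np.ones(m))
         + np.diag(np.ones(m - 1), 1)
         + np.diag(np.ones(m - 1), -1)) / h**2
    u = np.zeros(n)  # boundary values stay at zero
    u[1:-1] = np.linalg.solve(A, source(x[1:-1]))
    return x, u

def relative_l2(u_pred, u_ref):
    """Relative L2 error of u_pred against u_ref on a shared grid."""
    return np.linalg.norm(u_pred - u_ref) / np.linalg.norm(u_ref)

x_ref, u_ref = reference_solution()
print(x_ref[np.argmin(u_ref)])  # location of the minimum of u
```

Because $u'' > 0$ everywhere and $u$ vanishes at both ends, the solution is convex and non-positive, with its minimum close to the bump. A network prediction evaluated on `x_ref` can then be scored with `relative_l2`.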

What you should observe

Uniform: the peak near $x = 0.3$ is poorly resolved, and the residual around the bump is much larger than the residual elsewhere. This is the §3.1 sampling-bias pathology.
RAR: starts uniform, then adapts. By the end of training, the collocation-point density panel shows visible clustering at $x = 0.3$ (the bump location). The final relative $L^2$ error typically drops by 5-10×.
RAD: similar improvement to RAR. The collocation set has fixed cardinality, but its points move toward the bump. Stochastic noise from the resampling step makes the loss curve less smooth than RAR's, but the final fit is similar.

RAR vs RAD: which to choose

RAR is preferred when the hard regions are localised: a small number of additions concentrates compute where it matters. The collocation set, and with it memory, grows linearly with the number of refreshes.
RAD is preferred when the hard region moves during training: an evolving wavefront, an inverse problem with shifting model parameters. Bounded memory.
Both are typically combined with a base rate of uniform sampling (~70% uniform + 30% adaptive) to avoid catastrophic over-concentration.
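The uniform-plus-adaptive mix can be sketched as follows. This is an illustrative NumPy version, assuming a 1D domain $[-1, 1]$ and residual values already evaluated on a dense grid; the function name `mixed_sample` and its parameters are hypothetical, and the default split mirrors the 70/30 ratio quoted above.

```python
import numpy as np

def mixed_sample(x_dense, residual, n_total=20, frac_uniform=0.7,
                 p=2.0, rng=None):
    """Return (uniform points, residual-weighted points).

    A fraction frac_uniform of the n_total points is drawn uniformly
    over the domain; the rest is drawn from the dense grid with
    probability proportional to |residual|**p.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_uni = int(round(frac_uniform * n_total))
    n_adp = n_total - n_uni
    # Base-rate uniform samples keep every region covered.
    x_uni = rng.uniform(x_dense.min(), x_dense.max(), size=n_uni)
    # Adaptive samples concentrate where the residual is large.
    w = np.abs(residual) ** p
    x_adp = rng.choice(x_dense, size=n_adp, replace=True, p=w / w.sum())
    return x_uni, x_adp

x_dense = np.linspace(-1.0, 1.0, 200)
# Toy residual values (an assumption): a bump centred at x = 0.3.
residual = np.exp(-50.0 * (x_dense - 0.3) ** 2)

x_uni, x_adp = mixed_sample(x_dense, residual,
                            rng=np.random.default_rng(0))
x_colloc = np.concatenate([x_uni, x_adp])
print(x_uni.size, x_adp.size)
```

The uniform share is the safeguard: even if the residual estimate is badly wrong at some refresh, most of the domain is still sampled.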

Adaptive sampling for seismic FWI

Wu et al. (2023) benchmarked ten sampling methods, both non-adaptive and residual-based adaptive, on a suite of PINN problems. For seismic-style problems with a localised source, the RAR variant with $N_{\textrm{add}} \sim 0.1 N_c$ per refresh is the recommended default. Modern seismic PINN papers (Rasht-Behesht 2022, Song 2023) all use some form of adaptive sampling, often combined with the source-region NTK weighting from §3.3.
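To get a feel for what $N_{\textrm{add}} \sim 0.1 N_c$ implies for the size of the collocation set, the short sketch below tabulates the RAR set size after each refresh. The helper `rar_set_sizes` and the numbers passed to it (1000 initial points, 5000 epochs, a refresh every 500 epochs) are hypothetical values picked for illustration, not settings taken from any of the cited papers.

```python
def rar_set_sizes(n_c, n_epochs, refresh_every, add_frac=0.1):
    """Collocation-set size after each RAR refresh.

    n_c is the initial number of points; each refresh adds
    round(add_frac * n_c) points and removes none.
    """
    n_add = max(1, round(add_frac * n_c))
    n_refresh = n_epochs // refresh_every
    return [n_c + k * n_add for k in range(n_refresh + 1)]

sizes = rar_set_sizes(n_c=1000, n_epochs=5000, refresh_every=500)
print(sizes[0], sizes[-1])  # initial and final set size
```

With these assumed settings the set doubles over training, which is the linear growth that RAR trades for its simplicity; RAD would stay at `n_c` throughout.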

References

Lu, L., Meng, X., Mao, Z., Karniadakis, G.E. (2021). DeepXDE: A deep learning library for solving differential equations. SIAM Review 63(1), 208-228.
Wu, C., Zhu, M., Tan, Q., Kartha, Y., Lu, L. (2023). A comprehensive study of non-adaptive and residual-based adaptive sampling methods for physics-informed neural networks. Computer Methods in Applied Mechanics and Engineering 403, 115671.
Daw, A., Bu, J., Wang, S., Perdikaris, P., Karpatne, A. (2023). Mitigating propagation failures in physics-informed neural networks using retain-resample-release (R3) sampling. ICML.