Domain decomposition: XPINN, cPINN, FBPINN

Part 3 — Training pathologies and remedies

Learning objectives

Recognise that single-network PINNs do not scale to large or complex domains
Survey the three families: cPINN, XPINN, FBPINN
Implement an FBPINN with cosine-window partition of unity
Compare an FBPINN empirically against a single-MLP baseline on a multi-feature target, and recognise when the FBPINN advantage appears

A fixed-capacity MLP cannot resolve features at all scales simultaneously. As the domain grows or the solution acquires more features, single-network PINNs underfit. Three families of domain decomposition have emerged to scale PINNs to large or complex problems:

The three families

cPINN (Jagtap, Kawaguchi, Karniadakis 2020). Split the domain into non-overlapping subdomains; one network per subdomain; conservative coupling along the interfaces enforces continuity of the solution and the flux. Best for hyperbolic conservation laws (Euler, shallow water).
XPINN (Jagtap, Karniadakis 2020). Generalisation of cPINN to arbitrary subdomain decomposition with continuity-loss interface coupling (no flux constraint). Best for parabolic and elliptic problems where conservation is not the primary issue.
FBPINN (Moseley, Markham, Nissen-Meyer 2023). Finite-Basis PINN. Use overlapping subdomains with a smooth partition-of-unity weighting $w_k(x)$ that satisfies $\sum_k w_k(x) = 1$ for all $x$. The global solution is $u(x) = \sum_k w_k(x) M_k(x)$. No interface loss is needed: the windowing makes the global solution smooth by construction. Best for problems with multiple localised features.

FBPINN is the most recent of the three and arguably the cleanest design. In cPINN and XPINN the interface loss is an additional loss term that must be balanced against the PDE-residual and boundary terms, which reintroduces the loss-balance crisis of §3.2–§3.3. FBPINN avoids this by encoding the smoothness in the architecture.

The FBPINN ansatz

For a 1D domain $[a, b]$ split into $K$ overlapping windows centred at $c_k$ with half-widths $h_k$, define the cosine-square window

\rho_k(x) = \begin{cases} \cos^2\!\left(\tfrac{\pi}{2} \cdot \tfrac{x - c_k}{h_k}\right) & |x - c_k| < h_k \\ 0 & \textrm{otherwise} \end{cases}

and the partition-of-unity weights $w_k(x) = \rho_k(x) / \sum_j \rho_j(x)$. Each subnet $M_k(x)$ is a small MLP. The global solution is

u(x) = \sum_{k=1}^K w_k(x) M_k(x) .
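As a minimal sketch of the ansatz above, the following NumPy code evaluates the cosine-square windows, normalises them into partition-of-unity weights and forms $u(x) = \sum_k w_k(x) M_k(x)$. The window layout (six equally spaced centres on $[-1, 1]$, half-width 1.5 times the spacing) and the closed-form stand-ins for the subnets are assumptions made for illustration; a real FBPINN would use trainable MLPs.

```python
import numpy as np

def cosine_windows(x, centres, half_widths):
    """Unnormalised cosine-square windows rho_k(x); returns shape (K, N)."""
    s = (x[None, :] - centres[:, None]) / half_widths[:, None]
    rho = np.cos(0.5 * np.pi * s) ** 2
    return np.where(np.abs(s) < 1.0, rho, 0.0)  # zero outside each window

def pou_weights(x, centres, half_widths):
    """Partition-of-unity weights w_k = rho_k / sum_j rho_j; shape (K, N)."""
    rho = cosine_windows(x, centres, half_widths)
    total = rho.sum(axis=0)
    if np.any(total <= 0.0):
        raise ValueError("some points are not covered by any window")
    return rho / total

def fbpinn_forward(x, centres, half_widths, subnets):
    """u(x) = sum_k w_k(x) M_k(x); each subnet is a callable on arrays."""
    w = pou_weights(x, centres, half_widths)
    m = np.stack([net(x) for net in subnets])
    return (w * m).sum(axis=0)

# Assumed layout: 6 windows on [-1, 1], half-width 1.5x the centre spacing.
K = 6
centres = np.linspace(-1.0, 1.0, K)
half_widths = np.full(K, 1.5 * (centres[1] - centres[0]))
x = np.linspace(-1.0, 1.0, 201)

# Stand-in "subnets": fixed closed-form functions, not trained MLPs.
subnets = [lambda x, k=k: np.sin((k + 1) * x) for k in range(K)]

w = pou_weights(x, centres, half_widths)
u = fbpinn_forward(x, centres, half_widths, subnets)
```

Because the weights sum to one, replacing every stand-in with the constant function 1 returns $u(x) = 1$ everywhere, which makes a convenient sanity check.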

For a PDE $u''(x) = f(x)$ we need $u''$: by the product rule

u''(x) = \sum_k \big[ w_k''(x) M_k(x) + 2 w_k'(x) M_k'(x) + w_k(x) M_k''(x) \big] .

The window derivatives $w_k'$ and $w_k''$ are computed analytically (we know $\rho_k$ in closed form, and the quotient rule gives the derivatives of $w_k = \rho_k / \sum_j \rho_j$); each subnet provides $M_k'$ and $M_k''$ via the same forward-mode AD machinery from Part 0. The PDE-residual gradient flows back into every subnet whose window covers $x$, which is only a handful of subnets at any point thanks to the compact support of the windows.
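The analytic window derivatives can be sketched as follows. The code differentiates $\rho_k$ in closed form, applies the quotient rule to $w_k = \rho_k / S$ with $S = \sum_j \rho_j$, and assembles $u''$ from the three-term sum. To keep the example free of an AD library, the subnets are replaced by the assumed stand-ins $M_k(x) = \sin(a_k x)$ with known derivatives; the function names and window layout are likewise illustrative.

```python
import numpy as np

def rho_and_derivs(x, c, h):
    """rho_k, rho_k', rho_k'' for cosine-square windows; each has shape (K, N)."""
    s = (x[None, :] - c[:, None]) / h[:, None]
    inside = np.abs(s) < 1.0
    hk = h[:, None]
    # cos^2(pi s / 2) = (1 + cos(pi s)) / 2, which is easy to differentiate.
    r0 = np.where(inside, 0.5 * (1.0 + np.cos(np.pi * s)), 0.0)
    r1 = np.where(inside, -0.5 * (np.pi / hk) * np.sin(np.pi * s), 0.0)
    r2 = np.where(inside, -0.5 * (np.pi / hk) ** 2 * np.cos(np.pi * s), 0.0)
    return r0, r1, r2

def w_and_derivs(x, c, h):
    """Quotient rule applied to w_k = rho_k / S, with S = sum_j rho_j."""
    r0, r1, r2 = rho_and_derivs(x, c, h)
    S, S1, S2 = r0.sum(0), r1.sum(0), r2.sum(0)
    w0 = r0 / S
    w1 = r1 / S - r0 * S1 / S**2
    w2 = r2 / S - 2 * r1 * S1 / S**2 - r0 * S2 / S**2 + 2 * r0 * S1**2 / S**3
    return w0, w1, w2

def u_xx(x, c, h, M, dM, d2M):
    """u'' = sum_k [ w_k'' M_k + 2 w_k' M_k' + w_k M_k'' ]."""
    w0, w1, w2 = w_and_derivs(x, c, h)
    return (w2 * M(x) + 2 * w1 * dM(x) + w0 * d2M(x)).sum(0)

# Assumed layout and stand-in subnets M_k(x) = sin(a_k x).
K = 6
c = np.linspace(-1.0, 1.0, K)
h = np.full(K, 0.6)
a = np.arange(1.0, K + 1.0)[:, None]
M = lambda x: np.sin(a * x[None, :])
dM = lambda x: a * np.cos(a * x[None, :])
d2M = lambda x: -a**2 * np.sin(a * x[None, :])

def u(x):
    return (w_and_derivs(x, c, h)[0] * M(x)).sum(0)

# Compare with a central finite difference at points away from window edges.
x = np.array([-0.5, -0.13, 0.31, 0.77])
eps = 1e-4
fd = (u(x + eps) - 2 * u(x) + u(x - eps)) / eps**2
err = np.max(np.abs(u_xx(x, c, h, M, dM, d2M) - fd))
```

The test points avoid the window edges on purpose: $\rho_k''$ of a cosine-square window jumps where the window ends, so a finite-difference stencil straddling an edge would not be a fair comparison.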

Try it: FBPINN vs single MLP

The widget solves $u''(x) = f(x)$ with five sharp Gaussian-bump source terms at $x = -0.7, -0.35, 0, 0.35, 0.7$. A single 1-32-32-1 MLP races a 6-window FBPINN where each subnet is a 1-16-16-1 MLP (the FBPINN has more total parameters, but the per-subnet capacity is small: each window covers roughly one bump). Watch the per-window weights $w_k(x)$ in the right panel and the final $u(x)$ fit on the left.
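For readers who want to reproduce a comparable problem outside the widget, here is a hedged sketch of the source term, a finite-difference reference solution and the relative-L² metric. The bump centres come from the text; the bump width `SIGMA`, the unit amplitudes and the homogeneous Dirichlet boundary conditions $u(-1) = u(1) = 0$ are assumptions, since the widget's exact settings are not stated here.

```python
import numpy as np

BUMP_CENTRES = np.array([-0.7, -0.35, 0.0, 0.35, 0.7])  # from the text
SIGMA = 0.05  # assumed bump width (not specified in the text)

def source(x):
    """f(x): sum of five unit-amplitude Gaussian bumps."""
    z = (x[:, None] - BUMP_CENTRES[None, :]) / SIGMA
    return np.exp(-0.5 * z**2).sum(axis=1)

def reference_solution(n=801):
    """Second-order finite-difference solve of u'' = f on [-1, 1],
    with assumed boundary conditions u(-1) = u(1) = 0."""
    x = np.linspace(-1.0, 1.0, n)
    dx = x[1] - x[0]
    m = n - 2  # number of interior unknowns
    A = (np.diag(-2.0 * np.ones(m))
         + np.diag(np.ones(m - 1), 1)
         + np.diag(np.ones(m - 1), -1)) / dx**2
    u = np.zeros(n)
    u[1:-1] = np.linalg.solve(A, source(x[1:-1]))
    return x, u

def relative_l2(u_pred, u_ref):
    """Relative L2 error between a prediction and the reference on a shared grid."""
    return np.linalg.norm(u_pred - u_ref) / np.linalg.norm(u_ref)

x_ref, u_ref = reference_solution()
```

Evaluating a trained model on `x_ref` and passing the result to `relative_l2` gives the kind of error figure quoted for the widget, though the numbers will differ with the assumed width and boundary conditions.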

What you should observe

Single MLP (1-32-32-1, ~1.1k params): given enough epochs, can actually fit this 1D problem cleanly — relative-L² typically a fraction of a percent. The single MLP is a strong baseline on 1D problems.
FBPINN (6 × 1-16-16-1, ~1.9k total params, i.e. 6 × 321 with biases, of which only the subnets whose windows cover a given point are active there): each subnet specialises on its window; the partition of unity enforces a smooth global solution. Relative-L² typically a few percent: competitive, but no better than the single MLP on this 1D toy.
The window panel shows the six cosine-square partition functions: $w_1$ is large near $x = -1$, $w_6$ near $x = 1$, with smooth blending in between. They sum to 1 everywhere.
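The parameter counts for the two architectures can be checked with a few lines of Python. This sketch assumes plain fully connected layers with one bias per output unit; the helper name `mlp_param_count` is made up for illustration.

```python
def mlp_param_count(layer_sizes):
    """Weights plus biases of a fully connected MLP, e.g. [1, 32, 32, 1]."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

single = mlp_param_count([1, 32, 32, 1])      # 64 + 1056 + 33 = 1153
per_subnet = mlp_param_count([1, 16, 16, 1])  # 32 + 272 + 17 = 321
fbpinn_total = 6 * per_subnet                 # 1926
```

Under these assumptions the six-subnet FBPINN carries about 1.7 times the parameters of the single MLP, so the comparison in the widget is not a same-parameter one.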

Important honest note. On this small 1D toy a 32-32 single MLP can already fit five sharp bumps cleanly. The FBPINN advantage emerges in 2D and 3D, where a single network of comparable size cannot resolve all features simultaneously. The widget shows the mechanism — partition-of-unity windowing, analytic window derivatives, per-subnet PDE-residual flow — in the smallest setting where it can be implemented. The pedagogical purpose is the construction; the empirical advantage is unlocked by scale.

Why FBPINN scales

The killer feature of FBPINN is parallelism. Each subnet trains independently except where its window overlaps its neighbours', which is the only place the partition of unity couples them. On a 100-window decomposition, each subnet can run on its own GPU (Moseley et al. 2023 demonstrate up to 1024-window decompositions on multi-GPU clusters). Of the techniques in this Part, domain decomposition is the one with a credible path to billion-parameter PDE problems, which is roughly the cost of a 3D acoustic FWI on a Sleipner-class survey.
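The locality behind this parallelism can be illustrated with a small NumPy sketch that assigns each collocation point to the subnets whose windows contain it. It is not a multi-GPU implementation; the 100-window layout, the overlap factor of 1.5 and the function name `points_per_subnet` are assumptions for illustration only.

```python
import numpy as np

def points_per_subnet(x, centres, half_widths):
    """For each window k, the indices of the points where rho_k(x) > 0,
    plus the boolean (K, N) membership mask."""
    inside = np.abs(x[None, :] - centres[:, None]) < half_widths[:, None]
    return [np.flatnonzero(row) for row in inside], inside

# Assumed layout: 100 equally spaced windows on [-1, 1].
K = 100
centres = np.linspace(-1.0, 1.0, K)
half_widths = np.full(K, 1.5 * (centres[1] - centres[0]))

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 10_000)  # random collocation points

batches, inside = points_per_subnet(x, centres, half_widths)
active = inside.sum(axis=0)              # number of subnets active at each point
largest_batch = max(len(b) for b in batches)
```

Each entry of `batches` is the only data its subnet needs to see, so a worker holding subnet `k` evaluates a small fraction of the collocation set; the coupling between workers is confined to points that appear in more than one batch.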

Choosing windows

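Number $K$: enough that each window resolves a single dominant feature. For a multi-bump 1D problem, $K$ = number of bumps + a few extras.
Half-width $h$: large enough that adjacent windows overlap on at least one collocation point and that every point of the domain lies strictly inside at least one window, so that $\sum_j \rho_j(x) > 0$ and the weights are well defined. Smaller $h$ means more specialisation per subnet, but more subnets are needed to cover the domain.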
Subnet capacity: small networks are fine; the windowing is the architectural lift, not the subnet width.
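The guidelines above can be turned into a small helper, sketched here under the assumption of equally spaced centres and a single half-width expressed as a multiple of the spacing. The names `uniform_windows` and `min_coverage` and the default overlap factor are hypothetical choices, not part of any FBPINN library.

```python
import numpy as np

def uniform_windows(a, b, K, overlap=1.5):
    """K equally spaced centres on [a, b]; half-width = overlap * spacing.

    overlap must exceed 0.5: at exactly 0.5 the midpoint between two centres
    lies on the edge of both windows, where every rho_k vanishes and the
    partition-of-unity normalisation would divide by zero.
    """
    if K < 2:
        raise ValueError("need at least two windows")
    if overlap <= 0.5:
        raise ValueError("overlap must be greater than 0.5")
    centres = np.linspace(a, b, K)
    spacing = (b - a) / (K - 1)
    return centres, np.full(K, overlap * spacing)

def min_coverage(x, centres, half_widths):
    """Smallest number of windows strictly containing any of the points x."""
    inside = np.abs(x[None, :] - centres[:, None]) < half_widths[:, None]
    return int(inside.sum(axis=0).min())

# Five bumps plus one extra window, checked on a dense grid.
centres, half_widths = uniform_windows(-1.0, 1.0, K=6)
x = np.linspace(-1.0, 1.0, 2001)
coverage = min_coverage(x, centres, half_widths)
```

A `coverage` of at least 1 guarantees the weights are well defined on the sampled points; a value of 2 or more means neighbouring subnets always share some points, which is what lets the blended solution stay smooth across windows.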

Expertise checkpoint — end of Part 3

You should now be able to:

Diagnose a stalled PINN training by reading the per-term loss trace, identifying the named pathology, and reaching for the correct remedy.
Implement loss-weight tuning, NTK-balanced weighting, gradient-norm balancing, SA-PINN, causality weighting, frequency continuation, RAR/RAD, and FBPINN — at least on toy problems.
Decide which remedies a real seismic-PINN problem needs (typically NTK + causality + RAR + curriculum together).
Defend why FBPINN is the recommended decomposition for large 2D and 3D wave equations.
Critique training-failure complaints in any current PINN paper using the language of this Part.

Part 4 sets up the wave equations in PINN form. Many of the tools from this Part will be visible in every demo there.

References

Jagtap, A.D., Kawaguchi, K., Karniadakis, G.E. (2020). Conservative physics-informed neural networks on discrete domains for conservation laws. CMAME 365, 113028.
Jagtap, A.D., Karniadakis, G.E. (2020). Extended physics-informed neural networks (XPINNs). CICP 28(5), 2002–2041.
Moseley, B., Markham, A., Nissen-Meyer, T. (2023). Finite basis physics-informed neural networks (FBPINNs). Adv. Comput. Math. 49, 62.
Dolean, V., Heinlein, A., Mishra, S., Moseley, B. (2024). Multi-level FBPINNs for high-frequency multi-scale problems. JCP.