Dispersion-curve and joint inversion

Part 7 — Travel-time, surface-wave, and joint inversion

Learning objectives

Understand how surface-wave dispersion encodes depth-dependent V_s structure
Derive the linearised forward model: c_R(f) as a kernel-weighted depth average of V_s
Run a 1-D V_s(z) inversion from a synthetic dispersion curve end-to-end
Recognise the depth-resolution envelope: shallow well-resolved, deep less so
Sketch the joint Vp/Vs/Q inversion architecture used in modern PINN approaches

The first five sections of Part 7 dealt with body-wave first-arrival times — the eikonal regime. Surface waves obey a different physics: they propagate ALONG the free surface with frequency-dependent phase velocity, and that frequency dependence — DISPERSION — encodes the depth-dependent shear-wave velocity structure. This section closes Part 7 by showing how dispersion curves are inverted for V_s(z), and how dispersion data combine with the §7.4-§7.5 travel-time data in modern joint inversion.

The physics of dispersion

For a horizontally-stratified Earth, fundamental-mode Rayleigh waves of angular frequency $\omega = 2\pi f$ have a phase velocity $c_R(\omega)$ determined by the SECULAR EQUATION — a transcendental relation between $c_R$ , $\omega$ , and the layer parameters $(V_p, V_s, \rho, h)$ . For each frequency, the wave propagates in a "skin" of depth roughly $\lambda_R / 2 = c_R / (2f)$ : high frequencies sense shallow structure, low frequencies penetrate deep.

The full secular equation is non-linear and requires the Haskell-Thomson propagator-matrix method to evaluate. For pedagogy and in-browser computation, we use the LINEARISED form known to first-order approximation and to first principles in the asymptotic small-perturbation limit:

c_R(f) \approx \frac{\int_0^{\infty} K_R(z; f) \, V_s(z) \, dz}{\int_0^{\infty} K_R(z; f) \, dz}

where the depth-sensitivity kernel $K_R(z; f)$ has the form (in the half-space approximation around a reference Rayleigh velocity $V_R^{\mathrm{ref}}$ ):

K_R(z; f) \propto z \, \exp\!\Bigl( - \frac{2 f}{V_R^{\mathrm{ref}}} \, z \Bigr) \quad \Rightarrow \quad \text{peak at } z^* = \frac{V_R^{\mathrm{ref}}}{2f}.

Since the kernel is positive-valued and peaks at $z^*(f)$ , c_R(f) is a SMOOTH-AVERAGED V_s over the depth range probed by frequency f. The widget below shows this kernel set explicitly: low frequencies have wide kernels reaching deep; high frequencies have narrow kernels concentrated near the surface.

The inverse problem

The forward model

c_R(f_i) = \sum_z K_R(z; f_i) \cdot V_s(z) \cdot \Delta z

is LINEAR in V_s. With M observed phase velocities at frequencies ${f_i}$ and N depth nodes ${z_j}$ , this is an M × N linear system. Inversion reduces to solving

\min_{V_s} \bigl\| K \, V_s - c_{\mathrm{obs}} \bigr\|^2 + \lambda \bigl\| L \, V_s \bigr\|^2

where $L$ is a 1st-derivative matrix penalising rough V_s profiles. Damped least-squares solves the normal equations

(K^T K + \lambda L^T L) \, V_s = K^T c_{\mathrm{obs}}

via Gauss-Seidel sweeps. We iterate to handle the constraint $V_s \in [V_{\min}, V_{\max}]$ (positivity and physical bounds) by clipping after each step.

Try it: 1-D Vs(z) from synthetic dispersion

The widget runs:

True V_s(z) profile: 4-layer near-surface model: 400 m/s topsoil (0-50 m), transition layer to 950 m/s (50-120 m), mid-stiffness 1600 m/s (120-220 m), and 2100 m/s bedrock below (220-400 m). Layer interfaces are placed within the kernel-sensing depth range so the dispersion data has constraining power on each one. Realistic engineering-scale MASW (Multichannel Analysis of Surface Waves) target.
Forward modelling: 8 frequencies log-spaced over 4-40 Hz (engineering MASW band). Compute kernel-weighted c_R(f) using the formula above. These are the "observations" we will invert.
Inversion: Damped LS with λ = 1e-3 and 1st-derivative smoothness 0.5. 50 outer iterations with 5 Gauss-Seidel sweeps each. Initial guess: uniform 1.0 km/s.

Four panels: V_s(z) profile (true cyan vs recovered orange), dispersion curve (observed dots vs predicted line), the depth-sensitivity kernels K_R(z; f) (one per frequency, low-f purple to high-f yellow), and the RMS-residual trace.

Expected behaviour:

RMS dispersion residual drops from ~200 m/s (initial 1.0 km/s prior) to ~5 m/s after 50 iterations — 40× reduction.
The shallow soft layer (0-100 m) is recovered most accurately because the highest-frequency kernels concentrate there.
The deepest layer (z > 400 m) is partially undershot — the lowest-frequency kernel (4 Hz) only weakly senses below 200 m. To resolve deeper, you would need lower frequencies (1-2 Hz, common in passive ambient-noise tomography but harder to extract).
The recovered profile is SMOOTH — sharp interfaces are blurred. This is fundamental to the band-limited kernel basis: structure finer than the kernel resolution at each depth is unresolvable.

Joint inversion: combining dispersion + travel-time + microseismic

Each data type has different sensitivity to subsurface parameters:

Travel-time tomography (§7.4): sensitive to $1/c$ along ray paths. For first-arrivals these are mostly $V_p$ (compressional). Gives broad-volume $V_p$ structure.
Surface-wave dispersion (this section): sensitive to $V_s$ via the kernel above. Different depth penetration per frequency. Gives $V_s(z)$ from a single station — no source-receiver geometry needed.
Microseismic catalogues (§7.5): sensitive to absolute timing (origin) and relative location. With independent V_p and V_s ratios constrains $V_p/V_s$ via differential $T_S - T_P$ times.

JOINT INVERSION combines all three under a unified $(V_p, V_s, \rho, Q)$ model:

\mathcal{L}_{\mathrm{joint}}(\theta_{V_p}, \theta_{V_s}, \ldots) = \mathcal{L}_{\mathrm{tt}}^{V_p} + \mathcal{L}_{\mathrm{disp}}^{V_s} + \mathcal{L}_{\mathrm{micro}}^{V_p / V_s} + \mathcal{L}_{\mathrm{eikonal\,res}}^{V_p, V_s} + \mathcal{L}_{\mathrm{secular}}^{V_s} + \mathcal{L}_{\mathrm{prior}}

where each loss term is the data residual or PDE residual for the appropriate measurement and physics. Modern PINN-based joint inversions parameterise $V_p$ , $V_s$ as small MLPs, evaluate physics residuals analytically through auto-diff, and train all networks JOINTLY via a single Adam optimisation. Three concrete advantages over running three separate inversions and averaging:

Physical consistency. The joint network has only ONE $V_p(x)$ and ONE $V_s(x)$ , automatically obeying any fixed ratio constraints (Poisson-solid $V_p/V_s = \sqrt{3}$ in dry rock, lower in saturated rock).
Cross-data regularisation. Where dispersion is poorly constrained (deep layers below the lowest frequency's kernel reach), travel-time data may compensate. Where travel-times have few rays (poorly illuminated regions of the survey), dispersion data fill in.
Uncertainty quantification. With a Bayesian posterior over the joint network parameters (HMC, Stein VI, ensemble methods), you get a JOINT posterior over $(V_p, V_s)$ , properly accounting for the trade-offs between data types.

The downside: full joint PINN inversion is computationally heavy (hours-to-days on a single GPU) and requires careful loss-weight scheduling, since the data terms have very different magnitudes (travel-time residuals ~10 ms, dispersion residuals ~10 m/s, S-wave delay-times ~100 ms). Standard practice is to normalise each loss by its initial value, then track convergence on the per-component RMS rather than the total loss.

Looking ahead: Part 8 (operator learning)

Both PINN-FATT and PINN-dispersion solve a SPECIFIC PROBLEM (one velocity model, one survey geometry). To re-invert with different geometry, you re-train. Part 8 introduces OPERATOR LEARNING (DeepONet, FNO) which trains a neural operator to MAP velocity fields to travel-time fields directly — once trained, evaluation on a new velocity model is a single forward pass. For real-time monitoring (live earthquake catalogues, time-lapse reservoir surveys) this is transformative.

Operational notes

Mode separation matters. Real surface-wave records contain fundamental + higher modes. The fundamental dominates at low frequencies but at high frequencies (>20 Hz) higher modes become comparable. f-k filtering or Beamforming-based picking is essential before using the dispersion data — naive picking will pick a mix of modes and produce artefacts in the recovered V_s.
Scale matters for kernels. The kernel form used here is the small-perturbation half-space approximation. For strong contrasts (e.g., bedrock under thick soft soils, V_s ratio > 3), the linearised kernel is ~10% off and a nonlinear secular-equation update each iteration is needed for production accuracy.
Ambient-noise tomography (Shapiro-Campillo 2004) extracts the same dispersion data without active sources by cross-correlating long noise records between station pairs. The interstation Green's function emerges from the diffuse wavefield and inverts the same way the active-source dispersion data here do. This made high-resolution V_s(x, y, z) of crusts and lithospheres globally affordable.

References

Aki, K., Richards, P.G. (2002). Quantitative Seismology, 2nd ed., Univ. Science Books. Chapter 7 covers fundamental and higher-mode surface waves and the Haskell-Thomson method.
Park, C.B., Miller, R.D., Xia, J. (1999). Multichannel analysis of surface waves. Geophysics 64(3), 800–808. The MASW method that established surface-wave dispersion as a standard near-surface tool.
Shapiro, N.M., Campillo, M. (2004). Emergence of broadband Rayleigh waves from correlations of the ambient seismic noise. GRL 31, L07614. Ambient-noise tomography foundation paper.
Snieder, R., Trampert, J. (1999). Inverse problems in geophysics, in Wavefield Inversion, A. Wirgin (ed.), Springer. Linear inverse-problem theory for dispersion + travel-time inversion.
Pan, Y., Xia, J., Xu, Y., Gao, L. (2016). Multichannel analysis of Love waves in a 3D seismic acquisition system. Geophysics 81(5), EN67–EN74.

Part 7 wrap-up

Part 7 has covered the full travel-time and surface-wave PINN stack:

§7.1 Eikonal physics + Fast Sweeping Method as the classical reference solver.
§7.2 Vanilla EikoNet PINN — and its trivial-zero trap and ~5% accuracy plateau.
§7.3 Factored eikonal — order-of-magnitude accuracy improvement via T = T_0 · τ.
§7.4 Travel-time tomography (SIRT) — inverse for slowness from arrival times.
§7.5 Microseismic event location (reciprocity) — inverse for source location.
§7.6 (this section) Surface-wave dispersion inversion + joint-inversion architecture.

The reader who completes Part 7 has the toolkit to read any modern paper on PINN-based seismic-imaging or hypocentre-inversion methods. Part 8 takes the next step: instead of training one network for one specific problem, train a neural OPERATOR that maps velocity fields to travel-time fields, useful across many problems without retraining.