Dispersion-curve and joint inversion
Learning objectives
- Understand how surface-wave dispersion encodes depth-dependent V_s structure
- Derive the linearised forward model: c_R(f) as a kernel-weighted depth average of V_s
- Run a 1-D V_s(z) inversion from a synthetic dispersion curve end-to-end
- Recognise the depth-resolution envelope: shallow well-resolved, deep less so
- Sketch the joint Vp/Vs/Q inversion architecture used in modern PINN approaches
The first five sections of Part 7 dealt with body-wave first-arrival times — the eikonal regime. Surface waves obey a different physics: they propagate ALONG the free surface with frequency-dependent phase velocity, and that frequency dependence — DISPERSION — encodes the depth-dependent shear-wave velocity structure. This section closes Part 7 by showing how dispersion curves are inverted for V_s(z), and how dispersion data combine with the §7.4-§7.5 travel-time data in modern joint inversion.
The physics of dispersion
For a horizontally-stratified Earth, fundamental-mode Rayleigh waves of angular frequency have a phase velocity determined by the SECULAR EQUATION — a transcendental relation between , , and the layer parameters . For each frequency, the wave propagates in a "skin" of depth roughly : high frequencies sense shallow structure, low frequencies penetrate deep.
The full secular equation is non-linear and requires the Haskell-Thomson propagator-matrix method to evaluate. For pedagogy and in-browser computation, we use the LINEARISED form known to first-order approximation and to first principles in the asymptotic small-perturbation limit:
where the depth-sensitivity kernel has the form (in the half-space approximation around a reference Rayleigh velocity ):
Since the kernel is positive-valued and peaks at , c_R(f) is a SMOOTH-AVERAGED V_s over the depth range probed by frequency f. The widget below shows this kernel set explicitly: low frequencies have wide kernels reaching deep; high frequencies have narrow kernels concentrated near the surface.
The inverse problem
The forward model
is LINEAR in V_s. With M observed phase velocities at frequencies and N depth nodes , this is an M × N linear system. Inversion reduces to solving
where is a 1st-derivative matrix penalising rough V_s profiles. Damped least-squares solves the normal equations
via Gauss-Seidel sweeps. We iterate to handle the constraint (positivity and physical bounds) by clipping after each step.
Try it: 1-D Vs(z) from synthetic dispersion
The widget runs:
- True V_s(z) profile: 4-layer near-surface model: 400 m/s topsoil (0-50 m), transition layer to 950 m/s (50-120 m), mid-stiffness 1600 m/s (120-220 m), and 2100 m/s bedrock below (220-400 m). Layer interfaces are placed within the kernel-sensing depth range so the dispersion data has constraining power on each one. Realistic engineering-scale MASW (Multichannel Analysis of Surface Waves) target.
- Forward modelling: 8 frequencies log-spaced over 4-40 Hz (engineering MASW band). Compute kernel-weighted c_R(f) using the formula above. These are the "observations" we will invert.
- Inversion: Damped LS with λ = 1e-3 and 1st-derivative smoothness 0.5. 50 outer iterations with 5 Gauss-Seidel sweeps each. Initial guess: uniform 1.0 km/s.
Four panels: V_s(z) profile (true cyan vs recovered orange), dispersion curve (observed dots vs predicted line), the depth-sensitivity kernels K_R(z; f) (one per frequency, low-f purple to high-f yellow), and the RMS-residual trace.
Expected behaviour:
- RMS dispersion residual drops from ~200 m/s (initial 1.0 km/s prior) to ~5 m/s after 50 iterations — 40× reduction.
- The shallow soft layer (0-100 m) is recovered most accurately because the highest-frequency kernels concentrate there.
- The deepest layer (z > 400 m) is partially undershot — the lowest-frequency kernel (4 Hz) only weakly senses below 200 m. To resolve deeper, you would need lower frequencies (1-2 Hz, common in passive ambient-noise tomography but harder to extract).
- The recovered profile is SMOOTH — sharp interfaces are blurred. This is fundamental to the band-limited kernel basis: structure finer than the kernel resolution at each depth is unresolvable.
Joint inversion: combining dispersion + travel-time + microseismic
Each data type has different sensitivity to subsurface parameters:
- Travel-time tomography (§7.4): sensitive to along ray paths. For first-arrivals these are mostly (compressional). Gives broad-volume structure.
- Surface-wave dispersion (this section): sensitive to via the kernel above. Different depth penetration per frequency. Gives from a single station — no source-receiver geometry needed.
- Microseismic catalogues (§7.5): sensitive to absolute timing (origin) and relative location. With independent V_p and V_s ratios constrains via differential times.
JOINT INVERSION combines all three under a unified model:
where each loss term is the data residual or PDE residual for the appropriate measurement and physics. Modern PINN-based joint inversions parameterise , as small MLPs, evaluate physics residuals analytically through auto-diff, and train all networks JOINTLY via a single Adam optimisation. Three concrete advantages over running three separate inversions and averaging:
- Physical consistency. The joint network has only ONE and ONE , automatically obeying any fixed ratio constraints (Poisson-solid in dry rock, lower in saturated rock).
- Cross-data regularisation. Where dispersion is poorly constrained (deep layers below the lowest frequency's kernel reach), travel-time data may compensate. Where travel-times have few rays (poorly illuminated regions of the survey), dispersion data fill in.
- Uncertainty quantification. With a Bayesian posterior over the joint network parameters (HMC, Stein VI, ensemble methods), you get a JOINT posterior over , properly accounting for the trade-offs between data types.
The downside: full joint PINN inversion is computationally heavy (hours-to-days on a single GPU) and requires careful loss-weight scheduling, since the data terms have very different magnitudes (travel-time residuals ~10 ms, dispersion residuals ~10 m/s, S-wave delay-times ~100 ms). Standard practice is to normalise each loss by its initial value, then track convergence on the per-component RMS rather than the total loss.
Looking ahead: Part 8 (operator learning)
Both PINN-FATT and PINN-dispersion solve a SPECIFIC PROBLEM (one velocity model, one survey geometry). To re-invert with different geometry, you re-train. Part 8 introduces OPERATOR LEARNING (DeepONet, FNO) which trains a neural operator to MAP velocity fields to travel-time fields directly — once trained, evaluation on a new velocity model is a single forward pass. For real-time monitoring (live earthquake catalogues, time-lapse reservoir surveys) this is transformative.
Operational notes
- Mode separation matters. Real surface-wave records contain fundamental + higher modes. The fundamental dominates at low frequencies but at high frequencies (>20 Hz) higher modes become comparable. f-k filtering or Beamforming-based picking is essential before using the dispersion data — naive picking will pick a mix of modes and produce artefacts in the recovered V_s.
- Scale matters for kernels. The kernel form used here is the small-perturbation half-space approximation. For strong contrasts (e.g., bedrock under thick soft soils, V_s ratio > 3), the linearised kernel is ~10% off and a nonlinear secular-equation update each iteration is needed for production accuracy.
- Ambient-noise tomography (Shapiro-Campillo 2004) extracts the same dispersion data without active sources by cross-correlating long noise records between station pairs. The interstation Green's function emerges from the diffuse wavefield and inverts the same way the active-source dispersion data here do. This made high-resolution V_s(x, y, z) of crusts and lithospheres globally affordable.
References
- Aki, K., Richards, P.G. (2002). Quantitative Seismology, 2nd ed., Univ. Science Books. Chapter 7 covers fundamental and higher-mode surface waves and the Haskell-Thomson method.
- Park, C.B., Miller, R.D., Xia, J. (1999). Multichannel analysis of surface waves. Geophysics 64(3), 800–808. The MASW method that established surface-wave dispersion as a standard near-surface tool.
- Shapiro, N.M., Campillo, M. (2004). Emergence of broadband Rayleigh waves from correlations of the ambient seismic noise. GRL 31, L07614. Ambient-noise tomography foundation paper.
- Snieder, R., Trampert, J. (1999). Inverse problems in geophysics, in Wavefield Inversion, A. Wirgin (ed.), Springer. Linear inverse-problem theory for dispersion + travel-time inversion.
- Pan, Y., Xia, J., Xu, Y., Gao, L. (2016). Multichannel analysis of Love waves in a 3D seismic acquisition system. Geophysics 81(5), EN67–EN74.
Part 7 wrap-up
Part 7 has covered the full travel-time and surface-wave PINN stack:
- §7.1 Eikonal physics + Fast Sweeping Method as the classical reference solver.
- §7.2 Vanilla EikoNet PINN — and its trivial-zero trap and ~5% accuracy plateau.
- §7.3 Factored eikonal — order-of-magnitude accuracy improvement via T = T_0 · τ.
- §7.4 Travel-time tomography (SIRT) — inverse for slowness from arrival times.
- §7.5 Microseismic event location (reciprocity) — inverse for source location.
- §7.6 (this section) Surface-wave dispersion inversion + joint-inversion architecture.
The reader who completes Part 7 has the toolkit to read any modern paper on PINN-based seismic-imaging or hypocentre-inversion methods. Part 8 takes the next step: instead of training one network for one specific problem, train a neural OPERATOR that maps velocity fields to travel-time fields, useful across many problems without retraining.