FWI in practice: low-frequency strategies

Part 6 — Full-Waveform Inversion

Learning objectives

Explain multi-scale frequency continuation as basin-chaining
List the production tricks that stretch usable bandwidth downward (low-cut relaxation, OBN, long offsets)
Describe envelope-FWI and AWI as misfits whose basin is wider than that of the L2 misfit
Identify when each trick is the right one to apply

§6.1 made the problem concrete: the L2 misfit $J(V)$ has a central basin around the true model whose half-width, measured as travel-time error, is half a period, $1/(2f)$. If your initial model is farther from truth than that half-period, gradient descent converges to a cycle-skipped solution. Every low-frequency strategy in production FWI is an engineering response to this constraint: either broaden the basin (by going lower in frequency, or by changing the misfit) or shrink the distance between initial model and truth (by starting from a better model).
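
To make the half-period rule concrete, here is a minimal sketch for a single-arrival 1D toy. The 2000 m path length and 2000 m/s true velocity are assumptions chosen for illustration (the text does not state the toy's geometry), and `basin_edges` is a hypothetical helper name. With those assumed values the edges match the basin ranges quoted for the toy in the next section.

```python
def basin_edges(v_true, path_length, freq_hz):
    """Velocities whose travel-time error equals plus or minus half a period."""
    t_true = path_length / v_true
    half_period = 1.0 / (2.0 * freq_hz)
    if half_period >= t_true:
        raise ValueError("half period exceeds travel time; upper edge is unbounded")
    v_low = path_length / (t_true + half_period)   # slower model, late arrival
    v_high = path_length / (t_true - half_period)  # faster model, early arrival
    return v_low, v_high


V_TRUE = 2000.0  # m/s, assumed for illustration
PATH = 2000.0    # m, assumed for illustration

for f in (3.0, 10.0, 30.0):
    lo, hi = basin_edges(V_TRUE, PATH, f)
    print(f"{f:g} Hz: [{lo:.0f}, {hi:.0f}] m/s")
# 3 Hz: [1714, 2400] m/s
# 10 Hz: [1905, 2105] m/s
# 30 Hz: [1967, 2034] m/s
```

The basin is asymmetric in velocity because travel time varies as $1/V$: a fixed time error costs more velocity on the fast side than on the slow side.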

1. Multi-scale continuation — basin chaining

The widget plots $J(V)$ at 3 Hz, 10 Hz, and 30 Hz for the same 1D toy, whose true velocity is 2000 m/s. Pick $V_{guess} = 1500\ \text{m/s}$ (500 m/s off truth). At 3 Hz the basin runs [1714, 2400] m/s, so 1500 is cycle-skipped even at 3 Hz; shift up to 1800 and it is inside the 3 Hz basin. At 10 Hz the basin has collapsed to [1905, 2105] m/s, and both 1500 and 1800 are deeply cycle-skipped. At 30 Hz the basin spans only [1967, 2034] m/s, about 67 m/s wide. The implication is the entire multi-scale workflow:

Filter the observed data to the lowest available frequency band.
Invert at that low frequency until convergence — the initial model only needs to be within the wide 3 Hz basin.
Use the converged low-f model as the starting model for the next frequency band. Because the low-f inversion moved the model closer to truth, you are now inside the narrower basin at the next frequency up.
Repeat for 10 Hz, then 15 Hz, then 20 Hz, up to the maximum usable frequency of the data.

A production schedule typically uses 5–10 frequency bands, each inverted for 10–100 iterations before moving up. Computational cost grows at least linearly in the number of bands, since higher bands also need finer grids; the payoff is avoiding cycle skipping at every step.
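
The workflow above can be sketched on a monochromatic 1D toy. Everything here is an illustrative assumption rather than the widget's actual code: a single 2000 m path, a true velocity of 2000 m/s, the normalised misfit $1 - \cos(2\pi f\,\delta t)$ between two unit sinusoids, and a deliberately naive downhill walk standing in for gradient descent.

```python
import math

L_PATH = 2000.0  # m, assumed
V_TRUE = 2000.0  # m/s, assumed


def misfit(v, f):
    """Normalised L2 misfit between two unit sinusoids delayed by L/v and L/V_TRUE."""
    dt = L_PATH / v - L_PATH / V_TRUE
    return 1.0 - math.cos(2.0 * math.pi * f * dt)


def local_descent(v0, f, step=1.0, max_steps=10000):
    """Walk downhill in 1 m/s steps until neither neighbour is lower."""
    v = v0
    for _ in range(max_steps):
        here = misfit(v, f)
        down, up = misfit(v - step, f), misfit(v + step, f)
        if min(down, up) >= here:
            break  # local minimum: may or may not be the true model
        v = v - step if down < up else v + step
    return v


def multiscale(v0, bands):
    """Frequency continuation: each band starts from the previous band's result."""
    v = v0
    for f in bands:
        v = local_descent(v, f)
    return v


print(local_descent(1800.0, 10.0))            # 1818.0 (cycle-skipped)
print(multiscale(1800.0, (3.0, 10.0, 30.0)))  # 2000.0
```

Starting at 10 Hz, the walk stops at the nearest wrong minimum, one full period off. Starting at 3 Hz, the same walk reaches the true model and the higher bands have nothing left to fix. A noise-free toy converges exactly in the first band; on field data the low band recovers only the long-wavelength part, which is why the later bands are still needed.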

2. Stretching the bottom of the bandwidth

The single most valuable piece of a marine survey for FWI is the few Hz at the lowest end of the spectrum. Every production technique below is about buying more of that:

Long offsets. Near-offset traces record steep-angle reflections with narrow frequency content. Far-offset traces pick up refracted and diving waves that carry low-f energy deep into the earth. 8–12 km streamers (vs 4–6 km standard) improve FWI dramatically.
Ocean-bottom nodes (OBN). Node receivers sit on the seafloor, avoiding the streamer notch from the ghost and picking up 1.5–3 Hz cleanly — a full octave below what streamers deliver. Many sub-salt FWI successes are from OBN data.
Broadband marine sources. Low-frequency source arrays (tuned air guns, marine vibrators) extend the acquired spectrum down to ~2 Hz. Combine them with nodes and you can run FWI at 1.5 Hz.
Accelerometer receivers. Conventional hydrophones measure pressure; accelerometers measure particle motion and have a lower noise floor at low frequencies, extending the signal band downward by 0.5–1 Hz.
Low-cut filter relaxation. Relaxing the low-cut of the processing flow (from, say, 5 Hz to 2.5 Hz) preserves the low-f information the later stages had been throwing away.
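
The effect of the low-cut setting can be illustrated with a short NumPy sketch. The brick-wall filter, the two-tone test trace, and the helper names `low_cut` and `amplitude_at` are all simplifying assumptions; a production flow would use a tapered filter on real data.

```python
import numpy as np


def low_cut(trace, dt, cutoff_hz):
    """Zero every frequency bin below cutoff_hz (brick-wall, for illustration only)."""
    spec = np.fft.rfft(trace)
    freqs = np.fft.rfftfreq(len(trace), d=dt)
    spec[freqs < cutoff_hz] = 0.0
    return np.fft.irfft(spec, n=len(trace))


def amplitude_at(trace, dt, freq_hz):
    """Amplitude of the spectral bin closest to freq_hz."""
    spec = np.abs(np.fft.rfft(trace)) * 2.0 / len(trace)
    freqs = np.fft.rfftfreq(len(trace), d=dt)
    return float(spec[np.argmin(np.abs(freqs - freq_hz))])


dt = 0.004                       # 4 ms sampling
t = np.arange(1000) * dt         # 4 s trace, 0.25 Hz bin spacing
trace = np.sin(2 * np.pi * 3.0 * t) + np.sin(2 * np.pi * 8.0 * t)

for cutoff in (5.0, 2.5):
    filtered = low_cut(trace, dt, cutoff)
    print(f"low-cut {cutoff} Hz: 3 Hz amplitude = {amplitude_at(filtered, dt, 3.0):.3f}")
```

With the 5 Hz low-cut the 3 Hz component is gone before FWI ever sees it; relaxing the cut to 2.5 Hz keeps it intact.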

3. Misfits whose basin is wider than L2

A different line of attack: change the shape of the misfit $J$ instead of changing the data. Two families are widely used:

Envelope FWI replaces each trace with its envelope (the magnitude of its analytic signal) before computing the misfit. The envelope is a slowly varying, non-oscillatory function, so its misfit has a wide, near-convex basin even at high frequency. The cost: the envelope is phase-blind, so an envelope-only inversion converges to a smooth, low-resolution velocity model. Use it as a first stage rather than a final one: run envelope FWI to move the model close, then switch to L2 FWI to recover the high-resolution detail.
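
A minimal NumPy sketch of the contrast, under stated assumptions: the trace is a 20 Hz Gaussian-windowed cosine, the "synthetic" is simply a time-shifted copy, and the envelope misfit is the plain L2 difference of envelopes (one of several variants in use). The function names are hypothetical.

```python
import numpy as np


def envelope(trace):
    """Magnitude of the analytic signal, built with an FFT-based Hilbert transform."""
    n = len(trace)
    weights = np.zeros(n)
    weights[0] = 1.0                 # keep DC
    if n % 2 == 0:
        weights[n // 2] = 1.0        # keep Nyquist
        weights[1:n // 2] = 2.0      # double the positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(trace) * weights))


def l2_misfit(a, b):
    return float(np.sum((a - b) ** 2))


def envelope_misfit(a, b):
    return l2_misfit(envelope(a), envelope(b))


dt = 0.0025                                   # s, so one 20 Hz period is 20 samples
t = (np.arange(800) - 400) * dt
f0, sigma = 20.0, 0.05                        # Hz and s, assumed for illustration
observed = np.exp(-0.5 * (t / sigma) ** 2) * np.cos(2 * np.pi * f0 * t)

shifts = [0, 10, 20, 30, 40]                  # samples: 0, 1/2, 1, 3/2, 2 periods
l2_curve = [l2_misfit(np.roll(observed, k), observed) for k in shifts]
env_curve = [envelope_misfit(np.roll(observed, k), observed) for k in shifts]

for k, a, b in zip(shifts, l2_curve, env_curve):
    print(f"shift {k:2d} samples: L2 = {a:7.3f}   envelope = {b:7.3f}")
```

The L2 column rises and falls as the shift passes through half and whole periods, which is exactly the set of local minima that traps gradient descent. The envelope column grows steadily with the shift, so its gradient keeps pointing back toward zero shift.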

Adaptive Waveform Inversion (AWI) uses a Wiener-filter-based misfit: compute the filter that maps synthetic data to observed data and penalise its deviation from a delta function at zero lag. This misfit is far less sensitive to the half-period misalignments that cycle-skip the L2 norm, yet recovers the same final model as L2 once close. Developed by Warner & Guasch (2016), it has become a common starting-stage misfit in deep-water sub-salt projects.
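
The idea can be sketched in a few lines of NumPy. This is an illustrative toy, not the published algorithm: the matching filter is computed by regularised spectral division with circular convolution, the penalty is the lag-weighted filter energy normalised by total filter energy, and `eps_rel`, `wiener_filter`, and `awi_misfit` are hypothetical names and choices.

```python
import numpy as np


def ricker(t, f0):
    a = (np.pi * f0 * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)


def wiener_filter(synthetic, observed, eps_rel=1e-3):
    """Least-squares filter w such that (w * synthetic) approximates observed."""
    S = np.fft.fft(synthetic)
    D = np.fft.fft(observed)
    power = np.abs(S) ** 2
    W = D * np.conj(S) / (power + eps_rel * power.max())  # damped division
    return np.real(np.fft.ifft(W))


def awi_misfit(synthetic, observed):
    """Lag-weighted filter energy: small when w is a spike at zero lag."""
    w = wiener_filter(synthetic, observed)
    n = len(w)
    lags = np.fft.fftfreq(n, d=1.0 / n)   # integer lags in FFT order
    return float(np.sum((lags * w) ** 2) / np.sum(w ** 2))


n, dt = 512, 0.004
t = (np.arange(n) - n // 2) * dt
synthetic = ricker(t, 10.0)

for k in (0, 20, 40):                      # observed = synthetic delayed by k samples
    observed = np.roll(synthetic, k)
    peak_lag = int(np.argmax(np.abs(wiener_filter(synthetic, observed))))
    print(f"delay {k:2d}: filter peak at lag {peak_lag}, misfit {awi_misfit(synthetic, observed):.1f}")
```

For a pure delay the filter is a band-limited spike sitting at that delay, so the misfit grows with the delay instead of oscillating with it. That is the property that keeps the early stages of an inversion from locking onto the wrong cycle.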

Other alternative misfits include correlation-coefficient matching, deconvolution-based misfits, and optimal-transport (Wasserstein) distances, each with a different trade-off between basin width and final resolution.

4. Data preconditioning

Time windowing. Early-arrival FWI fits only the waveforms in a short window after the first break (diving and refracted waves), producing a robust low-resolution model. Reflection FWI uses the later reflected energy for high-resolution detail.
Offset muting. Near-offset traces are dominated by reflections, which constrain sharp contrasts better than the smooth background velocity. Some production FWI starts with far-offset data only (the diving-wave regime) and adds near offsets later.
Azimuth selection. In wide-azimuth marine, each azimuth samples different parts of the subsurface. Inverting one azimuth at a time can avoid some cycle-skipping pathologies.
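
Time windowing and offset muting both reduce to masks applied to a shot gather before the misfit is computed. The sketch below is a hedged illustration: the function names, the 0.2 s window, the 4000 m offset threshold, and the 2000 m/s moveout used to fake first-break picks are all assumptions, and the gather is placeholder data.

```python
import numpy as np


def early_arrival_mask(first_break_s, n_samples, dt, window_s):
    """Boolean (n_traces, n_samples) mask keeping [t_fb, t_fb + window_s] per trace."""
    t = np.arange(n_samples) * dt
    fb = np.asarray(first_break_s, dtype=float)[:, None]
    return (t[None, :] >= fb) & (t[None, :] <= fb + window_s)


def far_offset_mask(offsets_m, min_offset_m):
    """Boolean per-trace mask keeping only traces at or beyond min_offset_m."""
    return np.abs(np.asarray(offsets_m, dtype=float)) >= min_offset_m


offsets = np.array([500.0, 2000.0, 6000.0, 10000.0])   # m
first_breaks = offsets / 2000.0      # s, hypothetical picks from an assumed moveout
gather = np.ones((4, 1500))          # placeholder data: 4 traces, 6 s at 4 ms

mask = early_arrival_mask(first_breaks, 1500, 0.004, window_s=0.2)
keep = far_offset_mask(offsets, 4000.0)

windowed = gather * mask             # early arrivals only
far_only = windowed[keep]            # and only the far-offset traces
print(far_only.shape, int(mask[0].sum()))
```

Later stages relax both masks: the window lengthens to admit reflections and the offset threshold drops to bring the near traces back in.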

5. Starting model quality

The other half of the cycle-skipping trade is starting closer. Investment in a better initial model directly reduces the need for aggressive low-f tricks:

High-quality tomography (§5.9) as the starting point. If the tomography model is within a few percent of truth everywhere, even 10 Hz FWI has its guess in the basin.
Well-log calibration. Tying the interval-velocity model to sonic logs at well locations constrains the model where logs exist; lateral extrapolation covers everywhere else.
Prior FWI on nearby surveys. Regional velocity models from 4D or legacy work provide a rough starting point that a freshly acquired dataset refines.
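
A first-order estimate shows why "a few percent" is the relevant scale. This is a sketch for a single arrival with travel time $T$, and the value $T = 1\ \text{s}$ used below is an assumption for illustration. A small velocity error $\delta V$ shifts the arrival by $\delta t$, and the shift must stay below half a period:

$$
|\delta t| \approx T\,\frac{|\delta V|}{V} < \frac{1}{2f}
\quad\Longrightarrow\quad
\frac{|\delta V|}{V} \lesssim \frac{1}{2fT}
$$

With $T = 1\ \text{s}$ the tolerance is about 17% at 3 Hz, 5% at 10 Hz, and 1.7% at 30 Hz. The product $fT$ is the number of cycles the wave travels, so longer paths tighten the tolerance just as higher frequencies do.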

6. A realistic production schedule

Putting it all together, a typical deep-water sub-salt FWI project might run:

Tomography + well ties → initial model smooth to ~200 m scale.
1.5–2.5 Hz envelope FWI for 50 iterations → model closer to truth.
2.5–4 Hz L2 FWI on filtered OBN data, 100 iterations.
4–6 Hz L2 FWI, 100 iterations.
6–9 Hz L2 FWI, 100 iterations.
9–15 Hz L2 FWI, 50 iterations.
15–20 Hz L2 FWI if possible, 30 iterations.

Each stage's model feeds into the next. The total wall-clock time on a modern GPU cluster is several weeks per 3D survey.
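
The schedule above maps naturally onto a small configuration structure. The sketch below is one possible encoding, not a real FWI package's API: `Stage`, `run_schedule`, and the `describe` stand-in are hypothetical names, and the stand-in only records each stage where a real driver would run the inversion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    f_min_hz: float
    f_max_hz: float
    misfit: str        # "envelope" or "l2"
    iterations: int


SCHEDULE = [
    Stage(1.5, 2.5, "envelope", 50),
    Stage(2.5, 4.0, "l2", 100),
    Stage(4.0, 6.0, "l2", 100),
    Stage(6.0, 9.0, "l2", 100),
    Stage(9.0, 15.0, "l2", 50),
    Stage(15.0, 20.0, "l2", 30),
]


def bands_are_contiguous(schedule):
    """Each band should start where the previous one ended."""
    return all(a.f_max_hz == b.f_min_hz for a, b in zip(schedule, schedule[1:]))


def run_schedule(model, schedule, run_stage):
    """Feed each stage's output model into the next stage."""
    for stage in schedule:
        model = run_stage(model, stage)
    return model


def describe(model, stage):
    # Hypothetical stand-in for an inversion stage: append a log entry.
    label = f"{stage.f_min_hz:g}-{stage.f_max_hz:g} Hz {stage.misfit} x{stage.iterations}"
    return model + [label]


history = run_schedule(["tomography + well ties"], SCHEDULE, describe)
print("\n".join(history))
print("total iterations:", sum(s.iterations for s in SCHEDULE))  # 430
```

Keeping the schedule as data makes it easy to check (for example, that no band is skipped) and to drop the top stage when the data do not support 15–20 Hz.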

The one sentence to remember

Low-frequency FWI strategies are all about getting the starting model inside the basin of attraction: widen the basin by frequency continuation (basin chaining), by acquiring lower-f data (OBN, long offsets, broadband sources), or by changing the misfit (envelope, AWI), or shrink the distance by starting closer (better tomography). Production FWI uses every one of them in combination.

Where this goes next

§6.3 turns to the other cost axis: computation. A naive FWI at 6 Hz on a 3D survey costs thousands of GPU-hours per iteration. Source encoding, shot selection, and stochastic FWI collapse that cost by an order of magnitude.
