ML in processing: where it fits

Part 9 — Machine Learning in Processing

Learning objectives

Classify processing tasks by whether ML, physics, or a hybrid approach dominates
Identify the data-quality and label-availability constraints that determine ML success
Recognise the common ML architectures used in seismic processing and the tasks they fit
Understand why ML complements rather than replaces physics-based processing

Machine learning has entered every corner of seismic processing over the past decade, but it has not replaced classical processing — it has augmented it. The question for a practitioner is not "should I use ML?" but "which tasks in my flow benefit from ML, which stay better with physics, and where is a hybrid approach the right answer?" This section maps that landscape before the next four sections dive into specific ML methods.

1. The three regimes

Physics-dominated tasks. Migration, FWI velocity updates, deghosting. These have precise physics (wave equation, reflection coefficients, ghost impulse response) and well-validated numerical methods. ML offers no fundamental advantage here; at best it accelerates inner loops. Do not replace RTM with a neural net — you lose the physics guarantee and gain nothing.
ML-dominated tasks. First-break picking, fault interpretation, horizon tracking. These are pattern-recognition problems with noisy training data and no clean analytical solution. Classical methods (edge detectors, correlation-based picking) are heuristic; ML (U-Net, transformer) routinely beats them on both accuracy and throughput.
Hybrid tasks. Velocity analysis, 4D matching, residual moveout. These have decent physics-based baselines but benefit from ML refinement — typically ML-assisted auto-picking of features that a classical semblance-style method identified first.

2. The scorecard

Ten processing tasks scored on two axes: how strong the physics-based approach is, and how much value ML adds. The verdict column reflects current production practice circa 2025.

Notice the pattern: tasks with high physics maturity (migration, FWI) are physics-dominated. Tasks with low physics maturity (fault labelling, horizon picking) are ML-dominated. The middle ground — where physics is decent but ML adds measurable value — is where most of the innovation happens.
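The regime assignments can be held in a small lookup table and filtered by verdict. The sketch below is illustrative only: it uses the nine tasks named in section 1 rather than the full ten-task scorecard, it omits numeric scores because none are given in the text, and the names `Regime`, `VERDICTS` and `filter_by_verdict` are hypothetical.

```python
from enum import Enum


class Regime(Enum):
    PHYSICS = "physics-dominated"
    ML = "ML-dominated"
    HYBRID = "hybrid"


# Regime assignments copied from the three-regime list in section 1.
# Only nine tasks are named there; no numeric scores are assumed.
VERDICTS = {
    "migration": Regime.PHYSICS,
    "FWI velocity updates": Regime.PHYSICS,
    "deghosting": Regime.PHYSICS,
    "first-break picking": Regime.ML,
    "fault interpretation": Regime.ML,
    "horizon tracking": Regime.ML,
    "velocity analysis": Regime.HYBRID,
    "4D matching": Regime.HYBRID,
    "residual moveout": Regime.HYBRID,
}


def filter_by_verdict(verdicts, regime):
    """Return the task names assigned to one regime, in insertion order."""
    return [task for task, r in verdicts.items() if r is regime]


hybrid_tasks = filter_by_verdict(VERDICTS, Regime.HYBRID)
print(hybrid_tasks)
```

Keeping the verdict as an enum rather than a free-text string makes it harder for a typo to create a fourth regime by accident.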

3. ML architectures in seismic

U-Net. The workhorse. Encoder–decoder CNN with skip connections. Used for: denoising, interpolation, first-break picking, fault labelling, horizon picking. Typically 5–10 million parameters, trained on 2D patches.
DnCNN / Noise2Noise. CNN-based denoisers. DnCNN is trained with supervision on noisy–clean pairs and predicts the noise residual; Noise2Noise (N2N) trains on noisy–noisy pairs only, so it needs no clean targets. N2N is important when clean ground-truth seismic is unavailable.
Transformers. Attention-based architectures now used for global-context tasks (fault prediction on large sections, velocity model building). Computationally expensive but superior for long-range correlations.
Physics-informed neural networks (PINNs). Enforce the wave equation as a loss term during training. Used in emerging FWI acceleration workflows; still at the research stage rather than in routine production use.
Diffusion models. Recent arrivals; used for generative trace-reconstruction and prior-informed FWI initialisation.
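To make the PINN idea concrete, the sketch below shows what "the wave equation as a loss term" can look like for the 1D constant-velocity case u_tt = c² u_xx. It is a simplification under stated assumptions: the residual is evaluated with finite differences on a gridded wavefield rather than on a network output (a real PINN would typically obtain the derivatives by automatic differentiation), and the function names, grid steps and velocities are hypothetical.

```python
import numpy as np


def wave_residual_loss(u, dt, dx, c):
    """Mean squared residual of u_tt - c^2 u_xx on interior grid points.

    u has shape (nt, nx): time along axis 0, space along axis 1.
    """
    u_tt = (u[2:, 1:-1] - 2.0 * u[1:-1, 1:-1] + u[:-2, 1:-1]) / dt**2
    u_xx = (u[1:-1, 2:] - 2.0 * u[1:-1, 1:-1] + u[1:-1, :-2]) / dx**2
    return float(np.mean((u_tt - c**2 * u_xx) ** 2))


def total_loss(u_pred, u_obs, dt, dx, c, weight=1.0):
    """Data misfit plus weighted physics residual.

    The two terms have different units, so `weight` must be tuned.
    """
    data_term = float(np.mean((u_pred - u_obs) ** 2))
    return data_term + weight * wave_residual_loss(u_pred, dt, dx, c)


def travelling_pulse(t, x, speed, x0=500.0, sigma=100.0):
    """Gaussian pulse moving in +x at the given speed; shape (nt, nx)."""
    return np.exp(-(((x[None, :] - speed * t[:, None]) - x0) / sigma) ** 2)


c, dx, dt = 2000.0, 10.0, 0.001  # hypothetical medium velocity and grid steps
x = np.arange(0.0, 2000.0 + dx, dx)
t = np.arange(0.0, 0.5 + dt / 2, dt)
u_true = travelling_pulse(t, x, speed=c)        # solves the wave equation for c
u_wrong = travelling_pulse(t, x, speed=1500.0)  # same shape, wrong speed

loss_true = wave_residual_loss(u_true, dt, dx, c)
loss_wrong = wave_residual_loss(u_wrong, dt, dx, c)
print(loss_true, loss_wrong)
```

A wavefield that obeys the equation leaves only finite-difference truncation error in the residual, while the wrong-speed pulse is penalised heavily. That penalty is what steers a PINN towards physically consistent outputs during training.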

4. The training-data problem

The single largest determinant of ML success in seismic processing is training data. Key constraints:

Labelled seismic is scarce and expensive. Human interpreters label faults and horizons one line at a time.
Proprietary surveys. Most production seismic cannot be shared, limiting public training sets.
Domain shift. A model trained on North Sea data may underperform on Gulf of Mexico data because wavelets, noise signatures, and geology differ.
Synthetic training + real-data fine-tuning. Standard workaround: generate millions of synthetic examples with known ground truth, pre-train on those, then fine-tune with a small labelled real subset.
Self-supervised learning. Noise2Noise and related approaches avoid the need for clean ground truth entirely: they train on pairs of noisy realisations of the same underlying signal, which works when the noise in the two realisations is independent and zero-mean.
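The reason noisy targets can stand in for clean ones is statistical: under a least-squares loss, zero-mean noise in the target that is independent of the input averages out of the optimum. The sketch below illustrates this with a deliberately tiny "denoiser", a single scalar gain fitted by least squares instead of a CNN. The signal, noise level and function name are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
signal = np.sin(np.linspace(0.0, 400.0 * np.pi, n))  # stand-in "clean" trace
noisy_a = signal + rng.normal(0.0, 0.5, n)           # first noisy realisation
noisy_b = signal + rng.normal(0.0, 0.5, n)           # second, independent noise


def best_gain(inputs, targets):
    """Least-squares gain g minimising sum((g * inputs - targets)**2)."""
    return float(inputs @ targets / (inputs @ inputs))


gain_clean_target = best_gain(noisy_a, signal)   # supervised: needs clean data
gain_noisy_target = best_gain(noisy_a, noisy_b)  # N2N-style: noisy target only
print(gain_clean_target, gain_noisy_target)
```

With enough samples the two gains agree closely, even though the second fit never sees the clean trace. If the noise in the two realisations were correlated, the fit would learn to reproduce that noise and the agreement would break down.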

5. The physics-ML spectrum in practice

A modern processing chain typically combines physics-based and ML steps:

Pre-processing denoising: classical f-x deconvolution followed by CNN denoiser for residual random noise.
Trace reconstruction: sparse-Radon for gap interpolation + ML refinement for complex missing-data patterns.
First-break picking: ML network as primary, QC by human interpreter.
Velocity analysis: semblance + ML auto-picker + human QC.
Migration: RTM / Kirchhoff PSDM (pure physics).
FWI: adjoint-state gradient + ML-accelerated starting models and wavelet estimation.
Fault / horizon interpretation: ML as primary, human QC.

6. What ML cannot do in processing

Invent physics. ML extrapolates from training data; it cannot predict scattering for geology unlike anything in its training set.
Guarantee amplitude fidelity. Physics-based amplitude-preserving migration has provable guarantees; ML denoisers can smooth amplitudes in subtle ways that break QI workflows.
Handle out-of-distribution data. A model trained on data with reasonable SNR is likely to fail on near-zero-SNR sections.
Provide uncertainty. Most production ML models output point predictions; Bayesian / ensemble variants provide uncertainty but at high cost.
Replace physical QC. An ML output always needs physics-based QC: forward-model the ML-predicted result through the wave equation and compare the synthetic with the recorded data. If the two do not match, the ML prediction is wrong.
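The physics-based QC in the last item can be sketched as a forward-model-and-compare check. To keep the example short, a 1D convolutional model (reflectivity convolved with a Ricker wavelet) stands in for a wave-equation solver; that substitution, the 0.2 misfit tolerance, and all function names are assumptions for illustration, not a production recipe.

```python
import numpy as np


def ricker(freq_hz, dt, half_len_s=0.1):
    """Zero-phase Ricker wavelet sampled at dt, centred on t = 0."""
    t = np.arange(-half_len_s, half_len_s + dt / 2, dt)
    arg = (np.pi * freq_hz * t) ** 2
    return (1.0 - 2.0 * arg) * np.exp(-arg)


def forward_model(reflectivity, wavelet):
    """Convolutional stand-in for the wave-equation forward model."""
    return np.convolve(reflectivity, wavelet, mode="same")


def normalised_misfit(observed, synthetic):
    """L2 norm of the residual relative to the L2 norm of the data."""
    return float(np.linalg.norm(observed - synthetic) / np.linalg.norm(observed))


def passes_physics_qc(observed, predicted_reflectivity, wavelet, tol=0.2):
    """True when the forward-modelled prediction explains the data within tol."""
    synthetic = forward_model(predicted_reflectivity, wavelet)
    return normalised_misfit(observed, synthetic) <= tol


dt = 0.002
wavelet = ricker(25.0, dt)
true_refl = np.zeros(500)
true_refl[[100, 250, 400]] = [0.2, -0.15, 0.1]

rng = np.random.default_rng(1)
observed = forward_model(true_refl, wavelet) + rng.normal(0.0, 0.002, 500)

good_prediction = true_refl.copy()        # what a well-behaved model might return
bad_prediction = np.roll(true_refl, 10)   # events mis-positioned by 20 ms

print(passes_physics_qc(observed, good_prediction, wavelet))
print(passes_physics_qc(observed, bad_prediction, wavelet))
```

The check needs no labels: it only asks whether the prediction, pushed through the physics, reproduces the recorded data. A suitable tolerance depends on the noise level and would have to be calibrated per survey.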

7. When to invest in ML for your flow

ML is worth the investment when your flow has:

A pattern-recognition task currently consuming significant interpreter time.
A step where the classical method gives only marginal results (denoising, fault picking).
Enough labelled training data or a realistic synthetic strategy.
Capacity to QC ML outputs with physics (so bad ML predictions are caught).

Conversely, do NOT invest in ML for tasks where the physics-based answer is precise and fast (migration, deghosting), or where training data is genuinely impossible to generate.

The one sentence to remember

ML in seismic processing lives in the tasks where physics-based methods are heuristic (denoising, first-break picking, fault labelling) and stays out of the tasks where physics is precise (migration, FWI, deghosting). The four sections that follow unpack this for each dominant ML category.

Where this goes next

§9.2 covers CNN-based denoising — the most widely deployed ML technique in seismic processing today. The widget demonstrates how a U-Net learns to attenuate random and coherent noise from noisy–clean training pairs and generalises to unseen data.
