ML in processing: where it fits
Learning objectives
- Classify processing tasks by whether ML, physics, or a hybrid approach dominates
- Identify the data-quality and label-availability constraints that determine ML success
- Recognise the common ML architectures used in seismic processing and the tasks they fit
- Understand why ML complements rather than replaces physics-based processing
Machine learning has entered every corner of seismic processing over the past decade, but it has not replaced classical processing — it has augmented it. The question for a practitioner is not "should I use ML?" but "which tasks in my flow benefit from ML, which stay better with physics, and where is a hybrid approach the right answer?" This section maps that landscape before the next four sections dive into specific ML methods.
1. The three regimes
- Physics-dominated tasks. Migration, FWI velocity updates, deghosting. These have precise physics (wave equation, reflection coefficients, ghost impulse response) and well-validated numerical methods. ML offers no fundamental advantage here; at best it accelerates inner loops. Do not replace RTM with a neural net — you lose the physics guarantee and gain nothing.
- ML-dominated tasks. First-break picking, fault interpretation, horizon tracking. These are pattern-recognition problems with noisy training data and no clean analytical solution. Classical methods (edge detectors, correlation-based picking) are heuristic; ML (U-Net, transformer) routinely beats them on both accuracy and throughput.
- Hybrid tasks. Velocity analysis, 4D matching, residual moveout. These have decent physics-based baselines but benefit from ML refinement — typically ML-assisted auto-picking of features that a classical semblance-style method identified first.
2. The scorecard
Ten processing tasks scored on two axes: how strong the physics-based approach is, and how much value ML adds. The verdict column reflects current production practice circa 2025. Filter by verdict to see each regime in isolation.
Notice the pattern: tasks with high physics maturity (migration, FWI) are physics-dominated. Tasks with low physics maturity (fault labelling, horizon picking) are ML-dominated. The middle ground — where physics is decent but ML adds measurable value — is where most of the innovation happens.
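The pattern can be written down as a small rule. The sketch below is illustrative only: the 1 to 5 scoring scale, the `margin` threshold, and the three example scores are assumptions made for this example, not the values behind the scorecard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskScore:
    name: str
    physics_maturity: int  # assumed scale: 1 (heuristic) .. 5 (precise, well validated)
    ml_value: int          # assumed scale: 1 (little benefit) .. 5 (large benefit)


def classify(task: TaskScore, margin: int = 2) -> str:
    """Return the regime implied by the gap between the two scores."""
    gap = task.physics_maturity - task.ml_value
    if gap >= margin:
        return "physics-dominated"
    if gap <= -margin:
        return "ML-dominated"
    return "hybrid"


# Hypothetical scores chosen to show one task per regime.
tasks = [
    TaskScore("migration", physics_maturity=5, ml_value=1),
    TaskScore("fault labelling", physics_maturity=2, ml_value=5),
    TaskScore("velocity analysis", physics_maturity=4, ml_value=3),
]

verdicts = {t.name: classify(t) for t in tasks}
for name, verdict in verdicts.items():
    print(f"{name:18s} -> {verdict}")
```

A rule this simple will not reproduce every verdict in practice; it only makes explicit that the verdict follows the gap between the two axes rather than either axis alone.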
3. ML architectures in seismic
- U-Net. The workhorse. Encoder–decoder CNN with skip connections. Used for: denoising, interpolation, first-break picking, fault labelling, horizon picking. Typically 5–10 million parameters, trained on 2D patches.
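- DnCNN / Noise2Noise. CNN-based denoisers. DnCNN is trained with supervision on noisy–clean pairs and predicts the noise residual. Noise2Noise (N2N) drops the clean target and trains on noisy–noisy pairs of the same underlying signal, which matters when clean ground-truth seismic is unavailable.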
- Transformers. Attention-based architectures now used for global-context tasks (fault prediction on large sections, velocity model building). Computationally expensive but superior for long-range correlations.
- Physics-informed neural networks (PINNs). Enforce the wave equation as a loss term during training. Used in emerging FWI acceleration workflows; still at the research stage rather than in routine production use.
- Diffusion models. Recent arrivals; used for generative trace-reconstruction and prior-informed FWI initialisation.
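To make the PINN idea of "the wave equation as a loss term" concrete, here is a minimal NumPy sketch. It assumes a 1D constant-velocity wave equation u_tt = c^2 u_xx and evaluates the residual with finite differences on a grid; a real PINN would evaluate the derivatives of the network output with automatic differentiation instead. The function names, grid sizes, and `weight` parameter are all hypothetical.

```python
import numpy as np


def wave_residual(u, dx, dt, c):
    """Finite-difference residual of u_tt - c^2 u_xx on interior grid points.

    u has shape (nt, nx): rows are time samples, columns are positions.
    """
    u_tt = (u[2:, 1:-1] - 2.0 * u[1:-1, 1:-1] + u[:-2, 1:-1]) / dt**2
    u_xx = (u[1:-1, 2:] - 2.0 * u[1:-1, 1:-1] + u[1:-1, :-2]) / dx**2
    return u_tt - c**2 * u_xx


def pinn_style_loss(u_pred, u_obs, obs_mask, dx, dt, c, weight=1.0):
    """Data misfit at observed points plus a weighted wave-equation penalty."""
    data_term = float(np.mean((u_pred[obs_mask] - u_obs[obs_mask]) ** 2))
    physics_term = float(np.mean(wave_residual(u_pred, dx, dt, c) ** 2))
    return data_term + weight * physics_term, data_term, physics_term


c = 1.0
x = np.linspace(0.0, 1.0, 201)
t = np.linspace(0.0, 0.5, 201)
dx, dt = x[1] - x[0], t[1] - t[0]
X, T = np.meshgrid(x, t)  # both have shape (nt, nx)


def pulse(s):
    """Gaussian pulse centred at s = 0.3."""
    return np.exp(-((s - 0.3) / 0.05) ** 2)


# "Observed" wavefield: a pulse travelling at speed c, recorded at 5 positions.
u_obs = pulse(X - c * T)
obs_mask = np.zeros_like(u_obs, dtype=bool)
obs_mask[:, ::50] = True

u_consistent = pulse(X - c * T)          # obeys the wave equation for speed c
u_inconsistent = pulse(X - 0.5 * c * T)  # travels at the wrong speed

total_ok, data_ok, phys_ok = pinn_style_loss(u_consistent, u_obs, obs_mask, dx, dt, c)
total_bad, data_bad, phys_bad = pinn_style_loss(u_inconsistent, u_obs, obs_mask, dx, dt, c)
print(f"consistent:   data={data_ok:.3e} physics={phys_ok:.3e}")
print(f"inconsistent: data={data_bad:.3e} physics={phys_bad:.3e}")
```

The physics term penalises the wavefield everywhere on the grid, not only at the receivers, which is what lets the wave equation constrain the solution between sparse observations.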
4. The training-data problem
The single largest determinant of ML success in seismic processing is training data. Key constraints:
- Labelled seismic is scarce and expensive. Human interpreters label faults and horizons one line at a time.
- Proprietary surveys. Most production seismic cannot be shared, limiting public training sets.
- Domain shift. A model trained on North Sea data may underperform on Gulf of Mexico data because wavelets, noise signatures, and geology differ.
- Synthetic training + real-data fine-tuning. Standard workaround: generate millions of synthetic examples with known ground truth, pre-train on those, then fine-tune with a small labelled real subset.
- Self-supervised learning. Noise2Noise and related approaches avoid needing clean ground truth entirely; train from pairs of noisy realisations of the same underlying signal.
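The last two workarounds can be illustrated together on a toy problem. The sketch below generates a synthetic trace with known ground truth (sparse reflectivity convolved with a Ricker wavelet), then fits a denoising filter twice: once against the clean trace and once against a second noisy realisation, Noise2Noise style. A linear least-squares filter stands in for the CNN so the example stays small; the wavelet frequency, noise level, filter length, and in-sample evaluation are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)


def ricker(f_peak, dt, length):
    """Ricker wavelet with peak frequency f_peak (Hz), centred in the window."""
    t = (np.arange(length) - length // 2) * dt
    arg = (np.pi * f_peak * t) ** 2
    return (1.0 - 2.0 * arg) * np.exp(-arg)


def synthetic_trace(n, wavelet):
    """Sparse random reflectivity convolved with the wavelet (known ground truth)."""
    refl = np.zeros(n)
    idx = rng.choice(n, size=n // 25, replace=False)
    refl[idx] = rng.normal(size=idx.size)
    return np.convolve(refl, wavelet, mode="same")


def windows(trace, half):
    """One row per sample, holding the 2*half+1 samples centred on it."""
    padded = np.pad(trace, half)
    return np.lib.stride_tricks.sliding_window_view(padded, 2 * half + 1)


n, sigma = 20000, 0.3
clean = synthetic_trace(n, ricker(30.0, 0.002, 81))
noisy_a = clean + sigma * rng.normal(size=n)  # network input
noisy_b = clean + sigma * rng.normal(size=n)  # independent second realisation

X = windows(noisy_a, half=10)

# Supervised: the target is the clean trace.
w_sup = np.linalg.lstsq(X, clean, rcond=None)[0]
# Noise2Noise: the target is another noisy copy; the clean trace is never used.
w_n2n = np.linalg.lstsq(X, noisy_b, rcond=None)[0]

err_noisy = float(np.mean((noisy_a - clean) ** 2))
err_sup = float(np.mean((X @ w_sup - clean) ** 2))
err_n2n = float(np.mean((X @ w_n2n - clean) ** 2))
print(f"input MSE={err_noisy:.4f} supervised={err_sup:.4f} noise2noise={err_n2n:.4f}")
```

Because the noise in the second realisation is zero-mean and independent of the input, fitting to the noisy target leads to nearly the same filter as fitting to the clean one. The same argument is what Noise2Noise relies on for neural networks, given enough training pairs.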
5. The physics-ML spectrum in practice
A modern processing chain typically combines physics-based and ML steps:
- Pre-processing denoising: classical f-x deconvolution followed by CNN denoiser for residual random noise.
- Trace reconstruction: sparse-Radon for gap interpolation + ML refinement for complex missing-data patterns.
- First-break picking: ML network as primary, QC by human interpreter.
- Velocity analysis: semblance + ML auto-picker + human QC.
- Migration: RTM / Kirchhoff PSDM (pure physics).
- FWI: adjoint-state gradient + ML-accelerated starting models and wavelet estimation.
- Fault / horizon interpretation: ML as primary, human QC.
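One way to keep track of such a chain is to record, for each step, which component is physics-based and which is ML. The sketch below encodes the list above as plain data; the `Step` class and its field names are hypothetical, and `human_qc_listed` only records whether the list above mentions human QC for that step.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    task: str
    physics: str           # physics-based component, "" if none
    ml: str                # ML component, "" if none
    human_qc_listed: bool  # True if the list above names human QC for this step

    @property
    def regime(self) -> str:
        if self.physics and self.ml:
            return "hybrid"
        return "physics" if self.physics else "ml"


CHAIN = [
    Step("denoising", "f-x deconvolution", "CNN denoiser", False),
    Step("trace reconstruction", "sparse Radon", "ML refinement", False),
    Step("first-break picking", "", "ML picker", True),
    Step("velocity analysis", "semblance", "ML auto-picker", True),
    Step("migration", "RTM / Kirchhoff PSDM", "", False),
    Step("FWI", "adjoint-state gradient", "ML starting model and wavelet estimate", False),
    Step("fault / horizon interpretation", "", "ML interpreter", True),
]

counts = Counter(step.regime for step in CHAIN)
for step in CHAIN:
    print(f"{step.task:32s} {step.regime}")
print(dict(counts))
```

Tabulated this way, most steps in the chain come out as hybrid, with migration as the only pure-physics step.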
6. What ML cannot do in processing
- Invent physics. ML extrapolates from training data; it cannot predict scattering for geology unlike anything in its training set.
- Guarantee amplitude fidelity. Physics-based amplitude-preserving migration has provable guarantees; ML denoisers can smooth amplitudes in subtle ways that break QI workflows.
- Handle out-of-distribution data. A model trained on reasonable-SNR data is likely to fail on near-zero-SNR sections.
- Provide uncertainty. Most production ML models output point predictions; Bayesian / ensemble variants provide uncertainty but at high cost.
- Replace physical QC. An ML output always needs physics-based QC: forward-model the ML-predicted model back through the wave equation and compare the result to the data. If the synthetics do not match the data, the ML prediction is wrong.
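The physics-based QC in the last point can be sketched as a forward-model-and-compare check. To keep the example short, a 1D convolutional model (reflectivity convolved with a wavelet) is swapped in for a full wave-equation solver; the misfit threshold, the wavelet, and the two "predictions" are assumptions made for illustration.

```python
import numpy as np


def ricker(f_peak, dt, length):
    """Ricker wavelet with peak frequency f_peak (Hz), centred in the window."""
    t = (np.arange(length) - length // 2) * dt
    arg = (np.pi * f_peak * t) ** 2
    return (1.0 - 2.0 * arg) * np.exp(-arg)


def forward_model(reflectivity, wavelet):
    """Simplified physics: convolutional model standing in for wave propagation."""
    return np.convolve(reflectivity, wavelet, mode="same")


def normalised_misfit(predicted_model, wavelet, observed):
    """Relative L2 difference between observed data and re-modelled synthetics."""
    synthetic = forward_model(predicted_model, wavelet)
    return float(np.linalg.norm(observed - synthetic) / np.linalg.norm(observed))


rng = np.random.default_rng(1)
wavelet = ricker(30.0, 0.002, 81)

true_refl = np.zeros(500)
true_refl[[80, 200, 330, 420]] = [1.0, -0.6, 0.8, -0.4]
observed = forward_model(true_refl, wavelet) + 0.01 * rng.normal(size=500)

good_prediction = true_refl.copy()        # stands in for a correct ML output
bad_prediction = np.roll(true_refl, 15)   # reflectors mispositioned by 15 samples

THRESHOLD = 0.2  # hypothetical acceptance level; tune per dataset
misfit_good = normalised_misfit(good_prediction, wavelet, observed)
misfit_bad = normalised_misfit(bad_prediction, wavelet, observed)
for label, m in [("good", misfit_good), ("bad", misfit_bad)]:
    print(f"{label}: misfit={m:.3f} -> {'accept' if m < THRESHOLD else 'reject'}")
```

The check needs no labels and no knowledge of how the prediction was produced, which is why it works as a safety net behind any ML step that outputs a model.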
7. When to invest in ML for your flow
ML is worth the investment when your flow has:
- A pattern-recognition task currently consuming significant interpreter time.
- A step where the classical method gives only marginal results (denoising, fault picking).
- Enough labelled training data or a realistic synthetic strategy.
- Capacity to QC ML outputs with physics (so bad ML predictions are caught).
Conversely, do NOT invest in ML for tasks where the physics-based answer is precise and fast (migration, deghosting), or where training data is genuinely impossible to generate.
ML in seismic processing lives in the tasks where physics-based methods are heuristic (denoising, first-break picking, fault labelling) and stays out of the core of the tasks where physics is precise (migration, FWI velocity updates, deghosting). The four sections that follow unpack this for each dominant ML category.
Where this goes next
§9.2 covers CNN-based denoising — the most widely deployed ML technique in seismic processing today. The widget demonstrates how a U-Net learns to attenuate random and coherent noise from noisy–clean training pairs and generalises to unseen data.