Fault-Detection Dataset
Learning objectives
- Read a real modelling brief and choose an engine
- Generate labelled (section, mask) training pairs
- See why fault detection is a geometric task
- Justify convolution on cost and fitness
The Brief
Part 10 is five capstones, each a short brief you answer by choosing an engine and defending it. Here is the first. A machine-learning team wants a network that picks faults automatically. To train it they need thousands of seismic sections, each paired with a label mask that marks exactly where the faults are. Field data comes with no such labels, and hand-picking faults on thousands of sections is out of the question. They ask you to build a synthetic training set.
The Build
Generate faulted earth models, convolve each into a synthetic section, and emit the fault plane as a binary mask. Every model yields a complete training example: the section is the input the network sees, and the mask is the answer it must learn to produce. The label is free because you drew the fault yourself. Vary the number of faults and their throw, then regenerate, to get the endless variety a network needs to generalise.
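The build described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the course's own generator: the names `ricker` and `make_example`, the grid size, the impedance range, the 30 Hz wavelet, and the 2 ms sample interval are all hypothetical choices. It also assumes the vertical axis is already in time samples, treats each fault as a straight dipping line with constant throw, and draws the mask one pixel wide.

```python
import numpy as np


def ricker(f_peak, dt, half_samples=32):
    """Zero-phase Ricker wavelet, 2 * half_samples + 1 samples long."""
    t = np.arange(-half_samples, half_samples + 1) * dt
    a = (np.pi * f_peak * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)


def make_example(rng, nz=128, nx=128, n_layers=12, n_faults=1,
                 max_throw=12, dt=0.002, f_peak=30.0):
    """Return one (section, mask) pair. All defaults are illustrative."""
    z = np.arange(nz)

    # Flat layered model: piecewise-constant impedance down the vertical axis.
    tops = np.sort(rng.choice(np.arange(1, nz), size=n_layers - 1, replace=False))
    layer_id = np.searchsorted(tops, z, side="right")
    layer_imp = rng.uniform(4.0e6, 9.0e6, n_layers)  # assumed impedance range
    impedance = np.tile(layer_imp[layer_id][:, None], (1, nx))
    mask = np.zeros((nz, nx), dtype=np.uint8)

    for _ in range(n_faults):
        x0 = rng.uniform(0.25 * nx, 0.75 * nx)       # fault position at the top
        slope = rng.uniform(-0.25, 0.25)             # traces per vertical sample
        throw = int(rng.integers(3, max_throw + 1))  # vertical offset in samples
        x_fault = x0 + slope * z
        downthrown = np.arange(nx)[None, :] > x_fault[:, None]

        # Slide the block right of the fault down by `throw` samples,
        # refilling the vacated top rows with the original top row.
        moved = np.roll(impedance, throw, axis=0)
        moved[:throw] = impedance[0]
        impedance = np.where(downthrown, moved, impedance)

        # Earlier fault labels ride along with the block they sit in.
        moved_mask = np.roll(mask, throw, axis=0)
        moved_mask[:throw] = 0
        mask = np.where(downthrown, moved_mask, mask)

        # Label the new fault trace, keeping only in-bounds pixels.
        cols = np.rint(x_fault).astype(int)
        ok = (cols >= 0) & (cols < nx)
        mask[z[ok], cols[ok]] = 1

    # Convolutional model: reflectivity from impedance contrasts,
    # then each trace convolved with the wavelet.
    refl = np.zeros_like(impedance)
    refl[1:] = (impedance[1:] - impedance[:-1]) / (impedance[1:] + impedance[:-1])
    wavelet = ricker(f_peak, dt)
    section = np.apply_along_axis(np.convolve, 0, refl, wavelet, mode="same")
    return section.astype(np.float32), mask
```

Calling `make_example(np.random.default_rng(0))` returns a float32 section and a uint8 mask of the same shape. Looping over seeds, or raising `n_faults` and `max_throw`, gives the variety the brief asks for.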
The Debrief
Which engine? Convolution, decisively. A fault is a geometric discontinuity, and the label marks that geometry, not the diffraction physics a wave-equation solver would add. The network learns the break in the reflectors, which convolution renders perfectly well. And it renders it in milliseconds, so you can produce the tens of thousands of models a network needs. A finite-difference engine would be more realistic and about a hundred times slower, buying accuracy the task never touches.
This is the fit-for-purpose thesis of the whole course made concrete on a real deliverable: the cheapest engine that carries the label is the right one. The next capstone flips the answer. It presents a subtle stratigraphic trap where convolution quietly lies, and only the wave equation tells the truth.