Reformatting & geometry QC
Learning objectives
- Describe the standard SEG-Y → internal-format reformatting workflow and what can go wrong
- Recognize the fold-map signature of the four most common geometry bugs
- Run the ten-minute QC checklist every processor applies to a new dataset
- Understand why fixing geometry at Day 1 costs a morning, and why finding it at Day 30 costs the project
Part 2 is the chain of operations between “data exists on a hard drive” and “velocity analysis can start.” None of it is glamorous. All of it is essential. The first step is the un-sexiest of all: reformatting — converting the vendor-delivered SEG-Y into the processing system’s internal format — and its conjoined twin geometry QC, where you verify that every trace is what its header says it is.
1. Reformatting in ten bullet points
- Read SEG-Y files from the vendor (often multi-terabyte).
- Parse the 3200-byte textual header — document any peculiarities.
- Parse the 400-byte binary header — extract sample rate, trace count, format code (IBM floats, IEEE, 16-bit int, etc.).
- For each trace, parse its 240-byte trace header plus N samples.
- Apply the coordinate scalar from bytes 71–72: if +100 or +1000, multiply; if −100 or −1000, divide.
- Handle byte-order correctly — SEG-Y spec says big-endian, but many legacy writers emit little-endian.
- Re-pack into the internal format (PROMAX, Petrel, OpenCPS, Madagascar, SeisSpace, etc.).
- Check trace count against the vendor report — a mismatch means traces were lost, duplicated, or the format spec was wrong.
- Sanity-check sample rate, sample count, and record length — every downstream operator uses these.
- Output a first geometry QC report — fold map, shot map, receiver map, histograms of trace header values.
2. Six fold maps to recognize at a glance
The fold map is the single most informative diagnostic. If you can only display one picture after reformatting, make it a fold map and compare it to the survey-design planned fold. The widget below shows the same synthetic survey under six different header-loading scenarios — study the signatures until you can name them from across the room.
3. What to look for
- Smooth trapezoidal plateau — geometry loaded cleanly. Max fold matches survey design. Edge ramps equal half the spread length. Proceed.
- Everything piled near origin — coordinate scalar applied in the wrong direction. Multiply or divide coordinates by the scalar value (100 or 1000) and re-load.
- Fold map shifted from survey plan — CMP field in header is stale. Re-compute CMP from shot/receiver X-Y in the loader, discard stored CMP.
- Random-looking speckle — byte-order wrong. Flip endian interpretation of the coordinate integers.
- Stripe or hole in fold — traces missing for a subset of shots/receivers. Check against the vendor report; re-request the missing SEG-Y file.
- Unusual offset histogram — shot/receiver fields might be swapped. Display a |x_r − x_s| histogram; negative or implausibly large offsets are a red flag.
4. The ten-minute QC checklist
Run these in order on every new dataset. Stop at the first failure and fix before continuing.
- Fold map matches survey-design planned fold.
- Shot map shows every planned shot line and spacing.
- Receiver map shows every planned receiver line and spacing.
- Offset histogram peaks within the planned offset range; no negative or huge values.
- Azimuth rose matches the survey design.
- Sample rate matches the vendor report (e.g. 2 ms).
- Trace length matches the vendor report (e.g. 6 s).
- Total trace count matches the vendor report ± 0.1 %.
- First-break time range is plausible for the target depths.
- A sample shot gather shows primaries at expected t₀ values.
Only after all ten pass do you proceed to any further processing. Every seismic project that went sideways in production started with a QC item that was silently skipped.
5. Why the investment pays
Every minute spent here is a day saved later. A scalar bug that goes undiagnosed for two weeks creates two weeks of processing products on the wrong geometry — velocity picks, statics, migrations — all of which need to be redone. A missing receiver line noticed in geometry QC takes a phone call; noticed at interpretation, it takes a reshoot.
A fold map is the cheapest and most informative seismic plot you can make; run the ten-point QC checklist before any downstream processing, every time.
Where this goes next
Section §2.2 starts the actual waveform processing. Trace editing removes bad samples and dead traces; amplitude recovery compensates for the known physical decay of wave amplitudes with travel time. Both are simple operations that every later step depends on.
References
- Sheriff, R. E., Geldart, L. P. (1995). Exploration Seismology (2nd ed.). Cambridge UP.
- Yilmaz, Ö. (2001). Seismic Data Analysis (2 vols.). SEG.
- Claerbout, J. F. (1976). Fundamentals of Geophysical Data Processing. McGraw-Hill.