Linear algebra primer
Learning objectives
- Read a matrix as a linear function from vectors to vectors
- Connect determinant, trace, eigenvalues, and singular values to geometric facts about the transformation
- Set up and solve a least-squares problem with the normal equations and explain why regularization is usually needed
- Recognize why migration, tomography, inversion, and FWI all reduce to some flavor of Ax = b
Beyond filtering, almost every processing step is linear algebra wearing a costume. Migration is a projection. Tomography is least-squares. Inversion is a regularized solve. FWI is an outer loop of nonlinear optimization wrapped around an inner linear solve. If you cannot read Ax = b as a picture, the algorithms will look like black magic.
1. A matrix is a function
Forget rows and columns for a moment. A 2×2 matrix M = [v₁ v₂] is a rule that takes a vector x = (a, b) and returns a·v₁ + b·v₂. The matrix is its two columns; everything else is bookkeeping.
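A minimal sketch of the column picture in plain Python. The helper names `apply` and `columns` and the sample numbers are illustrative assumptions, not part of the course material; the point is only that the row-by-row product and the "a copies of v₁ plus b copies of v₂" recipe give the same vector.

```python
def apply(M, x):
    """Apply a 2x2 matrix M, stored as two rows, to a vector x = (a, b)."""
    (m11, m12), (m21, m22) = M
    a, b = x
    return (m11 * a + m12 * b, m21 * a + m22 * b)


def columns(M):
    """Return the two columns v1, v2 of a 2x2 matrix stored as rows."""
    (m11, m12), (m21, m22) = M
    return (m11, m21), (m12, m22)


M = ((2.0, 1.0),
     (0.0, 3.0))
v1, v2 = columns(M)            # v1 = (2, 0), v2 = (1, 3)

x = (4.0, -1.0)
a, b = x

# Row view: the usual "row times column" bookkeeping.
by_rows = apply(M, x)

# Column view: a copies of v1 plus b copies of v2.
by_columns = (a * v1[0] + b * v2[0], a * v1[1] + b * v2[1])

print(by_rows, by_columns)     # both are (7.0, -3.0)

# The columns are the images of the basis vectors (1, 0) and (0, 1).
print(apply(M, (1.0, 0.0)) == v1, apply(M, (0.0, 1.0)) == v2)
```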
Drag the red and green arrows. Those are the images of î = (1, 0) and ĵ = (0, 1). The grid, the unit square, and the yellow circle are all transformed by the same matrix. Notice what each motion changes: the determinant, the trace, the eigenvalues.
2. Determinant
Write the entries of M as rows (a, b) and (c, d), so that v₁ = (a, c) and v₂ = (b, d). The determinant det M = ad − bc is the signed area of the transformed unit square. Consequences:
- |det M| is how much the matrix scales areas (in 2D) or volumes (in 3D).
- The sign of det M is positive if the matrix preserves orientation, negative if it reflects.
- det M = 0 means the image collapses to a line (or point). The matrix is singular and cannot be inverted. Try the “Singular!” preset — both columns line up.
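The three consequences can be checked numerically. The sketch below, with illustrative helper names and example matrices chosen here, compares ad − bc against the signed area of the image of the unit square computed independently with the shoelace formula.

```python
def det2(M):
    """Determinant ad - bc of a 2x2 matrix stored as rows (a, b), (c, d)."""
    (a, b), (c, d) = M
    return a * d - b * c


def image_of_unit_square(M):
    """Corners of the unit square after applying M, in traversal order."""
    (a, b), (c, d) = M
    v1, v2 = (a, c), (b, d)    # images of (1, 0) and (0, 1)
    return [(0.0, 0.0), v1, (v1[0] + v2[0], v1[1] + v2[1]), v2]


def signed_area(polygon):
    """Shoelace formula: positive for counter-clockwise corner order."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(polygon, polygon[1:] + polygon[:1]):
        total += x0 * y1 - x1 * y0
    return total / 2.0


examples = {
    "stretch": ((2.0, 0.0), (0.0, 3.0)),     # scales area by 6
    "shear": ((1.0, 1.0), (0.0, 1.0)),       # preserves area
    "reflection": ((0.0, 1.0), (1.0, 0.0)),  # flips orientation
    "singular": ((1.0, 2.0), (2.0, 4.0)),    # columns are parallel
}

for name, M in examples.items():
    print(name, det2(M), signed_area(image_of_unit_square(M)))
```

The reflection comes out negative because the image corners are traversed clockwise, and the singular case gives zero area because the square has collapsed onto a line.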
3. Eigenvalues and eigenvectors
An eigenvector v of M is a nonzero direction that survives the transformation up to scaling: Mv = λv. The scalar λ is the eigenvalue. The eigenvectors are the axes along which the matrix is pure stretching; the eigenvalues tell you by how much.
For a 2×2 matrix, the eigenvalues come from the characteristic polynomial:
det(M − λI) = λ² − (tr M) λ + det M = 0,
so λ = (tr M ± √((tr M)² − 4 det M)) / 2. When the discriminant (tr M)² − 4 det M is negative the eigenvalues are complex: the matrix is a rotation-plus-scale and has no real eigenvector axes.
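As a sketch, the 2×2 eigenvalues follow from the trace and determinant through the quadratic formula λ = (tr M ± √((tr M)² − 4 det M)) / 2. The function name and test matrices are illustrative choices; `cmath.sqrt` is used so that a negative discriminant yields the complex pair rather than an error.

```python
import cmath


def eigenvalues_2x2(M):
    """Roots of lambda^2 - (tr M) lambda + det M = 0 for M stored as rows."""
    (a, b), (c, d) = M
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4.0 * det)   # complex if discriminant < 0
    return (trace + root) / 2.0, (trace - root) / 2.0


stretch = ((2.0, 1.0), (1.0, 2.0))     # symmetric: real eigenvalues 3 and 1
rotation = ((0.0, -1.0), (1.0, 0.0))   # quarter turn: eigenvalues +i and -i

for name, M in (("stretch", stretch), ("rotation", rotation)):
    lam1, lam2 = eigenvalues_2x2(M)
    (a, b), (c, d) = M
    # The eigenvalues always sum to the trace and multiply to the determinant.
    print(name, lam1, lam2, lam1 + lam2, a + d, lam1 * lam2, a * d - b * c)
```

For the symmetric example, v = (1, 1) is an eigenvector with λ = 3, since M v = (3, 3). The quarter turn moves every real direction off itself, which is what the complex pair is reporting.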
4. Singular values and conditioning
The singular values σ₁ ≥ σ₂ ≥ … are the square roots of the eigenvalues of MᵀM. They are the semi-axis lengths of the ellipse the matrix produces from the unit circle. The condition number is
κ = σ_max / σ_min.
If κ is large, the inverse problem is ill-conditioned: tiny changes in b produce huge changes in the solution x. Every inversion we will meet later has a condition number, and regularization is the tool for taming it.
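A small sketch of what a large κ = σ_max / σ_min does in practice. It is restricted to 2×2 matrices so everything stays in closed form; the function names and the nearly singular example matrix are assumptions made for illustration. A change of 10⁻⁴ in one entry of b moves the solution by more than 1.

```python
import math


def singular_values_2x2(M):
    """Singular values (largest first) of a 2x2 matrix stored as rows."""
    (a, b), (c, d) = M
    # Entries of the symmetric matrix M^T M = [[p, q], [q, r]].
    p = a * a + c * c
    q = a * b + c * d
    r = b * b + d * d
    # Largest eigenvalue of M^T M from its characteristic polynomial.
    lam_max = (p + r) / 2.0 + math.hypot((p - r) / 2.0, q)
    s_max = math.sqrt(lam_max)
    # sigma_max * sigma_min = |det M|; avoids subtracting nearly equal numbers.
    s_min = abs(a * d - b * c) / s_max if s_max > 0.0 else 0.0
    return s_max, s_min


def condition_number(M):
    s_max, s_min = singular_values_2x2(M)
    return math.inf if s_min == 0.0 else s_max / s_min


def solve_2x2(M, rhs):
    """Solve M x = rhs by Cramer's rule (fine for a 2x2 illustration)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return ((d * rhs[0] - b * rhs[1]) / det, (a * rhs[1] - c * rhs[0]) / det)


well = ((3.0, 0.0), (0.0, 0.5))      # ellipse semi-axes 3 and 0.5
ill = ((1.0, 1.0), (1.0, 1.0001))    # columns almost parallel

print(singular_values_2x2(well), condition_number(well))
print(condition_number(ill))         # roughly 4e4

x_clean = solve_2x2(ill, (2.0, 2.0001))   # close to (1, 1)
x_noisy = solve_2x2(ill, (2.0, 2.0002))   # close to (0, 2)
shift = math.dist(x_clean, x_noisy)
print(x_clean, x_noisy, shift)
```

The ratio of the shift in x to the shift in b is bounded by 1/σ_min, which is why the small singular value, not the large one, decides how badly noise is amplified.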
5. Systems of equations: Ax = b
Given a known mapping A and an observation b, find the unknown x. Three cases:
- Square, invertible. x = A⁻¹ b. Rare in the real world; never invert explicitly, solve instead (LU, QR, Cholesky).
- Overdetermined (more equations than unknowns). No exact solution in general. Use least-squares: minimize ‖Ax − b‖₂². The solution satisfies the normal equations AᵀA x = Aᵀ b. This is the basic form of tomography, velocity picking, and inversion.
- Underdetermined (fewer equations than unknowns). Infinitely many solutions; you must pick one by adding a penalty like ‖x‖₂² (Tikhonov) or ‖x‖₁ (sparsity). Migration is often underdetermined; every migration algorithm implicitly regularizes.
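The three cases can be sketched on toy systems in plain Python. Everything here is an illustrative assumption: the helper names, the sample data, and the choice of Gaussian elimination with partial pivoting as the solver. For the underdetermined case the sketch picks the smallest-norm solution x = Aᵀ(AAᵀ)⁻¹b, which is the limit of the Tikhonov-penalized solution as the penalty weight goes to zero. In production you would reach for a numerical library rather than these hand-rolled routines.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]


def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def matmul(A, B):
    cols = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def solve(A, b):
    """Solve a square, nonsingular system by elimination with partial pivoting."""
    n = len(A)
    aug = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


# 1. Square, invertible: solve directly, no explicit inverse.
x_square = solve([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0])   # close to (0.8, 1.4)

# 2. Overdetermined: fit d = slope * t + intercept through four points.
t = [0.0, 1.0, 2.0, 3.0]
d = [1.1, 2.9, 5.2, 6.8]
A_over = [[ti, 1.0] for ti in t]
At = transpose(A_over)
x_ls = solve(matmul(At, A_over), matvec(At, d))          # normal equations
residual = [di - pi for di, pi in zip(d, matvec(A_over, x_ls))]

# 3. Underdetermined: one equation x1 + x2 = 2, pick the smallest-norm x.
A_under = [[1.0, 1.0]]
At_u = transpose(A_under)
x_min = matvec(At_u, solve(matmul(A_under, At_u), [2.0]))

print(x_square)
print(x_ls, matvec(At, residual))   # A^T r is (numerically) zero at the optimum
print(x_min)
```

The second print shows the geometric content of the normal equations: at the least-squares solution the residual is orthogonal to every column of A.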
6. The least-squares normal equations in one line
x = (AᵀA)⁻¹ Aᵀ b
That one equation sits underneath velocity-analysis semblance picking, residual statics solves, surface-consistent decomposition, model-based inversion, and the inner loop of every iterative imaging algorithm. When AᵀA is nearly singular, you regularize:
x = (AᵀA + εI)⁻¹ Aᵀ b
The term εI lifts the small eigenvalues of AᵀA (the squared singular values of A) off the floor so the inverse exists. Tuning ε is the eternal art of inversion.
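A sketch of the effect of ε on a deliberately ill-conditioned toy problem. The matrix, the data, and the value ε = 0.01 are arbitrary illustrative choices, and the solver is hard-wired to two unknowns so that (AᵀA + εI) x = Aᵀb can be written out in closed form. Without regularization, noise of ±0.01 in b moves the solution from about (1, 1) to about (0, 2); with it, the two answers nearly coincide.

```python
import math


def normal_equations(A, b, eps=0.0):
    """Solve (A^T A + eps I) x = A^T b for a matrix A with exactly two columns."""
    p = sum(row[0] * row[0] for row in A) + eps
    q = sum(row[0] * row[1] for row in A)
    s = sum(row[1] * row[1] for row in A) + eps
    g0 = sum(row[0] * bi for row, bi in zip(A, b))
    g1 = sum(row[1] * bi for row, bi in zip(A, b))
    det = p * s - q * q          # tiny when the columns are nearly parallel
    return ((s * g0 - q * g1) / det, (p * g1 - q * g0) / det)


# Two nearly parallel columns, so A^T A is nearly singular.
A = [(1.0, 1.00),
     (1.0, 1.01),
     (1.0, 0.99)]
b_clean = [2.00, 2.01, 1.99]     # A applied to x = (1, 1)
b_noisy = [2.00, 2.02, 1.98]     # same data with +/- 0.01 of noise

shift = {}
for eps in (0.0, 0.01):
    x_clean = normal_equations(A, b_clean, eps)
    x_noisy = normal_equations(A, b_noisy, eps)
    shift[eps] = math.dist(x_clean, x_noisy)
    print(eps, x_clean, x_noisy, shift[eps])
```

The price is a small bias: the regularized answer for clean data is slightly shrunk away from (1, 1). Choosing ε is a trade between that bias and the noise amplification it suppresses.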
7. Why this is the workhorse for processing
- Tomographic velocity inversion. A is the ray-path matrix; x is the slowness update; b is the travel-time residuals. Least-squares with smoothing regularization, solved by conjugate gradients because AᵀA is huge but sparse.
- Post-stack inversion for impedance. A is the wavelet-convolution matrix; x is the impedance perturbation; b is the observed seismic trace. Same equations, different names.
- FWI. The inner linear system is an adjoint-state gradient solve; the outer loop is nonlinear. Linear algebra is in the hot core; calculus on matrices is the outer shell.
- Migration (LS-migration). Matching observed data to predicted data from a reflectivity model — least-squares on a linear forward operator.
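The tomography bullet mentions conjugate gradients; a minimal sketch of that idea follows. It runs textbook conjugate gradients on (AᵀA + εI) x = Aᵀb while only ever applying A and Aᵀ to vectors, so AᵀA is never formed. The tiny "ray-path" matrix, the slowness values, and all function names are hypothetical and exist only to make the loop runnable; a real solver would work on sparse operators and add smoothing rather than a plain εI.

```python
def matvec(A, x):
    """Apply A to x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def rmatvec(A, y):
    """Apply A^T to y without building the transpose."""
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(len(A[0]))]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cg_normal_equations(A, b, eps=0.0, iters=50, tol=1e-20):
    """Conjugate gradients on (A^T A + eps I) x = A^T b, starting from x = 0."""
    x = [0.0] * len(A[0])
    r = rmatvec(A, b)            # residual of the normal equations at x = 0
    p = r[:]
    rs = dot(r, r)
    for _ in range(iters):
        if rs <= tol:            # squared residual norm small enough
            break
        Ap = [v + eps * pi for v, pi in zip(rmatvec(A, matvec(A, p)), p)]
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * v for ri, v in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


# Hypothetical toy geometry: 4 rays crossing 3 cells, entries are path lengths.
A = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0],
     [1.0, 0.0, 1.0],
     [1.0, 1.0, 1.0]]
x_true = [0.10, -0.20, 0.05]     # slowness update used to synthesize the data
b = matvec(A, x_true)            # noise-free travel-time residuals

x_plain = cg_normal_equations(A, b)
x_damped = cg_normal_equations(A, b, eps=1.0)
print(x_plain)                   # recovers x_true up to rounding
print(x_damped)                  # shrunk toward zero by the damping
```

Each iteration costs one product with A and one with Aᵀ, which is what makes the method practical when the matrix is too large to factor.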
A matrix is a function on vectors; Ax = b with regularization is the universal template for every inverse problem in processing; the condition number tells you how much noise amplification to expect.
Where this goes next
Section 0.8 brings in the other piece of every inverse problem: a model of what the noise does. You cannot set the right ε without a probabilistic framing, so random variables and noise statistics come next.