Linear algebra primer

Processing Prerequisites

Learning objectives

Read a matrix as a linear function from vectors to vectors
Connect determinant, trace, eigenvalues, and singular values to geometric facts about the transformation
Set up and solve a least-squares problem with the normal equations and explain why regularization is usually needed
Recognize why migration, tomography, inversion, and FWI all reduce to some flavor of Ax = b

Beyond filtering, almost every processing step is linear algebra wearing a costume. Migration is a projection. Tomography is least-squares. Inversion is a regularized solve. FWI is an outer loop of nonlinear optimization wrapped around an inner linear solve. If you cannot read Ax = b as a picture, the algorithms will look like black magic.

1. A matrix is a function

Forget rows and columns for a moment. A 2×2 matrix M = [v₁ v₂] is a rule that takes a vector x = (x₁, x₂) and returns x₁·v₁ + x₂·v₂. The matrix is its two columns; everything else is bookkeeping.
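As a minimal sketch of that rule, the plain-Python snippet below computes Mx two ways: row by row, and as a weighted sum of the two columns. The function names `apply_matrix` and `column_combination`, the rows-of-tuples storage, and the sample numbers are illustrative assumptions, not part of the lesson's own material.

```python
# Illustrative sketch: a 2x2 matrix stored as two rows, ((m11, m12), (m21, m22)).

def apply_matrix(M, x):
    """Row view: each output component is a row of M dotted with x."""
    (m11, m12), (m21, m22) = M
    return (m11 * x[0] + m12 * x[1], m21 * x[0] + m22 * x[1])


def column_combination(M, x):
    """Column view: x[0] times the first column plus x[1] times the second."""
    (m11, m12), (m21, m22) = M
    v1 = (m11, m21)  # image of i-hat = (1, 0)
    v2 = (m12, m22)  # image of j-hat = (0, 1)
    return (x[0] * v1[0] + x[1] * v2[0], x[0] * v1[1] + x[1] * v2[1])


M = ((2.0, 1.0), (0.0, 3.0))  # hypothetical example matrix
x = (4.0, -1.0)

print(apply_matrix(M, x))            # row view
print(column_combination(M, x))      # column view gives the same vector
print(apply_matrix(M, (1.0, 0.0)))   # reproduces the first column of M
print(apply_matrix(M, (0.0, 1.0)))   # reproduces the second column of M
```

Feeding in the two basis vectors returns the columns themselves, which is the sense in which the matrix "is" its columns.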

Drag the red and green arrows. Those are the images of î = (1, 0) and ĵ = (0, 1). The grid, the unit square, and the yellow circle are all transformed by the same matrix. Notice what each motion changes: the determinant, the trace, the eigenvalues.

2. Determinant

For a matrix M with rows (a, b) and (c, d), the determinant det M = ad − bc is the signed area of the transformed unit square. Consequences:

$|\det M|$ is how much the matrix scales areas (in 2D) or volumes (in 3D).
The sign of det M is positive if the matrix preserves orientation, negative if it reflects.
det M = 0 means the image collapses to a line (or point). The matrix is singular and cannot be inverted. Try the “Singular!” preset — both columns line up.
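The three consequences can be checked numerically. The sketch below, under the assumption that a matrix is stored as two rows of plain Python floats, compares ad − bc with the area of the transformed unit square measured independently by the shoelace formula. The three example matrices and all function names are hypothetical choices made for illustration.

```python
def det2(M):
    """Determinant of a 2x2 matrix stored as rows ((a, b), (c, d))."""
    (a, b), (c, d) = M
    return a * d - b * c


def transformed_unit_square(M):
    """Images of the corners (0,0), (1,0), (1,1), (0,1), in that order."""
    (a, b), (c, d) = M
    return [(0.0, 0.0), (a, c), (a + b, c + d), (b, d)]


def signed_area(polygon):
    """Shoelace formula; positive when the vertices run counterclockwise."""
    total = 0.0
    n = len(polygon)
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return 0.5 * total


examples = {
    "stretch": ((2.0, 0.0), (0.0, 3.0)),   # scales areas, keeps orientation
    "reflect": ((0.0, 1.0), (1.0, 0.0)),   # swaps the axes, flips orientation
    "singular": ((1.0, 2.0), (2.0, 4.0)),  # columns (1, 2) and (2, 4) are parallel
}

for name, matrix in examples.items():
    print(name, det2(matrix), signed_area(transformed_unit_square(matrix)))
```

In each case the two numbers agree: a positive value for the stretch, a negative one for the reflection, and zero when the columns line up.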

3. Eigenvalues and eigenvectors

An eigenvector v of M is a nonzero direction that survives the transformation up to scaling: Mv = λv. The scalar λ is the eigenvalue. The eigenvectors are the axes along which the matrix is pure stretching; the eigenvalues tell you by how much.

For a 2×2 matrix, the eigenvalues come from the characteristic polynomial:

$$\lambda^{2} - (\operatorname{tr} M)\,\lambda + \det M \;=\; 0$$

so $\lambda = \tfrac{1}{2}\bigl(\operatorname{tr} M \pm \sqrt{(\operatorname{tr} M)^{2} - 4\det M}\bigr)$. When the discriminant is negative the eigenvalues form a complex-conjugate pair: in a suitable basis the matrix acts as a rotation plus scaling, and no real direction is mapped onto itself.
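The trace-and-determinant formula is short enough to evaluate directly. The following sketch assumes the same rows-of-tuples storage as a convenience and uses `cmath.sqrt` so that a negative discriminant yields complex eigenvalues instead of an error. The two example matrices are hypothetical.

```python
import cmath


def eigenvalues2(M):
    """Roots of lambda^2 - tr(M)*lambda + det(M) = 0 for a 2x2 matrix."""
    (a, b), (c, d) = M
    tr = a + d
    det = a * d - b * c
    root = cmath.sqrt(tr * tr - 4.0 * det)  # complex sqrt accepts a negative discriminant
    return ((tr + root) / 2.0, (tr - root) / 2.0)


def apply_matrix(M, v):
    (a, b), (c, d) = M
    return (a * v[0] + b * v[1], c * v[0] + d * v[1])


symmetric = ((2.0, 1.0), (1.0, 2.0))   # positive discriminant: two real eigenvalues
rotation = ((0.0, -1.0), (1.0, 0.0))   # quarter turn: negative discriminant

print(eigenvalues2(symmetric))
print(eigenvalues2(rotation))

# Check Mv = lambda*v for the symmetric example: (1, 1) is only stretched.
print(apply_matrix(symmetric, (1.0, 1.0)))
# (1, -1) is the other eigenvector and comes back unchanged.
print(apply_matrix(symmetric, (1.0, -1.0)))
```

For the symmetric matrix the eigenvalues are real and the checks return multiples of the input vectors; for the quarter turn the eigenvalues are purely imaginary, matching the statement that no real direction survives.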

4. Singular values and conditioning

The singular values σ₁ ≥ σ₂ ≥ … are the square roots of the eigenvalues of M^TM. They are the semi-axis lengths of the ellipse that the matrix produces from the unit circle. The condition number is

$$\kappa(M) \;=\; \frac{\sigma_{\max}}{\sigma_{\min}}$$

If κ is large, the inverse problem is ill-conditioned: tiny changes in b can produce huge changes in the solution x, with the relative error amplified by a factor of up to κ. Every inversion we will meet later has a condition number, and regularization is the tool for taming it.
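To make the amplification concrete, here is a hedged 2×2 sketch in plain Python. It obtains the larger singular value from the eigenvalues of M^TM (using the quadratic formula from the previous section) and the smaller one from the identity σ_max·σ_min = |det M|, then solves the same nearly singular system for two right-hand sides that differ by 0.0001 in one entry. The matrices, the helper names, and the use of Cramer's rule are illustrative assumptions; a numerical library's SVD routine would normally be used for anything larger.

```python
import math


def singular_values2(M):
    """Singular values (largest first) of a 2x2 matrix stored as rows."""
    (a, b), (c, d) = M
    t = a * a + b * b + c * c + d * d        # trace of M^T M
    g = (a * d - b * c) ** 2                 # det of M^T M equals (det M)^2
    disc = math.sqrt(max(t * t - 4.0 * g, 0.0))
    s_max = math.sqrt((t + disc) / 2.0)
    if s_max == 0.0:
        return 0.0, 0.0                      # the zero matrix
    s_min = abs(a * d - b * c) / s_max       # from s_max * s_min = |det M|
    return s_max, s_min


def condition_number(M):
    s_max, s_min = singular_values2(M)
    return math.inf if s_min == 0.0 else s_max / s_min


def solve2(M, rhs):
    """Cramer's rule for a 2x2 system; fine for a demo, not for production."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return ((rhs[0] * d - b * rhs[1]) / det, (a * rhs[1] - c * rhs[0]) / det)


well_behaved = ((3.0, 0.0), (0.0, 0.5))
nearly_singular = ((1.0, 1.0), (1.0, 1.0001))

print(condition_number(well_behaved))
print(condition_number(nearly_singular))

# Two right-hand sides that differ only by 0.0001 in the second entry.
print(solve2(nearly_singular, (2.0, 2.0001)))
print(solve2(nearly_singular, (2.0, 2.0002)))
```

The two solutions of the nearly singular system land far apart even though the data barely moved, which is the behavior a large κ warns about.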

5. Systems of equations: Ax = b

Given a known mapping A and an observation b, find the unknown x. Three cases:

Square, invertible. x = A⁻¹ b. Rare in the real world; never invert explicitly. Factor and solve instead (LU, QR, or Cholesky when A is symmetric positive definite).
Overdetermined (more equations than unknowns). In general there is no exact solution. Use least-squares: minimize $\|Ax - b\|_2^2$. The minimizer satisfies the normal equations $A^{T}A\,x = A^{T}b$. This is the basic form of tomography, velocity picking, and inversion.
Underdetermined (fewer equations than unknowns). Infinitely many solutions; you must pick one by adding a penalty like $\|x\|_2^2$ (Tikhonov) or $\|x\|_1$ (sparsity). Migration is often underdetermined; every migration algorithm implicitly regularizes.
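For the overdetermined case, a small worked sketch may help: fitting a straight line through four points gives four equations in two unknowns (slope and intercept). The code below forms A^TA and A^Tb explicitly and solves the resulting 2×2 system. The data values, the helper names, and the choice of Cramer's rule are assumptions made for illustration only; real problems would use a factorization or an iterative solver.

```python
def normal_equations(A, b):
    """Return (A^T A, A^T b) for a matrix A given as a list of rows."""
    n_cols = len(A[0])
    AtA = [[sum(row[i] * row[j] for row in A) for j in range(n_cols)]
           for i in range(n_cols)]
    Atb = [sum(row[i] * bi for row, bi in zip(A, b)) for i in range(n_cols)]
    return AtA, Atb


def solve2(G, g):
    """Solve a 2x2 system G x = g by Cramer's rule."""
    (p, q), (r, s) = G
    det = p * s - q * r
    return ((g[0] * s - q * g[1]) / det, (p * g[1] - r * g[0]) / det)


# Hypothetical data: four samples (t_i, y_i) scattered around a straight line.
t = [0.0, 1.0, 2.0, 3.0]
y = [1.1, 2.7, 5.3, 6.9]

A = [(ti, 1.0) for ti in t]    # unknowns are (slope, intercept)
AtA, Atb = normal_equations(A, y)
slope, intercept = solve2(AtA, Atb)
print(slope, intercept)

# At the least-squares solution the residual is orthogonal to every column of A.
residual = [slope * ti + intercept - yi for ti, yi in zip(t, y)]
print([sum(row[i] * ri for row, ri in zip(A, residual)) for i in range(2)])
```

The last line prints A^T(Ax̂ − b), which should be zero up to rounding: that orthogonality condition is exactly what the normal equations state.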

6. The least-squares normal equations in one line

$$\hat x \;=\; (A^{T}A)^{-1}\,A^{T} b$$

That one equation sits underneath velocity-analysis semblance picking, residual statics solves, surface-consistent decomposition, model-based inversion, and the inner loop of every iterative imaging algorithm. When A^TA is nearly singular, you regularize:

$$\hat x_{\text{reg}} \;=\; (A^{T}A + \epsilon I)^{-1}\,A^{T} b$$

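The term εI raises every eigenvalue of A^TA from σ² to σ² + ε, where σ is a singular value of A. That lifts the small singular values off the floor so the inverse exists and stays well-conditioned. Tuning ε is the eternal art of inversion.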
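A rough numerical illustration of the regularized solve, under stated assumptions: the matrix below has two nearly identical columns, so A^TA is nearly singular, and a small perturbation is added to the data. The value ε = 1e-3, the noise pattern, and the function name `tikhonov_solve` are hypothetical choices for the demo, not recommended settings.

```python
def tikhonov_solve(A, b, eps):
    """Solve (A^T A + eps*I) x = A^T b for a matrix A with two columns."""
    p = sum(r[0] * r[0] for r in A) + eps    # (A^T A)[0][0] + eps
    q = sum(r[0] * r[1] for r in A)          # off-diagonal entry of A^T A
    s = sum(r[1] * r[1] for r in A) + eps    # (A^T A)[1][1] + eps
    g0 = sum(r[0] * bi for r, bi in zip(A, b))
    g1 = sum(r[1] * bi for r, bi in zip(A, b))
    det = p * s - q * q
    return ((g0 * s - q * g1) / det, (p * g1 - q * g0) / det)


# Two almost identical columns make A^T A nearly singular.
A = [(1.0, 1.0), (1.0, 1.0001), (1.0, 0.9999)]
b_clean = [2.0, 2.0001, 1.9999]              # exactly A applied to (1, 1)
noise = [0.0, 0.001, -0.001]                 # hypothetical measurement error
b_noisy = [bi + ni for bi, ni in zip(b_clean, noise)]

print(tikhonov_solve(A, b_clean, 0.0))       # no noise, no regularization
print(tikhonov_solve(A, b_noisy, 0.0))       # noise, no regularization
print(tikhonov_solve(A, b_noisy, 1e-3))      # noise, eps = 1e-3
```

Without regularization the noisy solve is thrown far from (1, 1); with the small ε it stays close, at the cost of a slight bias. How to pick ε in a principled way is outside the scope of this sketch.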

7. Why this is the workhorse for processing

Tomographic velocity inversion. A is the ray-path matrix; x is the slowness update; b is the travel-time residuals. Least-squares with smoothing regularization, solved by conjugate gradients because A^TA is huge but sparse.
Post-stack inversion for impedance. A is the wavelet-convolution matrix (composed with the derivative that turns impedance into reflectivity); x is the impedance perturbation; b is the observed seismic trace. Same equations, different names.
FWI. The inner linear system is an adjoint-state gradient solve; the outer loop is nonlinear. Linear algebra is in the hot core; calculus on matrices is the outer shell.
Migration (LS-migration). Matching observed data to predicted data from a reflectivity model — least-squares on a linear forward operator.
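The tomography item mentions conjugate gradients on a huge but sparse system. A toy sketch of that idea follows: conjugate gradients applied to (A^TA + εI)x = A^Tb, touching A only through products with A and A^T so that A^TA is never formed. The four-ray, three-cell geometry, the sparse row format of (column, value) pairs, and every function name are hypothetical; production codes typically use more careful variants such as LSQR.

```python
import math


def matvec(rows, x):
    """y = A x for A stored as sparse rows of (column, value) pairs."""
    return [sum(val * x[j] for j, val in row) for row in rows]


def rmatvec(rows, y, n_cols):
    """z = A^T y using the same sparse row storage."""
    out = [0.0] * n_cols
    for row, yi in zip(rows, y):
        for j, val in row:
            out[j] += val * yi
    return out


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def cg_normal_equations(rows, b, n_cols, eps=0.0, tol=1e-10, max_iter=100):
    """Conjugate gradients on (A^T A + eps*I) x = A^T b, starting from x = 0."""
    x = [0.0] * n_cols
    r = rmatvec(rows, b, n_cols)             # residual of the normal equations at x = 0
    p = list(r)
    rs = dot(r, r)
    for _ in range(max_iter):
        if math.sqrt(rs) <= tol:
            break
        Ap = matvec(rows, p)
        Hp = [h + eps * pi for h, pi in zip(rmatvec(rows, Ap, n_cols), p)]
        alpha = rs / dot(p, Hp)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * hi for ri, hi in zip(r, Hp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


# Hypothetical geometry: each ray crosses some cells with unit path length.
rays = [
    [(0, 1.0), (1, 1.0)],
    [(1, 1.0), (2, 1.0)],
    [(0, 1.0), (2, 1.0)],
    [(0, 1.0), (1, 1.0), (2, 1.0)],
]
true_slowness_update = [0.1, -0.2, 0.3]
residual_times = matvec(rays, true_slowness_update)   # noise-free synthetic data

print(cg_normal_equations(rays, residual_times, n_cols=3))
```

With noise-free synthetic data and ε = 0 the recovered update should match the one used to generate the travel-time residuals; a positive `eps` plays the role of the regularization term described earlier.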

The one sentence to remember

A matrix is a function on vectors; Ax = b with regularization is the universal template for every inverse problem in processing; the condition number tells you how much noise amplification to expect.

Where this goes next

§0.8 brings in the other piece of every inverse problem: a model of what the noise does. You cannot set the right ε without a probabilistic framing, so random variables and noise statistics come next.
