Linear Maps and Matrix Representations

Part 1, Chapter 1: Linear Algebra Toolkit

Learning objectives

Define a linear transformation $T:\mathbb{R}^n \to \mathbb{R}^m$ and recognise it from its action on the standard basis
Construct the $m \times n$ matrix that represents a given linear map
Compute the kernel (null space) and image (column space) of a matrix
Connect injectivity to a trivial kernel and surjectivity to rank $m$ (full row rank)

The single most important fact in linear algebra is this: once bases are fixed, every linear map between finite-dimensional spaces is a matrix multiplication. Once you absorb this, all the geometric language of "rotations", "shears", "projections", and "stretches" becomes a single computational operation: multiply by an $m \times n$ matrix. All the questions you can ask about the map (is it injective? surjective? invertible?) become questions about that matrix. This compression is a large part of why linear algebra underpins so many applied, numerically driven disciplines.

The definition and the matrix

A linear transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ is a function satisfying $T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v})$ and $T(c \mathbf{u}) = c\, T(\mathbf{u})$ for all vectors $\mathbf{u}, \mathbf{v} \in \mathbb{R}^n$ and all scalars $c$. The fundamental representation theorem says: every such $T$ can be written uniquely as $T(\mathbf{x}) = A \mathbf{x}$ for some $m \times n$ matrix $A$. The columns of $A$ are precisely the images $T(\mathbf{e}_1), \ldots, T(\mathbf{e}_n)$ of the standard basis vectors. Once you know where the basis vectors go, linearity forces $T$ everywhere: writing $\mathbf{x} = x_1 \mathbf{e}_1 + \cdots + x_n \mathbf{e}_n$ gives $T(\mathbf{x}) = x_1 T(\mathbf{e}_1) + \cdots + x_n T(\mathbf{e}_n) = A \mathbf{x}$.
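The column rule can be checked mechanically. The sketch below is illustrative only: the helpers `matrix_of` and `apply` and the example map `T` are hypothetical names chosen here, not part of the text. It feeds each standard basis vector through `T`, stores the result as a column, and then confirms that multiplying by the resulting matrix agrees with calling `T` directly.

```python
from typing import Callable, List, Sequence

Vector = List[int]
Matrix = List[List[int]]


def matrix_of(T: Callable[[Sequence[int]], Vector], n: int) -> Matrix:
    """Return the matrix of a linear map T: R^n -> R^m as a list of rows.

    Column j is T(e_j), so the columns are computed first and then transposed.
    """
    columns = []
    for j in range(n):
        e_j = [1 if i == j else 0 for i in range(n)]  # j-th standard basis vector
        columns.append(T(e_j))
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]


def apply(A: Matrix, x: Sequence[int]) -> Vector:
    """Matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


# Hypothetical example map T: R^2 -> R^3, chosen only for illustration.
def T(v: Sequence[int]) -> Vector:
    x, y = v
    return [2 * x + y, x - 3 * y, 4 * y]


A = matrix_of(T, 2)
print(A)                  # [[2, 1], [1, -3], [0, 4]]
print(apply(A, [5, -2]))  # [8, 11, -8]
print(T([5, -2]))         # [8, 11, -8]
```

The helper only produces a faithful matrix when `T` really is linear; it does not test linearity itself.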

Kernel and image

The kernel (or null space) of $T$ is $\ker(T) = \{\mathbf{x} \in \mathbb{R}^n : T(\mathbf{x}) = \mathbf{0}\}$. It measures how far $T$ is from being injective: $T$ is injective if and only if $\ker(T) = \{\mathbf{0}\}$. If $\ker(T)$ is larger, then many inputs collapse to zero, and every other output that $T$ attains is likewise hit by many inputs. The image $\operatorname{im}(T) = \{T(\mathbf{x}) : \mathbf{x} \in \mathbb{R}^n\}$ is the set of all outputs or, equivalently, the column space of $A$. The map is surjective if and only if its image equals the entire codomain $\mathbb{R}^m$.
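The link between the kernel and injectivity follows in one line from linearity. For any $\mathbf{u}, \mathbf{v} \in \mathbb{R}^n$:

$$T(\mathbf{u}) = T(\mathbf{v}) \iff T(\mathbf{u}) - T(\mathbf{v}) = \mathbf{0} \iff T(\mathbf{u} - \mathbf{v}) = \mathbf{0} \iff \mathbf{u} - \mathbf{v} \in \ker(T).$$

So two inputs share an output exactly when they differ by a kernel vector. If $\ker(T) = \{\mathbf{0}\}$, this forces $\mathbf{u} = \mathbf{v}$, and $T$ is injective. Conversely, if some nonzero $\mathbf{k}$ lies in $\ker(T)$, then $T(\mathbf{k}) = T(\mathbf{0}) = \mathbf{0}$ with $\mathbf{k} \neq \mathbf{0}$, so $T$ is not injective.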

Computing both

To find $\ker(A)$, row-reduce $A$ and solve $A \mathbf{x} = \mathbf{0}$, expressing the pivot variables in terms of the free variables. To find $\operatorname{im}(A)$, identify the pivot columns of the row-reduced matrix; the corresponding columns of the original $A$ form a basis for the column space. These two subspaces are related by the Rank-Nullity Theorem (Section 1.6): $\dim(\ker A) + \dim(\operatorname{im} A) = n$, where $n$ is the number of columns of $A$.
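The procedure can be sketched in code. The following is a minimal illustration rather than a production routine: the function names `rref`, `kernel_basis`, and `image_basis` are hypothetical, the example matrix is made up, and exact `Fraction` arithmetic is an assumption made here to sidestep floating-point rounding.

```python
from fractions import Fraction
from typing import List, Tuple

Matrix = List[List[Fraction]]


def rref(A: List[List[int]]) -> Tuple[Matrix, List[int]]:
    """Return the reduced row echelon form of A and its pivot column indices."""
    R = [[Fraction(x) for x in row] for row in A]
    m, n = len(R), len(R[0])
    pivots = []
    row = 0
    for col in range(n):
        if row == m:
            break
        # Find a row at or below `row` with a nonzero entry in this column.
        pivot_row = next((r for r in range(row, m) if R[r][col] != 0), None)
        if pivot_row is None:
            continue  # no pivot here: this column gives a free variable
        R[row], R[pivot_row] = R[pivot_row], R[row]
        pivot = R[row][col]
        R[row] = [x / pivot for x in R[row]]  # scale the pivot to 1
        for r in range(m):
            if r != row and R[r][col] != 0:
                factor = R[r][col]
                R[r] = [a - factor * b for a, b in zip(R[r], R[row])]
        pivots.append(col)
        row += 1
    return R, pivots


def kernel_basis(A):
    """One basis vector of ker(A) per free variable."""
    R, pivots = rref(A)
    n = len(A[0])
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for f in free:
        v = [Fraction(0)] * n
        v[f] = Fraction(1)           # set this free variable to 1
        for i, p in enumerate(pivots):
            v[p] = -R[i][f]          # solve for each pivot variable
        basis.append(v)
    return basis


def image_basis(A):
    """Columns of the ORIGINAL matrix at the pivot positions."""
    _, pivots = rref(A)
    return [[row[p] for row in A] for p in pivots]


# Made-up 2 x 3 example.
A = [[1, 2, 3],
     [2, 4, 7]]

kernel = kernel_basis(A)
image = image_basis(A)
print([[int(x) for x in v] for v in kernel])  # [[-2, 1, 0]]
print(image)                                  # [[1, 2], [3, 7]]
print(len(kernel) + len(image) == len(A[0]))  # True (rank + nullity = n)
```

Note that `image_basis` reads its columns from `A`, not from the reduced matrix: row operations preserve which columns are independent but change the column space itself.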

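Use the matrix widget above to set a singular matrix, say with columns $(1, 2)$ and $(2, 4)$. The unit square collapses to a line, the span of $(1, 2)$: that is the image of $T$. The vector $(2, -1)$ lies in the kernel, since $2\,(1, 2) - (2, 4) = (0, 0)$. In this example the kernel happens to be perpendicular to the image because the matrix is symmetric; in general the kernel is perpendicular to the rows of $A$, not to its image. Notice that the rank (dimension of the image) is 1 and the nullity (dimension of the kernel) is 1, summing to 2, the number of columns of $A$.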
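The same observations can be checked without the widget. This small sketch in plain Python (the helper `apply` is a hypothetical name) maps the corners of the unit square and the vector $(2, -1)$ through the matrix with columns $(1, 2)$ and $(2, 4)$.

```python
def apply(A, x):
    """Matrix-vector product A x for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


# Columns (1, 2) and (2, 4) give these ROWS.
A = [[1, 2],
     [2, 4]]

corners = [[0, 0], [1, 0], [1, 1], [0, 1]]  # corners of the unit square
images = [apply(A, c) for c in corners]
print(images)                               # [[0, 0], [1, 2], [3, 6], [2, 4]]

# Every image point satisfies y = 2x, so the square lands on one line.
print(all(y == 2 * x for x, y in images))   # True

# (2, -1) is sent to the zero vector, so it lies in the kernel.
print(apply(A, [2, -1]))                    # [0, 0]
```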

Where this shows up

Neural-network layers: A single dense layer is the affine map $\mathbf{y} = W \mathbf{x} + \mathbf{b}$. The matrix $W$ is exactly the matrix of the linear part, and its entries are the weights learned during training, typically by some variant of gradient descent. Deep networks such as transformers stack many such layers, interleaved with non-linear operations.
Computer graphics: A counter-clockwise rotation by angle $\theta$ in $\mathbb{R}^2$ is the matrix $\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$. Rotations and scalings are linear, and graphics pipelines use homogeneous coordinates so that translations and perspective projections can also be written as matrix multiplications, applied to every vertex on every frame.
Robotics inverse kinematics: The end-effector pose of a robot arm is a non-linear function of the joint angles, but its derivative is the Jacobian matrix, a linear map from joint-velocity space to end-effector-velocity space. Inverting this matrix (or, when it is not square or is singular, using a pseudoinverse) tells the robot which joint velocities produce a desired end-effector velocity, which is how it steers toward a target.
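The first two examples are easy to try in a few lines. The sketch below uses made-up numbers for `W`, `b`, and the test vectors, and `apply`, `rotation`, `add`, `close`, and `layer` are hypothetical helper names. It suggests why the rotation counts as linear while a dense layer with nonzero bias is only affine: the rotation passes the additivity test, and the layer does not.

```python
import math


def apply(A, x):
    """Matrix-vector product A x for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def rotation(theta):
    """Matrix of the counter-clockwise rotation of R^2 by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s, c]]


def add(u, v):
    """Componentwise vector sum."""
    return [a + b for a, b in zip(u, v)]


def close(u, v, tol=1e-12):
    """Compare two vectors up to floating-point rounding."""
    return all(abs(a - b) <= tol for a, b in zip(u, v))


u, v = [1.0, 2.0], [3.0, -1.0]

# Rotation: T(u + v) equals T(u) + T(v) up to rounding.
R = rotation(math.pi / 3)
print(close(apply(R, add(u, v)), add(apply(R, u), apply(R, v))))  # True

# Dense layer y = W x + b with illustrative weights and bias.
W = [[1.0, 0.5],
     [-2.0, 1.0]]
b = [1.0, 1.0]


def layer(x):
    return add(apply(W, x), b)


# The bias is counted once on the left and twice on the right.
print(layer(add(u, v)))         # [5.5, -6.0]
print(add(layer(u), layer(v)))  # [6.5, -5.0]
```

Setting `b` to the zero vector makes the two printed vectors agree, leaving only the linear part $W$.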

Pause and think: If $T : \mathbb{R}^5 \to \mathbb{R}^3$ is linear, can it possibly be injective? Use a dimension argument before reaching for a matrix calculation. (Hint: how many independent vectors can fit inside $\mathbb{R}^3$?)

Try it

Before writing the matrix, predict whether $T(x, y) = (x + 2y,\ 3x - y)$ is injective. Then verify by computing $\det A$, where $A$ is the $2 \times 2$ matrix of $T$.
Predict: what is the kernel of $A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}$? Then row-reduce to verify.
For $T(x, y, z) = (x + y, y + z)$, predict whether $T$ is surjective onto $\mathbb{R}^2$. Verify by checking the rank.
Compute the matrix of the composition $S \circ T$, where $T(x, y) = (2x, y)$ and $S(x, y) = (x + y, x - y)$. Predict the entries by tracking where $\mathbf{e}_1, \mathbf{e}_2$ go.

A trap to watch for

Beginners often write the matrix of $T(x, y) = (a x + b y,\ c x + d y)$ as $\begin{pmatrix} a & c \\ b & d \end{pmatrix}$ (with rows holding the coefficients of $x$ and $y$ separately). This is the transpose of the correct matrix. The correct convention is: row $i$ of $A$ gives the $i$-th output coordinate as a linear combination of the inputs. So $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, with the first row encoding the first output. Equivalently, column $j$ is the image of $\mathbf{e}_j$. Getting this wrong silently inverts your geometry: the transpose of a rotation matrix is its inverse, so a clockwise rotation will run counter-clockwise.
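A quick numerical check makes the trap concrete. The sketch below (the helper names `apply` and `transpose` are hypothetical) uses the quarter-turn rotation, whose entries are exact integers, and compares the correct matrix with its transpose.

```python
def apply(A, x):
    """Matrix-vector product A x for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def transpose(A):
    """Swap rows and columns."""
    return [list(col) for col in zip(*A)]


# Counter-clockwise rotation by 90 degrees: T(x, y) = (-y, x).
# Row 1 holds the coefficients of the first output, row 2 of the second.
R = [[0, -1],
     [1, 0]]

e1 = [1, 0]
print(apply(R, e1))             # [0, 1]   e1 turns counter-clockwise onto e2
print(apply(transpose(R), e1))  # [0, -1]  the transpose turns it clockwise
```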

What you now know

You can write the matrix of any linear transformation, compute its kernel by row-reduction, and identify its image as the column space. The next section (Section 1.4) tackles bases and dimension: how to measure the size of a subspace and pick the most efficient coordinate system for a given problem.

