Metric Spaces and Distances
Learning objectives
- Define a metric space via the three metric axioms
- Compute distances using the Euclidean, taxicab, and discrete metrics
- Recognize that distinct metrics may or may not induce the same topology
- Predict that completeness depends on the metric, while topology does not
A metric space is the right setting for almost every question in analysis that needs a notion of "distance." Metric spaces sit between general topological spaces (where only "nearness" makes sense) and (where you also have coordinates and dot products). They give you enough structure for sequences, Cauchy convergence, and completeness, while still being abstract enough to cover function spaces, sequence spaces, and the data-science notion of "similarity." Whenever you hear "Euclidean distance," "cosine similarity," or "edit distance," you are inside a metric space.
The definition
A metric space is a pair where is a set and is a function (the metric or distance) satisfying:
- (M1) Identity: .
- (M2) Symmetry: .
- (M3) Triangle inequality: for all .
Every metric induces a topology: declare open when for every there is with . The metric axioms guarantee the three topology axioms automatically. So every metric space is a topological space, but not every topological space comes from a metric (those that do are called metrizable).
Three canonical metrics on
- Euclidean: . Straight-line distance.
- Taxicab (Manhattan, ): . Distance along axis-aligned moves only.
- Chebyshev (\ell^\infty): . Maximum-coordinate-difference distance.
These three metrics are topologically equivalent: they induce the same topology on , the same open sets, the same continuous functions, the same notion of convergence. The unit balls look different (a Euclidean disc, a diamond, a square) but each one fits inside the others when scaled, which is the key to the equivalence proof.
Contrast with the discrete metric if else : it induces the discrete topology, where every singleton is open. The discrete metric is NOT topologically equivalent to the Euclidean metric on .
Try plotting in the grapher. This is the Euclidean distance from the point to minus 1, a "shifted" distance function. The grapher cannot display the unit balls of different metrics in 2D directly, but it is useful for visualising 1D distance variations as you change a parameter.
- Machine learning, cosine similarity and k-NN: Document retrieval and recommendation systems represent items as vectors and compute "similarity" via cosine distance . This is a true metric on the unit sphere (after care for the check). k-nearest-neighbours algorithms make sense in any metric space; the choice of metric (Euclidean for images, Mahalanobis for whitened features, edit distance for strings) is half the model design.
- Distance-based clustering: k-means, DBSCAN, hierarchical clustering all consume a metric. Picking the right metric (e.g. taxicab for sparse features, cosine for high-dim normalized embeddings) often matters more than the algorithm choice. The triangle inequality is what allows the spatial-indexing tricks (KD-trees, ball trees, VP-trees) that make distance queries scalable.
- Bioinformatics, sequence alignment: Edit distance, Hamming distance, and the more general Levenshtein metric measure dissimilarity between DNA, RNA, or protein sequences. These are real metrics (they satisfy M1-M3), which justifies indexing and search algorithms that depend on triangle inequalities for pruning.
Pause and think: The function on is non-negative, symmetric, and zero exactly when . Does it satisfy the triangle inequality? Try : but . So ? No, the triangle inequality fails. Squaring a metric does NOT produce a metric in general.
Try it
- Predict first: in with the taxicab metric, what does the unit ball look like? (Answer: a square rotated 45 degrees, a "diamond" with vertices at and .)
- Compute the taxicab and Euclidean distances between and . (Taxicab ; Euclidean .) Notice taxicab Euclidean in , with equality only for axis-aligned displacements.
- Predict: in the discrete metric on any set, every subset is open. Use this to argue that a sequence converges to in the discrete metric iff eventually.
- Verify the triangle inequality for cosine distance on the unit sphere. (Hint: it follows from and the Euclidean triangle inequality.)
- Trap: completeness is a metric property, NOT a topological one. The interval with the metric ... actually simpler: with usual metric is NOT complete (Cauchy sequence has no limit in the space), but is homeomorphic to , which IS complete in its usual metric. Same topology, different completeness.
A trap to watch for
"Distance" in everyday English does not always satisfy the triangle inequality. The squared-Euclidean "distance" fails it (as the pause-check showed). Many ad-hoc similarity scores in machine-learning literature (Jaccard, certain probability divergences) do not satisfy all three axioms either, KL divergence is asymmetric, so it is not a metric. Calling these "distances" loosely is harmless conversation, but algorithms that assume the triangle inequality (KD-trees, metric-tree indexing) silently break on non-metrics. Always check the axioms before using a "metric" in code.
What you now know
You can verify metric axioms for distance functions, recognize when two metrics induce the same topology, and choose between Euclidean, taxicab, Chebyshev, and discrete metrics for different applications. The next section turns to bases, a way to describe entire topologies via small, manageable collections of "primitive" open sets.
Mark section complete →
References
- Garrity, T. (2002). All the Mathematics You Missed. Cambridge UP, ch. 4.
- Munkres, J. R. (2000). Topology (2nd ed.). Prentice-Hall, ch. 2 (sections 20-21).
- Willard, S. (1970). General Topology. Addison-Wesley, ch. 8.
- Kelley, J. L. (1955). General Topology. Van Nostrand, ch. 4.
- Rudin, W. (1976). Principles of Mathematical Analysis (3rd ed.). McGraw-Hill, ch. 2.