Russell's Paradox and the Foundations Crisis
Learning objectives
- State Russell's paradox and trace the contradiction in the set $R = \{x : x \notin x\}$
- Explain the Burali-Forti and Richard paradoxes informally
- Describe how axiomatic ZFC restricts set comprehension to avoid the paradoxes
- Connect the paradoxes to Hilbert's programme and the foundational crisis
In 1901, Bertrand Russell discovered that the 19th-century edifice of "naive" set theory contained a fatal contradiction. The very rule that seemed obvious, "any property defines a set", can be turned against itself to produce a set whose existence is both required and impossible. Russell's discovery helped trigger the early 20th-century foundational crisis. The resolution was axiomatic set theory (ZFC), in which sets are built not from arbitrary properties but only by carefully circumscribed construction rules. The price: some intuitive collections (the "set of all sets", the "set of all ordinals") are not sets at all; they are proper classes, too big to be sets.
Russell's paradox
In naive set theory, any property $P(x)$ defines a set $\{x : P(x)\}$. Consider the property "$x \notin x$" (i.e., $x$ is not a member of itself). Then by naive comprehension,
$R = \{x : x \notin x\}$ is a set. Now ask: is $R \in R$?
- If $R \in R$, then by the defining property of $R$, the element $R$ satisfies $R \notin R$. Contradiction.
- If $R \notin R$, then $R$ satisfies the property defining $R$, so $R \in R$. Contradiction.
Both branches yield contradictions, so the Russell set $R = \{x : x \notin x\}$ cannot consistently exist. Naive set theory is therefore inconsistent: it derives outright contradictions, not merely surprises. This discovery ended the brief honeymoon of Cantorian set theory, and the need to secure the foundations against such contradictions later motivated Hilbert's programme.
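The two-branch argument can also be checked mechanically on small finite universes. The sketch below is only an illustration under stated assumptions: the function names `has_russell_element` and `count_russell_models` are hypothetical, and "membership" is modelled as an arbitrary relation on $\{0, \dots, n-1\}$ rather than as real set membership. It enumerates every such relation and counts those in which some element behaves like the Russell set.

```python
from itertools import product


def has_russell_element(universe, member):
    """Return True if some r satisfies: for all x, (x in r) <-> not (x in x).

    `member` is a set of pairs (x, y) meaning "x is a member of y".
    """
    return any(
        all(((x, r) in member) == ((x, x) not in member) for x in universe)
        for r in universe
    )


def count_russell_models(n):
    """Count membership relations on {0, ..., n-1} that admit a Russell element."""
    universe = range(n)
    pairs = [(x, y) for x in universe for y in universe]
    count = 0
    # Each tuple of booleans selects which pairs (x, y) belong to the relation.
    for bits in product([False, True], repeat=len(pairs)):
        member = {p for p, keep in zip(pairs, bits) if keep}
        if has_russell_element(universe, member):
            count += 1
    return count


if __name__ == "__main__":
    for n in (1, 2, 3):
        # Taking x = r in the defining condition always fails, so each count is 0.
        print(n, count_russell_models(n))
```

A finite search proves nothing about infinite universes, but it shows that the obstruction is purely logical: the condition already fails at $x = r$, whatever the relation is.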
The Venn-diagram widget helps you visualize set relations and the self-membership question. In ZFC, the diagram of "the set of all sets that don't contain themselves" cannot be drawn coherently; it has no valid extension.
Burali-Forti and Richard: more paradoxes
Russell's paradox was not isolated. The Burali-Forti paradox (1897) considers the collection $\mathrm{Ord}$ of all ordinal numbers. Every well-ordered set has an order type that is an ordinal, and the collection of ordinals less than a given ordinal is again well-ordered. If $\mathrm{Ord}$ is a set, it is itself a well-ordered set of ordinals, hence its order type is an ordinal $\Omega$. But $\Omega$ would then be an ordinal greater than every ordinal in $\mathrm{Ord}$, including itself: a contradiction.
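The "always one more ordinal" step can be made concrete for finite ordinals. The following is a minimal sketch, assuming the von Neumann encoding (each ordinal is the set of all smaller ordinals) and using hypothetical helper names `successor`, `ordinal`, and `beyond`. It covers only finite sets of finite ordinals, so it illustrates the mechanism rather than the full transfinite argument.

```python
def successor(a):
    """Von Neumann successor: a union {a}."""
    return a | frozenset([a])


def ordinal(n):
    """The finite von Neumann ordinal n = {0, 1, ..., n-1}."""
    a = frozenset()
    for _ in range(n):
        a = successor(a)
    return a


def beyond(ordinals):
    """Given a finite set of finite ordinals, return an ordinal not in it."""
    # The union of a set of ordinals is their least upper bound.
    sup = frozenset().union(*ordinals)
    return successor(sup)


S = {ordinal(0), ordinal(1), ordinal(3)}
b = beyond(S)

print(b == ordinal(4))        # True: the successor of sup S = 3
print(b in S)                 # False: b escapes the collection
print(all(a in b for a in S)) # True: every member of S is smaller than b
```

In this encoding "smaller than" is literally membership, which is why a set containing all ordinals would have to contain an ordinal larger than itself.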
The Richard paradox (1905) is more subtle: it concerns the diagonal of definable real numbers. The set of all English-definable real numbers is countable (only countably many English sentences exist), so we can list them: $r_1, r_2, r_3, \ldots$. Construct a new real $d$ by diagonalization (changing the $n$-th decimal of $r_n$). Then $d$ is definable in English (we just defined it!) but does not appear in the list of all definable reals. Paradox.
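The diagonal construction itself is easy to sketch on a finite prefix of such a list. In the example below the function name `diagonal` and the sample digit strings are hypothetical; each string stands in for the leading decimals of one listed real, and the choice of replacement digits (4 and 5) is just one convention that avoids the ambiguity between expansions ending in repeated 0s and repeated 9s.

```python
def diagonal(rows):
    """Return a digit string that differs from rows[n] at position n.

    `rows` is a list of digit strings, each at least len(rows) long,
    standing in for the first decimals of the listed reals.
    """
    out = []
    for n, row in enumerate(rows):
        digit = int(row[n])
        # Use 5 unless the listed digit is already 5, in which case use 4.
        out.append("4" if digit == 5 else "5")
    return "".join(out)


rows = ["1415", "7182", "4142", "5555"]
d = diagonal(rows)
print(d)  # 5554
```

Because `d` disagrees with the n-th row in the n-th place, it cannot equal any row, which is exactly the property the paradox exploits.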
Resolution: axiomatic ZFC
The resolution came from Zermelo (1908), refined by Fraenkel and Skolem in the 1920s. Zermelo-Fraenkel set theory with Choice (ZFC) replaces the unrestricted comprehension axiom with a restricted version: the axiom of separation says that given an EXISTING set $A$ and a property $P(x)$, the collection $\{x \in A : P(x)\}$ is a set. Crucially, the dummy variable $x$ ranges over a set we already have, not over "everything."
Russell's paradox is defused: we can form $R_A = \{x \in A : x \notin x\}$ for any specific set $A$, but there is no "set of all sets" to range over, so no universal Russell set exists. The axiom of foundation also rules out $x \in x$ entirely (every set is well-founded), which eliminates another source of pathology.
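Separation has a close everyday analogue: filtering an existing collection. The sketch below assumes hereditarily finite sets modelled as nested Python `frozenset` values; the names `separate` and `russell_subset` are hypothetical. It builds the restricted Russell set of a small set `A` and checks that the result is a perfectly good set that simply is not a member of `A`.

```python
def separate(A, prop):
    """Separation analogue: {x in A : prop(x)} for an already existing set A."""
    return frozenset(x for x in A if prop(x))


def russell_subset(A):
    """R_A = {x in A : x not in x}; elements of A must themselves be frozensets."""
    return separate(A, lambda x: x not in x)


empty = frozenset()
one = frozenset([empty])
two = frozenset([empty, one])
A = frozenset([empty, one, two])

R_A = russell_subset(A)

print(R_A == A)      # True: no frozenset contains itself, so nothing is filtered out
print(R_A in A)      # False: R_A exists, but it lies outside A
```

Nested frozensets are automatically well-founded, so here the restricted Russell set is all of `A`; the point is that asking "is it a member of itself?" now has the harmless answer "no".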
The collection of "all sets" still exists in some informal sense: it is a proper class, too big to be a set. Proper classes can be talked about (via formulas) but cannot themselves be members of anything.
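The claim that the collection of all sets cannot itself be a set follows from separation in a few lines. The derivation below is a sketch in which $V$ (a hypothetical set of all sets) and $R$ are names introduced only for this argument.

```latex
\begin{align*}
&\text{Suppose } V \text{ is a set with } x \in V \text{ for every set } x. \\
&\text{By separation, } R = \{\, x \in V : x \notin x \,\} \text{ is a set, so } R \in V. \\
&R \in R \iff (R \in V \text{ and } R \notin R) \iff R \notin R. \\
&\text{This is a contradiction, so no such set } V \text{ exists.}
\end{align*}
```

The second equivalence uses $R \in V$, which is exactly the assumption that fails for an ordinary set $A$ in place of $V$.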
- Type theory and programming languages: Typed functional languages and proof assistants (Haskell, ML, Coq, Lean) trace their type systems back to Russell's theory of types. Proof assistants such as Coq and Lean in particular avoid Russell-style paradoxes by stratifying types into a hierarchy of universes: no universe contains itself, and the type checker enforces this "no self-reference at the same level" rule.
- Mathematical logic: The barber paradox ("the barber shaves all and only those who do not shave themselves; does he shave himself?") is a popular Russell variant, used in philosophy and AI courses to introduce self-reference and fixed-point reasoning.
- Database design: Some relational and graph database schemas enforce a "no cycles" constraint on certain referential structures, loosely analogous to ZFC's foundation axiom. Where cyclic references are allowed, as in object-oriented systems, they need explicit handling (e.g. cycle detection in garbage collection algorithms).
- Linguistic semantics: Tarski's undefinability of truth (1933) formalizes the same kind of self-reference found in the liar and Richard paradoxes: the "set of all true sentences" of a sufficiently expressive language cannot itself be defined inside that language, on pain of self-referential contradiction.
Pause and think: The barber paradox states "the barber shaves all and only those men who do not shave themselves." Does the barber shave himself? Trace both options and see why neither works. What does this tell you about the supposed existence of such a barber?
Try it
- Predict: under naive comprehension, can the "set of all sets that have exactly 3 elements" be formed? (Hint: yes, the property "exactly 3 elements" is well-defined.) Now repeat for "the set of all sets that are not members of themselves." Trace the contradiction.
- Use the set-venn widget to draw a Venn diagram of three sets (say $A$, $B$, and $C$). Where would the Russell paradox set live in such a diagram? (Trick question: it cannot be drawn.)
- Explain to a friend in plain English why "the set of all sets" cannot exist in ZFC. (Hint: if a set $V$ of all sets existed, then by separation $\{x \in V : x \notin x\}$ would exist, reviving Russell.)
- True or false: in ZFC, the collection $\{x : x = x\}$ is a set. (Answer: false; this would be the set of all sets, which is a proper class.)
A trap to watch for
People often think the resolution to Russell's paradox is "we just forbid sets from containing themselves." ZFC does forbid this (the foundation axiom), but that is not what blocks the paradox. The real fix is more fundamental: we forbid arbitrary properties from defining sets at all. Even if $x \in x$ were not a pathology, the bigger problem is that "all things satisfying property $P$" ranges over an unbounded totality, and assuming that every such collection is a set is exactly what produces the contradiction. ZFC ties every set-formation step to a previously existing set or a clearly defined operation (pairing, union, power set, replacement). The shift from "any property defines a set" to "any property carves out a subset of an existing set" is the deeper structural change. Forgetting this leads to the common error of treating "all ordinals" as if it were a set; in ZFC the ordinals form a proper class, not a set, and statements quantifying over them have a different logical character.
What you now know
You can trace the contradiction in Russell's paradox, name two other paradoxes of naive set theory, explain how ZFC's axiom of separation blocks the Russell construction, and recognize proper classes as collections too big to be sets. The next section turns to the most discussed and controversial axiom in mathematics: the Axiom of Choice and its equivalents (Zorn's Lemma, the Well-Ordering Theorem).