Big-O and Asymptotic Complexity
Learning objectives
- Define Big-O, Big-Omega, and Big-Theta notation formally
- Rank common growth rates from to
- Distinguish worst-case, best-case, and average-case complexity
- Analyse simple algorithms (loops, nested loops, recursion via the master theorem)
- Recognise that constants and lower-order terms vanish in asymptotic analysis
Computational complexity is the calculus of algorithms. Just as differential calculus measures how functions change at the small scale, asymptotic analysis measures how an algorithm's resource use grows as the input size goes to infinity. The dominant resources are time (how many elementary operations) and space (how much memory). The point of the analysis is portability: an algorithm is slow on every machine of every era, and an algorithm is fast on every machine of every era, even though the absolute speeds change by factors of millions.
Big-O and friends
For functions we write when there exist constants and such that
.
This is an upper bound: grows AT MOST as fast as (up to a constant). The symmetric lower bound is , meaning for all large . Combining both gives the tight bound , same growth rate up to constants.
The strict inequalities are the little-o and little-omega: when (strictly slower) and when (strictly faster).
The asymptotic-growth ladder
The standard complexity classes form a chain, each growing dramatically faster than the previous:
- , constant. Array indexing, hash-table lookup (amortised).
- , logarithmic. Binary search. Each step halves the search space.
- , sub-linear, polynomial. Trial division up to to test primality.
- , linear. A single pass through input. Optimal for problems that must read everything.
- , linearithmic. Comparison-based sorting (merge sort, heap sort); FFT.
- , quadratic. Two nested linear loops. Manageable up to ; painful above.
- , cubic. Three nested loops. Naive matrix multiplication. Crashes through reasonable runtime around .
- , exponential. Brute-force subset enumeration. Useless past .
- , factorial. Brute-force permutation enumeration. Useless past .
The gap between and is the central divide of complexity theory: anything in the former family is "tractable" in a robust, machine-independent sense; anything that genuinely lives in the latter (and only the latter) is considered intractable. The P-vs-NP question of §16.4 asks exactly which side of this gap many natural problems live on.
Worst, best, and average case
For a fixed algorithm and input size , there are many inputs. The complexity depends on which one we pick:
- Worst case: maximum over all inputs of size . The standard, conservative measure.
- Best case: minimum. Often trivially fast and not very informative.
- Average case: expected value under some probability distribution on inputs. Quicksort's average is even though its worst case is .
Quicksort's gap between worst and average is the canonical example of why average-case analysis matters: in practice you almost never hit the worst case, and the algorithm is one of the fastest sorts known.
Recurrences and the master theorem
Many divide-and-conquer algorithms satisfy recurrences of the form , subproblems each of size plus a per-call overhead . The master theorem gives a clean answer:
- If with , then .
- If with , then .
- If with (and a regularity condition), then .
Merge sort: falls in case 2 with , giving . Binary search: falls in case 2 with , giving .
Loosely, picture the complexity hierarchy as nested classes: . Each inclusion is either proven or widely believed strict; the most famous unresolved question is whether . (See §16.4 for the formal definitions.)
- Database index choice: Search on a B-tree-indexed column is ; on an unindexed column it is . For a 1-billion-row table the difference is between 30 disk reads and 1 billion, a factor of . Almost every production database performance issue is an index-vs-table-scan complexity gap.
- Machine-learning training time: Training a neural network with parameters on examples is per epoch; the full training is . Modern LLMs have and , so the FLOPs of training are very much an asymptotic-analysis question first and an engineering question second.
- Streaming algorithms: When data is too large to store, you need algorithms whose space complexity is or . Count-Min Sketch, HyperLogLog, and Bloom filters are the canonical examples, standard tooling at Google, Meta, and every other large-data company.
- Cryptographic key sizes: RSA-2048 means a 2048-bit modulus; brute-forcing factoring is conjectured to be time (the number-field sieve). Doubling the key length roughly cubes the cracking time, a complexity-class-driven security calculus.
- Compilers and big-O guarantees: Modern JIT compilers (V8, JVM) instrument code at runtime and re-optimise hot loops. Their effectiveness depends on the asymptotic behaviour of the algorithm being right; constant-factor improvements compose, but complexity-class regressions are devastating.
Pause and think: If an algorithm with running time is asymptotically , why do we drop the and terms? Why is even the irrelevant for asymptotic analysis? (Hint: think about what happens for .)
Try it
- Show formally that . Find explicit constants and .
- Rank in growth order: . (Answer: .)
- Use the master theorem on . (Answer: , so case 1: . This is Karatsuba multiplication.)
- You have an algorithm that runs in 1 second on input of size at . How long will it take on input of size ?
- True or false: . . Justify each.
A trap to watch for
Big-O is an UPPER bound, not an equality. Writing is technically correct but loses information, it overstates the algorithm's growth. Use when you mean "this is the tight growth rate." Conversely, does NOT mean is exactly linear; it could be or even . Big-O is a one-sided inequality dressed in equality's clothing. The notation is widely (mis)used in practice; in formal papers, prefer as a set-membership statement.
What you now know
You can read and write Big-O / / statements rigorously, rank common growth rates, distinguish worst- from average-case, and analyse divide-and-conquer recurrences via the master theorem. The next section applies this toolkit to graph algorithms (BFS, DFS, Dijkstra); after that we look at sorting and finally tackle the P-vs-NP question that organises the entire field.
Mark section complete →
References
- Garrity, T. (2002). All the Mathematics You Missed. Cambridge University Press, ch. 16.
- Cormen, T. H., Leiserson, C. E., Rivest, R. L., Stein, C. (2009). Introduction to Algorithms (3rd ed.). MIT Press, ch. 3–4.
- Sipser, M. (2012). Introduction to the Theory of Computation (3rd ed.). Cengage, ch. 7.
- Sedgewick, R., Wayne, K. (2011). Algorithms (4th ed.). Addison-Wesley, ch. 1.4.
- Knuth, D. E. (1997). The Art of Computer Programming, Vol. 1 (3rd ed.). Addison-Wesley, §1.2 (asymptotic notation).