Sorting Algorithms as a Concrete Starting Point

Part 16, Chapter 16: Algorithms and Complexity

Learning objectives

Describe bubble sort, merge sort, and quicksort and their time complexities
Prove the $\Omega(n \log n)$ lower bound for comparison-based sorting via the decision-tree argument
Distinguish stable from unstable sorts and in-place from out-of-place sorts
Recognise when non-comparison sorts (counting, radix, bucket) bypass the lower bound
Apply the master theorem to sorting recurrences

Sorting is the most-studied problem in algorithms, and for good reason. It is fundamental enough that you encounter it in every applied setting (databases, file systems, search engines), structurally rich enough to support a dozen distinct algorithms with different trade-offs, and theoretically deep enough to host one of the cleanest lower-bound proofs in computer science. The story has a moral arc: bubble sort's $O(n^{2})$ horrors give way to merge sort's $O(n \log n)$ elegance, the lower bound proves we cannot do better with comparisons alone, and non-comparison sorts (counting, radix) show that breaking out of the comparison model achieves $O(n)$ .

Bubble sort: simple and slow

Repeatedly scan the array; for each adjacent pair, swap if out of order. After one pass, the largest element has "bubbled" to the end. After $n - 1$ passes, the array is sorted. Worst- and average-case time is $O(n^{2})$ ; best case (already sorted, with early-termination) is $O(n)$ . Bubble sort is the canonical pedagogical example of a slow algorithm, useful for teaching, almost never used in practice.

Merge sort: divide and conquer

Merge sort embodies the divide-and-conquer paradigm:

If the array has 0 or 1 elements, it is already sorted, return.
Split the array in half.
Recursively sort each half.
Merge the two sorted halves into one sorted array in $O(n)$ .

The recurrence is $T(n) = 2 T(n/2) + O(n)$ , which by the master theorem (case 2) gives $T(n) = \Theta(n \log n)$ , the optimal comparison-sort time. Merge sort is STABLE (equal elements keep their relative order) and is the standard sort in many languages' standard libraries (e.g. Python's Timsort is merge-sort-based with run detection). Its main downside is the $O(n)$ auxiliary space for the merge step.

Quicksort: in-place divide and conquer

Quicksort chooses a PIVOT and partitions the array into elements less than the pivot, equal to it, and greater. Then it recursively sorts the less and greater partitions in place. The average-case recurrence is $T(n) = 2 T(n/2) + O(n) = O(n \log n)$ , BUT the worst case (pivot always extreme) is $T(n) = T(n - 1) + O(n) = O(n^{2})$ .

In practice, randomised pivot selection (or median-of-three sampling) makes the worst case astronomically unlikely. Quicksort is typically FASTER than merge sort in real systems despite its weaker worst-case guarantee because it has excellent CACHE behaviour and operates in place.

The $\Omega(n \log n)$ lower bound

Theorem. Any comparison-based sorting algorithm requires at least $\Omega(n \log n)$ comparisons in the worst case.

Proof sketch. Model the algorithm as a binary decision tree: internal nodes are comparisons, branches are outcomes (less / equal / greater), leaves are permutations representing final orderings. There are $n!$ possible orderings of $n$ distinct elements, so the tree has at least $n!$ leaves. A binary tree with $L$ leaves has height at least $\log_{2} L$ 2L, so the worst-case depth (= worst-case comparison count) is at least $\log_{2}(n!) = \Theta(n \log n)$ 2(n!)=Theta(nlogn) by Stirling's approximation $n! \sim \sqrt{2\pi n} (n/e)^{n}$ . $\blacksquare$

Merge sort (and average-case quicksort) match this lower bound up to constants. So they are asymptotically optimal in the comparison model.

Beyond comparisons: linear-time sorts

The lower bound applies ONLY to comparison-based algorithms. If you know more about the inputs, that they are integers in a bounded range, or that they have a fixed-length string representation, you can do better:

Counting sort: for integers in $\{0, 1, \ldots, k\}$ , allocate a count array of size $k + 1$ , tally each value, and reconstruct the sorted output. Time $O(n + k)$ , linear when $k = O(n)$ .
Radix sort: sort by least-significant digit first using a stable sort (typically counting sort), repeating for each digit. Time $O(d (n + b))$ for $n$ integers of $d$ digits in base $b$ .
Bucket sort: assume inputs are uniformly distributed, distribute into $n$ buckets, sort each bucket, concatenate. Expected $O(n)$ .

These do not contradict the lower bound, they simply break the assumption (they use the actual value of items as array indices, not just pairwise comparisons).

Where this shows up

Databases, B-tree indexes: SQL ORDER BY uses external merge sort on disk-resident data, merging sorted runs of size larger than memory. Database query planners account for sort cost when choosing execution strategies.
Search engines: Document ranking returns the top- $k$ results by score, a partial sort. Heap-based selection runs in $O(n \log k)$ , much faster than full sorting.
Operating-system schedulers: Process queues are partially sorted by priority via heap structures. Each enqueue/dequeue is $O(\log n)$ , the sorting lower bound amortised over time.
Computer graphics, transparency: Rendering semi-transparent objects requires depth-sorting per pixel (the "painter's algorithm"). For complex scenes the per-frame sort cost is significant; A-buffer and order-independent transparency aim to sidestep it.
Genomics: Aligning short DNA reads to a reference genome uses suffix-array sorts (Burrows-Wheeler transform) running in $O(n \log n)$ . The entire BWA / Bowtie / minimap2 toolchain depends on optimal-comparison sorting.

Pause and think: Stability in sorting means equal elements keep their input order. Why does stability matter when sorting by MULTIPLE keys (e.g., sort by last name, then by first name)? Hint: trace through a small example sorting tuples by second component first, then by first component.

Try it

Trace merge sort on $[5, 2, 4, 6, 1, 3, 2, 6]$ . Draw the recursion tree and label the merge steps.
Show that the WORST case of quicksort with first-element pivot is an already-sorted array.
Apply the master theorem to $T(n) = T(n/2) + O(n)$ , this is the recurrence for selection (median-finding) with a good pivot.
Given an array of $n$ integers in $\{0, \ldots, 99\}$ , would you use comparison sort or counting sort? Compute the operation count for each at $n = 10^{6}$ .
Is bubble sort stable? Is quicksort stable? (Hint: stability depends on whether equal-key swaps occur during the algorithm.)

A trap to watch for

Quicksort's $O(n^{2})$ worst case is a real production hazard if you implement it with deterministic pivot selection (e.g., always pick the first element). An adversary who knows your pivot rule can craft inputs that force the worst case, turning a Mb of data into a denial-of-service. The fix is RANDOMISED pivot selection or median-of-three sampling. This was a real CVE class in early Java HashMap implementations whose hash-collision behaviour degenerated to $O(n^{2})$ on adversarial inputs; the workaround was to switch collision buckets from linked lists to balanced trees once they exceeded a threshold.

What you now know

You can describe and analyse the three canonical comparison sorts (bubble, merge, quick), prove the $\Omega(n \log n)$ lower bound via decision trees, recognise when non-comparison sorts apply, and distinguish stability and in-place properties. The next section confronts the most famous open question in algorithm theory: which problems live above the polynomial-time line at all.

Mark section complete →

References

Garrity, T. (2002). All the Mathematics You Missed. Cambridge University Press, ch. 16.
Cormen, T. H., Leiserson, C. E., Rivest, R. L., Stein, C. (2009). Introduction to Algorithms (3rd ed.). MIT Press, ch. 6-8.
Knuth, D. E. (1998). The Art of Computer Programming, Vol. 3: Sorting and Searching (2nd ed.). Addison-Wesley.
Sedgewick, R., Wayne, K. (2011). Algorithms (4th ed.). Addison-Wesley, ch. 2.
Hoare, C. A. R. (1962). "Quicksort." The Computer Journal, 5(1): 10-16.