Polygonal declustering

Part 2 — Declustering

Learning objectives

State the Voronoi / Thiessen construction: for a sample set {s_1, ..., s_n} in a 2D (or 3D) domain, sample i's Voronoi polygon is the set of all locations closer to s_i than to any other sample s_j — together the polygons exactly partition the domain (no gaps, no overlaps)
Write down the polygonal-declustering weight w_i = A_i / Σ_j A_j where A_i is the area (3D: volume) of sample i's Voronoi polygon clipped to the domain — the weights sum to 1 by construction, isolated samples get larger weights, clustered samples get smaller ones
Recognise the structural advantage over cell declustering: no cell-size parameter to tune, no grid-origin sensitivity, the geometry of the samples fixes the partition unambiguously; each sample's weight is exactly its 'share of the domain'
Identify the structural disadvantage: sensitivity to BOUNDARY definition — samples near the domain edge get polygons that extend to infinity in the unbounded Voronoi diagram, and the clipping choice (clip to [0,1]², extend with guard buffer, use convex hull) materially changes the weights
Connect the construction to known algorithms: Fortune's sweepline algorithm (1987, O(n log n)) and the Bowyer–Watson incremental construction for general n; for n ≈ 50 samples an O(n_pixels · n) raster nearest-neighbour scan is fast, exact for the rasterised area, and easy to implement
Position polygonal vs cell declustering honestly: both are HEURISTICS, neither is theoretically optimal; in practice both are tried and compared (§2.3 / §2.4); cell is the GSLIB default for regular boundaries, polygonal is preferred when the sample geometry is irregular and a Voronoi clipper is available — neither method can fix sampling that missed a sub-population

§2.1 took cell declustering apart in full: discretise the domain into a regular grid, count samples per cell, weight each sample by $w_i = 1/(n_c \cdot N_{\text{occ}})$ , sweep cell sizes, average over grid origins, pick the optimum by an explicit criterion. It is the GSLIB default for good reason — but it has one parameter (the cell size) and one nuisance variable (the grid origin), both of which the user has to manage. §2.2 develops the principled alternative: polygonal declustering, in which the geometry of the sample pattern itself fixes the partition, and the weight of each sample is just the area of its Voronoi polygon (also called a Thiessen polygon in classical hydrology). No cell-size parameter. No grid-origin choice. Each sample's weight equals its "share of the domain". The trade-off is sensitivity to where the domain boundary lives — but that trade-off is at least geometrically interpretable, and edge-effect mitigations are well-developed.

By the end of this section you will know the Voronoi construction, the polygonal-declustering weight formula, the edge-effect problem and the standard fixes (clipping to the domain, guard-zone extension, convex-hull bounding), the computational landscape (Fortune's sweepline algorithm, Bowyer–Watson incremental, raster approximations for small n), and the honest comparison to cell declustering — when each one wins, and why the choice between them is rarely critical for the somewhat-irregular common case. The two §2.2 widgets make all of this directly visible.

The Voronoi / Thiessen construction

Given $n$ sample points ${\mathbf{s}_1, \ldots, \mathbf{s}_n}$ in a 2D domain $D \subset \mathbb{R}^2$ , the Voronoi polygon of sample $i$ is the set

V_i \;=\; \left\{ \mathbf{s} \in D \;:\; \|\mathbf{s} - \mathbf{s}_i\| \,<\, \|\mathbf{s} - \mathbf{s}_j\| \quad \text{for all } j \neq i \right\}.

In plain English: $V_i$ is the set of locations whose nearest sample is $\mathbf{s}_i$ . The boundary between two adjacent polygons $V_i$ and $V_j$ is the perpendicular bisector of the segment $[\mathbf{s}_i, \mathbf{s}_j]$ — every point on that bisector is equidistant from $\mathbf{s}_i$ and $\mathbf{s}_j$ . The polygons (slightly more carefully: their closures, since the strict inequality leaves ties on the bisectors as a measure-zero set) exactly partition the domain: no overlaps, no gaps. In 3D the construction becomes Voronoi polyhedra (or "Voronoi cells" in the literature — confusingly, the same word "cell" used for cell declustering, but here meaning a polygon-area in 2D or a polyhedron-volume in 3D).

The construction has been independently rediscovered several times. The mathematical foundation is Voronoi (1908), working on quadratic forms in number theory. The same partition was used by Thiessen (1911) to estimate areal precipitation averages from rain-gauge stations — each gauge's reading was weighted by the area of its "Thiessen polygon", an early hydrological application that still shows up in textbooks today. In the geostatistical literature the construction is most often called a "Voronoi diagram" (Okabe et al. 2000); in hydrology and meteorology it is more often a "Thiessen polygon". Same object, two names.

The polygonal-declustering weight formula

With the partition in hand, define the polygonal-declustering weight of sample $i$ as the area (3D: volume) of its Voronoi polygon, normalised so weights sum to 1:

w_i \;=\; \frac{A(V_i)}{\sum_{j=1}^{n} A(V_j)}.

Because the polygons partition the domain, $\sum_j A(V_j) = A(D)$ — the total domain area — and the formula simplifies to $w_i = A(V_i) / A(D)$ . The weights sum to 1 by construction (no special normalisation step needed), they are non-negative by construction (areas are non-negative), they form a proper probability mass on the sample exactly like the cell-declustering weights of §2.1.

The declustered mean and declustered empirical CDF are the same formulas as §2.1:

\bar{Z}_d \;=\; \sum_{i=1}^{n} w_i \, Z_i, \qquad \hat{F}_n^{d}(z) \;=\; \sum_{i=1}^{n} w_i \, \mathbf{1}\{Z_i \le z\}.

The geometric interpretation is clean: each sample's weight is its share of the domain — the fraction of the area that it "owns" because every location in that area is closer to it than to any other sample. An isolated sample in a sparsely-sampled corner of the domain owns a large polygon and gets a large weight. A sample crammed into a tight cluster shares its corner of the domain with $\sim n$ neighbours, owns a tiny polygon, and gets a tiny weight. The qualitative behaviour is identical to cell declustering — isolated samples up-weighted, clustered samples down-weighted — but the quantitative split is determined by the geometry directly, not by a free cell-size parameter.

Watching the tessellation

The first widget puts the construction on the canvas. 50 samples are drawn preferentially from the §1.3 / §2.1 synthetic field (the same field, so your intuition transfers), and the Voronoi tessellation is rendered in full. Each polygon is colour-coded by its area = its declustering weight (small dark polygons in the clustered region, large bright polygons in the sparsely-sampled regions). The right-hand bar chart sorts the per-sample weights high-to-low so the reader can see at a glance: a handful of isolated samples carry the large weights; the rest are small.

Three things to try. (1) Drag a sample. Grab any dot with the mouse (or touch on tablet/phone) and slide it across the unit square. The tessellation re-tiles live; nearby polygons grow or shrink as the moved sample claims or releases territory. The bar chart re-sorts on every frame. (2) Toggle the boundary-clipping control. With clipping ON, polygons are clipped to the $[0,1]^2$ domain; the total polygon area equals 1.0 exactly. With clipping OFF, the domain is extended with a "guard buffer" — and the edge polygons claim huge slabs of empty space, inflating their declustering weights without any matching information. The stats panel reports the inflated total area and the boundary-polygon fraction; the verdict line explains what just changed. (3) Resample. Each new realisation gives a different geometry; the largest-weight sample (the most isolated one) sometimes carries 5× the uniform weight, sometimes less than 2×. The polygon geometry is what carries the variance.

Why polygonal declustering is structurally cleaner than cell

The case for polygonal declustering over cell declustering comes down to three structural advantages.

No tuning parameter. Cell declustering has the cell-size knob (and the grid-origin nuisance). The reader has to sweep cell sizes, pick a criterion (min, max, a-priori target, or Bourgault's 1997 average), and explain that choice in the report. Polygonal declustering has no parameter to sweep. The output is uniquely determined by the sample geometry and the domain boundary.
Geometric interpretability. Each sample's weight equals its "share of the domain" — the area it represents — which is the most natural definition of declustering weight if you think of it as a numerical-integration weight $w_i \approx \int_D \mathbf{1}_{V_i}(\mathbf{s}),d\mathbf{s} / A(D)$ . The weight has a one-line explanation that any geologist or engineer can follow: "How much of the deposit does this sample represent?"
Origin invariance. Cell declustering depends on where you anchor the cell grid; the GSLIB standard practice averages over 3–5 random origins to mitigate this. Polygonal declustering has no origin — the partition is intrinsic to the sample points. Re-running the algorithm always produces the same answer.

The trade-off, of course, is that polygonal declustering inherits sensitivity to the domain boundary, in a way that cell declustering does not. The next subsection addresses that.

The edge-effect problem and standard mitigations

The Voronoi polygon $V_i$ is defined relative to a domain $D$ . For an unbounded Voronoi diagram on the infinite plane, samples on the convex hull of the point set have polygons that extend to infinity — the partition is well-defined locally but globally unbounded for those samples. Real geological domains are bounded — by a lease boundary, by a regulatory permit, by a structural pinch-out — so the practical computation always clips each polygon to the domain $D$ . That clipping is the heart of the edge-effect problem.

Consider a sample $\mathbf{s}_i$ sitting right on the $[0,1]^2$ boundary. Its unbounded Voronoi polygon stretches off to infinity in the direction away from the domain interior — that is the unbounded part — and reaches over to its nearest in-domain neighbours on the bounded side. After clipping to $[0,1]^2$ , the polygon is a finite shape that hugs the boundary. The CLIPPED area is the right number for the polygonal-declustering weight. But the UNCLIPPED area is large (potentially infinite), and if the implementation forgets to clip, or clips to a much larger guard buffer, the edge sample gets a wildly inflated weight.

The §2.2-A widget exposes this directly. Toggle "boundary clipping = off" and the domain is extended by a guard buffer; boundary samples claim large slabs of the buffer and their weights are inflated by 2–10×. The total polygon area goes well above 1.0 (because the extended domain has area > 1.0), and the boundary-polygon fraction climbs sharply. In a real declustering computation that uses a Voronoi library, this kind of bug is silent unless someone checks the clipping behaviour explicitly — and the resulting weights look superficially plausible until they propagate downstream.

The standard mitigations are three:

Clip to the domain. The right thing in principle. Implementations: scipy.spatial.Voronoi + Shapely polygon-clipping, CGAL (C++), the R "deldir" package with explicit "rw" boundary, or — for n that's small enough — a rasterised nearest-neighbour scan as the §2.2-A widget uses. Clipping is the textbook answer; getting the clipping CORRECT in a library you did not write is the practical pain.
Extend the domain with a guard zone, then clip. Used when the domain boundary is uncertain or expected to shift after additional drilling. Extend $D$ by a buffer of width $b$ on each side, compute the Voronoi diagram, clip to the EXTENDED domain, and use those weights. The boundary samples get larger polygons than the clip-to- $D$ case (because they own the buffer territory) — which is the right behaviour if you believe the deposit really does continue past your current domain definition.
Reflect samples across the boundary. An older technique borrowed from kernel-density estimation: for each boundary sample, create a "ghost" sample by reflection across the nearest domain edge. Compute the Voronoi diagram of the original-plus-ghost set, then take only the polygons of the original samples. The reflection forces the boundary polygons to terminate at the boundary itself (because the ghost would otherwise be a closer neighbour), eliminating the unbounded-polygon problem cleanly. Modern practice prefers explicit clipping (it's more controllable), but the reflection trick is still used in some implementations.

For very irregular geological domains — convex-hull-of-the-samples boundaries, salt-body excisions, structural pinch-outs — the clipping step is non-trivial. Production geostatistics software (GSLIB-DECLUS-2D, SGeMS, Leapfrog, RMS) has explicit support for declared domain polygons and handles this correctly; ad-hoc Python scripts using scipy.spatial.Voronoi without polygon clipping do not, and can silently produce inflated edge weights.

Computational landscape

Computing the Voronoi diagram of $n$ points in 2D has three well-known algorithms.

Fortune's sweepline algorithm (1987). $O(n \log n)$ time. Sweeps a horizontal line across the plane, maintaining a "beach line" of parabolic arcs that records the locus of points equidistant from already-processed samples and the sweepline. Site events (sweepline crosses a sample) and circle events (a parabolic arc disappears) are processed in priority order. Implemented in essentially every Voronoi library; the de facto standard for production. CGAL and the Python "scipy.spatial.Voronoi" both use Fortune's algorithm under the hood (the latter via Qhull).
Bowyer–Watson incremental algorithm (Bowyer 1981; Watson 1981). $O(n \log n)$ expected time, $O(n^2)$ worst case. Constructs the Delaunay triangulation (which is the dual of the Voronoi diagram) by inserting samples one at a time; each insertion identifies the triangles whose circumcircles contain the new sample, removes them, and re-triangulates the resulting cavity. The Voronoi diagram is then read off as the dual of the Delaunay triangulation: every Delaunay edge between samples $i$ and $j$ corresponds to a shared Voronoi-polygon edge between $V_i$ and $V_j$ .
Naive raster (nearest-neighbour scan). $O(M \cdot n)$ where $M$ is the number of pixels in the raster. For each pixel, find the nearest sample by brute-force scan. Approximates the polygon by the set of pixels it owns; the area of a polygon is its pixel count times the pixel area. Bad as $n$ grows large (production datasets with $n = 10^4$ or more) — but completely fine for the $n = 50$ samples of the §2.2-A widget on a 168×168 raster (~4M ops per recompute, sub-millisecond on a modern browser). Clipping to the domain is automatic: don't scan pixels outside $D$ .

In 3D, Voronoi-polyhedron computation is materially more complex; the standard tool is CGAL's 3D Voronoi diagram, or numerical packages such as scipy.spatial.Voronoi (3D version, with all the same clipping cautions as 2D). For typical geostatistical applications with $n = 100$ to $10^4$ samples, 3D Voronoi is computable in seconds-to-minutes on commodity hardware.

Polygonal vs cell on the same dataset (§2.3 preview)

The second §2.2 widget runs both declustering methods on the same 80-sample realisation and shows the resulting declustered histograms side by side. Left panel: cell declustering at the auto-selected optimum cell size (full GSLIB DECLUS sweep — 4 to 30 cells per side, 5 random origins per size, MIN criterion). Right panel: polygonal declustering with boundary clipping on. Each panel overlays the same true population histogram (grey) and the same biased histogram (red) for direct comparison, with vertical mean lines for each method.

What you will typically see: the two declustered means agree to within 1–3% of each other and both correct most of the 15–25% raw bias. The two HISTOGRAMS look broadly similar; the polygonal version is slightly bumpier because each sample carries a different weight (whereas cell declustering groups samples by cell and they share weights within a cell). Resample a few times — the two methods' answers track each other closely on most realisations, occasionally diverge by 3–5% on realisations where the sample geometry is heavily edge-loaded.

This is the §2.3 lesson in microcosm: both methods work, neither is universally better, the choice is driven by software availability and the regularity of the sample geometry, not by deep theory. §2.3 develops the explicit comparison criteria — when to prefer cell, when to prefer polygonal, how to compare them numerically on real datasets, and what to report when they disagree.

What polygonal declustering can and cannot do

The honest scope mirrors §2.1 with one specific addition.

It cannot fix MISSING populations. Just as for cell declustering: if a sub-population of the deposit was never sampled, no choice of declustering method can recover the missing data. Polygonal weights describe the SAMPLES you have; they do not describe the SUB-POPULATIONS you missed.
It is a HEURISTIC, not a derived estimator. The polygon-area weight is a natural geometric interpretation of "each sample's share of the domain", but it is not the Horvitz–Thompson inverse-inclusion-probability weight that would be theoretically optimal under known sampling probabilities. As with cell declustering, the polygonal weight approximates the inverse inclusion probability by local density (a sample with many neighbours has a small polygon = small weight). The approximation is geometrically interpretable; whether it is a BETTER approximation than the cell-declustering one depends on the sample geometry.
It is sensitive to boundary definition. The principal failure mode is incorrect clipping of edge polygons. The §2.2-A widget makes this visible directly. In real implementations, verifying that polygons are clipped to the geological domain (not to a larger software bounding box or to an unbounded extent) is a critical QC step. Cell declustering does not have this failure mode; this is its structural advantage on regular boundaries.
It loses statistical power at small sample sizes — somewhat less severely than cell declustering, because each sample is guaranteed to have its OWN polygon (no "many cells with one sample" collapse). Below $n \sim 20$ the polygon areas are dominated by a handful of edge samples and the weights are not statistically meaningful; below $n \sim 50$ both methods are equally questionable.
3D is more complex. Voronoi polyhedron computation is doable but materially more expensive than 2D, and most reservoir-modelling codes default to cell declustering in 3D for that reason. If the polygonal weight is required in 3D, CGAL or scipy.spatial.Voronoi can produce it; clipping to a 3D geological volume is the implementation challenge.

The full polygonal-declustering workflow

Putting the pieces together, here is the production recipe for polygonal declustering of a sample set in a bounded 2D domain $D$ :

Define the domain $D$ precisely. Lease boundary, geological domain polygon, structural envelope — whatever is appropriate. The polygonal weights will be sensitive to this choice, so it should reflect the actual support of the population you want to estimate. In production codes this is supplied as a domain-polygon file.
Compute the (unbounded) Voronoi diagram of the samples. Fortune's sweepline algorithm in any standard library: scipy.spatial.Voronoi, CGAL, R's deldir, MATLAB voronoi() — all use it under the hood.
Clip each polygon to $D$ . Polygon-clipping libraries: Shapely (Python), CGAL (C++), R's sf or rgeos packages. For irregular domains, this is the implementation step where bugs are most likely; verify by summing the clipped areas and confirming they equal $A(D)$ .
Compute the area $A(V_i)$ of each clipped polygon. Straightforward from the vertex list using the shoelace formula. Many polygon-handling libraries give it directly.
Compute the weights $w_i = A(V_i) / \sum_j A(V_j)$ — equivalently $w_i = A(V_i) / A(D)$ . Verify $\sum w_i = 1$ to numerical precision; deviation indicates a clipping bug.
Compute the declustered mean and empirical CDF with the same $\sum w_i Z_i$ and $\sum w_i \mathbf{1}{Z_i \leq z}$ formulas as §2.1. These weights feed every downstream geostatistical step exactly as the cell-declustering weights do (N-score transform, variogram pair counts, kriging mean, SGS reference distribution).

For irregular sample geometries — a few large clusters separated by big empty regions, or a hand-mapped geological domain with concavities — polygonal declustering is often preferred because the Voronoi polygons adapt naturally to the local sample density without a free parameter. For roughly regular sample patterns on a roughly rectangular domain, cell declustering is more common because it ships with GSLIB's DECLUS program out of the box. §2.3 lays out the comparison criteria explicitly; §2.4 walks both methods through the downstream chain.

Try it

In the voronoi-explorer widget, drag the most isolated-looking sample (the one with the largest bright polygon) directly into the middle of the densest cluster. Watch its polygon shrink and the cluster polygons rearrange. How does the bar chart respond? Now drag it back out to an isolated corner — its polygon should regrow. The drag interaction makes the geometric interpretation tangible.
Toggle the "boundary clipping" select to "off". The polygon edges no longer terminate at the $[0,1]^2$ boundary — they extend into the guard buffer. The total polygon area climbs above 1.0 (reported in the stats panel). The boundary-polygon fraction (the share of the total area held by polygons that touch the boundary) goes up. Now toggle back to "on" and watch the edge polygons snap back to the [0,1]² boundary. This is the edge-effect lesson in interactive form.
Click "Resample" several times. Note how the most isolated sample's weight varies between realisations — sometimes 4–5× the uniform weight 1/n, sometimes barely above 2×. The polygon geometry is doing all the work; no parameter intervenes.
In the polygonal-vs-cell widget, read the cell-declustered mean and the polygonal-declustered mean. How close are they to the true population mean? How close are they to each other? The verdict line below interprets the agreement. Now click "Re-roll cell origins" several times — does the cell-declustered mean shift? Does the polygonal-declustered mean? (It should not — polygonal declustering has no origin.)
Click "Resample" several times in the polygonal-vs-cell widget. On how many realisations do the two methods agree to within 1 percentage point? On how many do they diverge by more than 2 percentage points? When they diverge, which one is closer to truth? The pattern across realisations is the §2.3 comparison story.
Without coding: a mining geologist has 200 drill-hole samples in an L-shaped lease, with a clear cluster of holes around an old workings and a sparser scatter across the rest of the lease. Which declustering method would you recommend, and why? (Hint: the irregular boundary and the boundary-hugging cluster make this a case where polygonal declustering, with the L-shape supplied as a domain polygon, gives a more geometrically defensible answer than cell declustering with a single rectangular grid. Cell declustering would still work as a cross-check.)
Without coding: a petroleum engineer has 12 well logs in a roughly square reservoir block. The wells are NOT clustered — they are placed on a roughly regular 4×3 grid with one outlier. Should the engineer use cell or polygonal declustering, or skip declustering altogether? (Hint: with 12 wells on a near-regular grid, the inclusion probabilities are approximately uniform, so declustering corrections are small in magnitude either way. At $n = 12$ both methods lose statistical power; either is defensible, the result is approximately equal weights, and the engineer should disclose the small- $n$ caveat in the report regardless of method choice.)

Pause and reflect: the polygonal-declustering weight $w_i = A(V_i) / A(D)$ is the fraction of the domain that is "closest to" sample $i$ . Is "closest to" the right declustering principle? Equivalent question: is the implicit assumption that the value of $Z(\mathbf{s})$ at every point in $V_i$ is "best represented by" $Z(\mathbf{s}_i)$ actually true? It is exactly the assumption that any nearest-neighbour interpolator would make. Where in the geostatistical workflow does that assumption get RELAXED — and what does it get replaced with? (Hint: kriging, in Part 5, replaces "nearest neighbour" with a weighted-average estimator whose weights are derived from the variogram. The declustering question and the interpolation question are not the same — declustering weights describe the sample-design properties; kriging weights describe the field's spatial structure — but they share the same nearest-neighbour intuition at the simplest level.)

What you now know — and where Part 2 goes next

You have the Voronoi / Thiessen construction in precise form: each sample's polygon is the set of locations nearest to it, the polygons partition the domain exactly, and the polygonal-declustering weight is the polygon-area share. You have the structural advantages over cell declustering — no parameter, no origin, intrinsic to the geometry — and the structural disadvantage — sensitivity to the domain boundary, with three standard mitigations (clip to domain, guard-zone extension, sample reflection). You have the computational landscape: Fortune's $O(n \log n)$ sweepline algorithm and the Bowyer–Watson incremental Delaunay construction for general $n$ , and a raster nearest-neighbour scan as a brute-force approximation that is fine for small $n$ . You have the §2.2-A widget for the geometric interpretation (drag samples, toggle clipping) and the §2.2-B widget for the head-to-head against cell declustering (typically agree, occasionally diverge).

You have the honest scope: polygonal declustering is a HEURISTIC, not a derived estimator. It cannot fix missing populations. It loses power below $n \sim 20$ . It is sensitive to boundary definition in a way that cell declustering is not, and the standard fix is to clip the polygons explicitly to the geological domain. In 3D the construction is more expensive but still tractable. Neither cell nor polygonal declustering dominates the other — the choice is software-driven, geometry-driven, and reportable.

§2.3 lays out the comparison criteria head-to-head: when does cell win, when does polygonal win, how do you choose between them on a real dataset, and what do you report when they disagree? §2.4 walks both methods through the downstream chain — declustered N-score, declustered variogram, declustered kriging mean, declustered SGS reference distribution — and shows how the choice between cell and polygonal cascades into the kriged map and the simulation reference distribution. By the end of Part 2 the reader has the full declustering toolbox: motivation (§1.3), cell algorithm (§2.1), polygonal algorithm (§2.2), comparison criteria (§2.3), downstream impact (§2.4). Then Part 3 takes the variogram up — the next workhorse — and the declustering weights you compute here feed straight into the experimental-variogram pair counts under robust estimators.

References

Voronoi, G. (1908). Nouvelles applications des paramètres continus à la théorie des formes quadratiques. Journal für die reine und angewandte Mathematik, 133, 97–178. (The mathematical foundation: Voronoi tessellation introduced in the context of quadratic forms in number theory; the geometric construction is a side product that later named the diagram.)
Thiessen, A.H. (1911). Precipitation averages for large areas. Monthly Weather Review, 39(7), 1082–1089. (The independent rediscovery in hydrology. Thiessen polygons are used to estimate areal mean precipitation from point rain-gauge observations — each gauge's reading weighted by its polygon area. The same construction as Voronoi, applied to a practical problem decades before geostatistics existed as a field.)
Fortune, S. (1987). A sweepline algorithm for Voronoi diagrams. Algorithmica, 2(1), 153–174. (The first $O(n \log n)$ algorithm for Voronoi diagrams. Sweeps a line across the plane, maintaining the beach line of parabolic arcs equidistant from samples and the sweepline. The de facto standard implementation in every modern Voronoi library.)
Okabe, A., Boots, B., Sugihara, K., Chiu, S.N. (2000). Spatial Tessellations: Concepts and Applications of Voronoi Diagrams, 2nd ed. Wiley. (The encyclopedic reference. Covers all the algorithms — Fortune, Bowyer–Watson, divide-and-conquer — plus generalisations (weighted Voronoi, k-th order, network Voronoi) and applications across spatial analysis, statistics, urban planning. The single book to consult for any Voronoi-related question.)
Isaaks, E.H., Srivastava, R.M. (1989). An Introduction to Applied Geostatistics. Oxford University Press. (Chapter 10 introduces cell and polygonal declustering side-by-side as the two textbook alternatives, with the Walker Lake worked example. The §2.1 / §2.2 / §2.3 arc of Part 2 mirrors Isaaks & Srivastava's chapter structure.)
Goovaerts, P. (1997). Geostatistics for Natural Resources Evaluation. Oxford University Press. (§4.1 covers polygonal declustering with the Walker Lake worked example alongside cell declustering. Modern textbook treatment with practical comparisons.)
Deutsch, C.V., Journel, A.G. (1998). GSLIB: Geostatistical Software Library and User's Guide, 2nd ed. Oxford University Press. (DECLUS is the cell-declustering implementation; the GSLIB suite as a whole assumes cell declustering as the default but discusses polygonal declustering as an alternative on irregular domains.)
Cressie, N. (1993). Statistics for Spatial Data, revised ed. Wiley. (§4 covers the spatial-data sampling problem from a mathematical-statistics angle; Cressie discusses Voronoi-based weights as one of the geometric-tessellation approaches to point-pattern correction.)
Pyrcz, M.J., Deutsch, C.V. (2014). Geostatistical Reservoir Modeling, 2nd ed. Oxford University Press. (Reservoir-engineering treatment of declustering, with explicit guidance on when polygonal declustering is preferred over cell declustering and how to set up the domain polygon for production work.)