Normal-score transform

Part 1 — Univariate statistics for geo-data

Learning objectives

State the normal-score transform Y = Φ⁻¹(F_n(Z)) and explain why it produces a marginal distribution that is exactly N(0,1) by construction (up to plotting-position discretisation)
Apply the recipe in practice: sort, assign plotting positions p_i = (i − ½)/n, and N-score each value with the standard-normal inverse CDF
Recognise the transform as the empirical, distribution-free generalisation of the parametric log-transform — exactly equivalent for exactly-lognormal data, more general otherwise
Identify when you must back-transform (kriged maps, simulation realisations), via Z = F_n⁻¹(Φ(Ŷ)), and know that the back-transformed kriging VARIANCE needs more care than the back-transformed kriging MEAN
List the standard pitfalls: stationarity assumption (a global F_n only works if Z is stationary), beyond-the-data tail extension for simulation, and biased sample histograms (motivating §1.3 declustering)

§1.1 gave you the toolkit to see whether a marginal distribution is consistent with a reference family. Most earth-science variables flunk that test against Normal: porosity is right-skewed, permeability is heavy-tailed lognormal, ore grade is wildly non-Gaussian. That is a problem, because the geostatistics machinery downstream — kriging variance interpretation in Part 5, sequential Gaussian simulation in Part 7, multi-Gaussian conditional distributions throughout — leans on a Gaussian marginal in one place or another. §1.2 closes the gap: it gives you the first construction in Part 1, an explicit recipe that takes any continuous-margin sample and rewrites it into a sample whose marginal is, by construction, standard normal. Everything spatial follows in the transformed space; the back-transform at the end returns to physical units.

The transform is the workhorse of practical geostatistics. Every commercial geostatistics package — GSLIB, SGeMS, GoCad/Skua, Petrel — runs N-score (sometimes called "Gaussian anamorphosis", sometimes "Hermite-polynomial anamorphosis" in the smoothed parametric variant) as a routine preprocessing step before simulation. §1.2 is procedural — the full theoretical justification ("because the multi-Gaussian framework underwriting Part 5 and Part 7 requires it") arrives only in those later parts. For now, the deal is: trust that downstream geostatistics will want Gaussianised data; learn the recipe and its honest limits here.

The recipe

You have a sample $Z_1, \ldots, Z_n$ from a continuous random variable $Z$ with empirical CDF $F_n$ . Recall from §1.1 that $F_n(z) = (1/n) \sum_i \mathbf{1}{Z_i \le z}$ is a non-decreasing step function rising by $1/n$ at each observed value. The normal-score transform of $Z_i$ is

Y_i \;=\; \Phi^{-1}\bigl(F_n(Z_i)\bigr),

where $\Phi^{-1}$ is the inverse CDF of the standard normal distribution. The two-step intuition is direct: the inner map $F_n$ rewrites $Z_i$ as the percentile it occupies in the sample (so the smallest value gets a percentile near 0, the median gets 0.5, the largest gets a percentile near 1); the outer map $\Phi^{-1}$ takes that percentile back to the value of a standard normal at the same percentile. The composition is "look up your sample percentile, then read off the corresponding standard-normal score." A value that sits at the 84th percentile in your sample comes out as $\Phi^{-1}(0.84) \approx 1$ ; the median comes out as 0; the 16th percentile comes out as $-1$ . The new sample has, by construction, standard-normal margins.

In practice there are ties, edge cases, and finite-sample bookkeeping. The standard practical algorithm is:

Sort the sample: $Z_{(1)} \le Z_{(2)} \le \cdots \le Z_{(n)}$ .
Assign plotting-position probabilities. The most common convention is the Hazen plotting positions $p_i = (i - \tfrac{1}{2})/n$ ; the Weibull variant $p_i = i/(n+1)$ is also seen. Both pull the extremes slightly inward so $p_1 > 0$ and $p_n < 1$ — that is non-negotiable because $\Phi^{-1}(0) = -\infty$ and $\Phi^{-1}(1) = +\infty$ would be unplottable.
Set the N-score of the $i$ -th order statistic to $y_i = \Phi^{-1}(p_i)$ . Map each original observation back through its rank to obtain ${Y_i}$ in the original order.

That is the entire algorithm. The first widget below shows it in action.

The widget puts the three §1.1 presets through the transform side by side. On the left you see the raw histogram of $Z$ and the raw Q-Q plot against Normal — visibly curved for porosity (banana right-skew), kinked for the bimodal facies, and lifted off the line by the outlier cluster for permeability. On the right you see the histogram of $Y$ — closely matching the dashed standard-normal density overlay — and the Q-Q plot of $Y$ vs Normal, which is straight by construction. The sample skewness of $Y$ is near zero and the excess kurtosis is near zero, again by construction. Click any point on the right Q-Q to see the back-transform: which raw $Z$ value does that $Y$ correspond to? The answer is just $Z = F_n^{-1}(\Phi(Y))$ — a lookup in the sorted sample, no extra machinery.

The transform as a monotonic curve

Every transform is a function, and every function can be plotted. The N-score transform is a non-decreasing step function from raw values to N-scores. For small $n$ the steps are visible; for large $n$ the steps merge into a smooth-looking curve. The second widget plots the curve directly.

Slide $n$ from 50 to 500 to feel that the empirical N-score curve is a sampled approximation to an underlying smooth transform. At $n = 50$ each step covers a noticeable $1/n = 2%$ slice of the cumulative distribution; at $n = 500$ the steps are sub-pixel. The toggle "Overlay log-transform" superimposes the parametric Gaussianiser $Y_{\text{log}} = (\log Z - \hat{\mu}$ , centered and scaled so the two curves are directly comparable. For the lognormal-porosity preset the two curves overlap almost exactly — log-transform IS the right Gaussianiser for exactly-lognormal data, and the empirical N-score recovers it. For the bimodal-facies preset the log overlay is wildly off in the middle, because no monotone parametric function can adapt to the between-mode density gap. For the heavy-tail-perm preset the log overlay fits the bulk but misses the outliers; the N-score recipe handles them by just assigning them the 5 largest standard-normal quantiles. Log-transform is a parametric special case; N-score is the empirical, distribution-free generalisation. When in doubt, use N-score.

The back-transform

After geostatistics in $Y$ -space — kriging estimates, simulation realisations, conditional simulations — you almost always want to return to physical units. The back-transform is the formal inverse:

Z = F_n^{-1}\bigl(\Phi(Y)\bigr).

For a single point estimate $\hat{Y}(\mathbf{s}_0)$ this is a lookup. Compute $\Phi(\hat{Y})$ , find that probability in the empirical CDF of the original data, read off the matching raw value. The result $\hat{Z}(\mathbf{s}_0) = F_n^{-1}(\Phi(\hat{Y}))$ is in physical units of the original variable.

Two warnings live here. First, the back-transformed kriging variance is not the inverse map of the kriging variance. The kriging variance you compute in $Y$ -space is symmetric in $Y$ — Gaussian, mean zero, variance $\sigma_K^2(\mathbf{s}_0)$ . When you push that distribution through the non-linear map $Z = F_n^{-1}(\Phi(\cdot))$ , the resulting distribution in $Z$ -space is generally not Gaussian and not symmetric. For lognormal data there is an analytic correction (lognormal kriging, Part 5); for general N-score-transformed data, the standard recipe is to integrate the conditional Gaussian distribution back through the empirical anamorphosis, which is an integration over the support of $F_n$ rather than a closed form. Part 5 walks through this. For now: never report the back-transformed mean of the kriging estimate as the answer, then quote the kriging variance directly in $Z$ -units. They are not commensurable.

Second, simulation can produce $Y$ -values outside the empirical sample range. Sequential Gaussian simulation (Part 7) draws $Y$ from a Gaussian conditional distribution at each grid node, and Gaussians have infinite support; nothing prevents the simulator from drawing a $Y$ value larger than the largest $Y$ in your training sample. Back-transforming that with the empirical inverse $F_n^{-1}$ is undefined — you have asked for a percentile beyond your data. The standard fix is tail extension: extrapolate $F_n$ beyond the observed extremes with a parametric tail (power-law, hyperbolic, or upper-bound truncation), so the back-transform stays defined. GSLIB and most simulation packages provide a choice of extrapolators; the conservative choice is "no extrapolation beyond the minimum/maximum of the data" but it is rarely realistic for simulation studies that ask "what is the probability of seeing a 99th-percentile contaminant concentration somewhere in the field?"

What the transform does — and does not — change

The N-score transform is a monotone, deterministic function of the rank of $Z_i$ in the sample. It does not use the spatial coordinates. As a consequence, it makes the marginal distribution exactly N(0,1) but does not directly change the spatial correlation structure — there is still a sample variogram, still a sill, still a range. What it does change is how those quantities are computed: the variogram of $Y$ is different from the variogram of $Z$ because the values themselves are non-linearly rescaled. A pair of low-permeability neighbours ( $z = 0.1$ mD, $z = 0.15$ mD) might map to ( $y = -2.0$ , $y = -1.8$ ), preserving the rank order but compressing the absolute difference. A pair of outliers ( $z = 70$ mD, $z = 75$ mD) maps to ( $y = 2.5$ , $y = 2.6$ ) — the squared difference $(z_i - z_j)^2 = 25$ becomes $(y_i - y_j)^2 \approx 0.01$ . The N-score variogram is dramatically less outlier-sensitive than the raw-data variogram. Many real-world geostatistical pipelines therefore compute and model the variogram in $Y$ -space and use it for kriging in $Y$ -space, taking the back-transform only at the end.

What N-score does not fix:

Non-stationarity in $Z$ . The transform uses a single, global $F_n$ . That is justified only if the marginal distribution of $Z$ is the same across the field — that is, if $Z$ is stationary. If your data has a regional trend (porosity increases east-to-west, for example) the global $F_n$ is a mixture of locally different distributions, the N-score "Gaussianisation" averages over the trend, and the trend is still present in $Y$ -space, just rescaled. Part 5 (universal kriging / kriging with external drift) handles trends explicitly. §1.5 (local statistics) is where you start to spot stationarity violations.
Sample-clustering bias. If your sample is preferentially located in high-grade zones — which is the rule, not the exception, in mineral exploration and oil & gas — then $F_n$ is biased toward high values. The transform faithfully Gaussianises the biased histogram, leaving you with Gaussian-margin $Y$ data that still encodes the sampling bias. §1.3 (sample-histogram bias) and Part 2 (declustering) are the fix. The right workflow is decluster first, N-score second.
Multimodality with physical meaning. The bimodal-facies preset has two genuinely different rock types. N-score erases their joint two-mode marginal in favour of a single-mode Gaussian. This is the correct preprocessing if you are about to do univariate geostatistics and treat the variable as one population; it is the wrong preprocessing if you wanted to model the two facies separately (indicator kriging in Part 5, indicator simulation in Part 7). The transform never warns you that a two-mode dataset has been smoothed away.

Try it

In the before/after explorer, pick the Porosity (lognormal) preset. Compare the raw histogram's skewness to the N-scored histogram's skewness in the stats panel. The first is large and positive; the second is near zero. Verify the raw Q-Q is a curved banana while the N-scored Q-Q is straight. What did the transform actually do to the data?
Still in the explorer, switch to Perm + outliers. The raw histogram has five values near 80 mD, far above the bulk. After the transform, where do these five values land in the $Y$ histogram? Why are they no longer "outliers" in any meaningful sense in $Y$ -space — and what does this mean for the $Y$ -space variogram?
In the mapping curve widget, set the preset to Porosity. Toggle the log-transform overlay. How closely does the parametric log curve match the empirical N-score curve? Switch to Bimodal facies. How closely does the log curve match now? Explain the difference: what feature of the lognormal preset makes log-transform a good fit, and what feature of the bimodal preset breaks it?
In the mapping curve widget, slide $n$ down to 50. Step-like discretisation appears in the curve. What does each step physically represent? If you had only $n = 10$ samples — a realistic count for an exploration drillout — how trustworthy would the empirical N-score transform be, and what would you do instead?
Without coding: a reservoir engineer kriges porosity in $Y$ -space (after N-score) and reports an estimate $\hat{Y}(\mathbf{s}_0) = 1.5$ at a query location, with kriging variance $\sigma_K^2 = 0.4$ . They then back-transform the estimate via $\hat{Z} = F_n^{-1}(\Phi(1.5))$ and also report a 95% CI as $\hat{Z} \pm 1.96 \sqrt{\sigma_K^2}$ in $Y$ -units, then back-transformed. What is wrong with the CI computation?

Pause and reflect: the N-score transform is information-preserving in the sense that the inverse map exists and recovers $Z$ exactly. Yet shape information (skewness, kurtosis, bimodality) clearly looks different after the transform. Where did that information go — into the inverse map $F_n^{-1}$ itself, into the joint distribution of the $Y$ values at different locations, or both? (Hint: think about what differs between a pair of nearby $Z$ values and the corresponding pair of $Y$ values.)

What you now know

You have the N-score transform recipe: sort the data, assign plotting positions $p_i = (i - \tfrac{1}{2})/n$ , map through $\Phi^{-1}$ , get a sample whose marginal is standard normal by construction. You have the back-transform $Z = F_n^{-1}(\Phi(Y))$ as a lookup, and the warnings that the back-transform of the kriging variance is not as straightforward and that simulation outside the data range demands tail extension. You have the framing of the log-transform as the parametric special case of N-score for exactly-lognormal data — and the visual demonstration that N-score is the empirical, distribution-free generalisation. You have the honest scope: N-score works on marginals, not on spatial structure; it assumes stationarity; it propagates any sampling bias in $F_n$ ; and it erases physically meaningful multimodality.

§1.3 takes up the sampling-bias question head on: your $F_n$ is not the population CDF if your samples are clustered in high-grade zones, and N-scoring a biased $F_n$ does not fix the bias. §1.4 then makes the descriptive statistics robust against the heavy tails and outliers that §1.1 taught you to spot. §1.5 moves from global summaries to local ones, preparing the way for the variogram in Part 3 and kriging in Part 5 — both of which will lean on this section's transform repeatedly.

References

Verly, G. (1983). The multigaussian approach and its applications to the estimation of local reserves. Journal of the International Association for Mathematical Geology, 15(2), 259–286. (The foundational multi-Gaussian framework for which N-score is the standard preprocessor.)
Deutsch, C.V., Journel, A.G. (1998). GSLIB: Geostatistical Software Library and User's Guide (2nd ed.). Oxford University Press. (The nscore program is the canonical implementation, including tail-extension options; pages 223–235.)
Goovaerts, P. (1997). Geostatistics for Natural Resources Evaluation. Oxford University Press. (§7.4 on the multi-Gaussian approach; §7.4.1 on the normal-score transform and Hermite-polynomial anamorphosis.)
Pyrcz, M.J., Deutsch, C.V. (2014). Geostatistical Reservoir Modeling (2nd ed.). Oxford University Press. (Modern reservoir-engineering treatment of the workflow: decluster → N-score → variogram → kriging/simulation → back-transform.)
Saito, H., Goovaerts, P. (2000). Geostatistical interpolation of positively skewed and censored data in a dioxin-contaminated site. Environmental Science & Technology, 34(19), 4228–4234. (Worked applied example showing N-score, lognormal kriging, and the bias correction needed when reporting back-transformed estimates.)