Generative priors for velocity models
Learning objectives
- Recognise the conceptual jump from AE-as-regulariser to AE-as-sampler
- Train an AE with a latent-space prior penalty so latents concentrate near origin
- Draw new plausible velocity models by sampling z ~ N(0, σ²·I) and decoding
- Identify the bridge to full VAE / diffusion priors used in production
- Set up §9.5 (Bayesian PINNs need a proper likelihood — generative priors deliver one)
§9.3's autoencoder gave us a regulariser: the reconstruction loss acts as a soft membership function on the corpus manifold. §9.4 takes the next step: turn the AE into a SAMPLER that can generate NEW plausible velocity models on demand. This unlocks two things — a proper probabilistic prior for Bayesian inversion (§9.5-§9.6), and continuous parameter exploration along the prior manifold.
The latent-prior trick
A vanilla AE learns to reconstruct its corpus, but the latent codes can sit anywhere in $\mathbb{R}^{d}$. Decoding a RANDOM latent $z \sim p(z)$ for some chosen prior $p(z)$ may therefore produce garbage: the random $z$ likely falls in a region the decoder was never trained on.
The fix is to train the AE so its latent codes are CONCENTRATED in a known region. Add a penalty during training:
$$\mathcal{L} = \mathcal{L}_{\text{recon}} + \lambda \, \lVert z \rVert^{2}$$
where $\lambda \lVert z \rVert^{2}$ pushes the latent codes toward the origin. After training, encoded corpus codes have empirical mean $\mu_z \approx 0$ and empirical std $\sigma_z$. Drawing $z \sim N(0, \sigma_z^{2} I)$ and decoding now gives samples that LOOK LIKE corpus models, because the decoder has seen latent codes in this region throughout training.
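The recipe above can be sketched in a few lines of NumPy. This is a minimal illustration, not the widget's implementation: the corpus is a hypothetical stand-in (200 triples with v₂ ≈ 2v₁, the third parameter `h` being an assumed interface depth), the encoder and decoder are linear so the gradients can be written by hand, and the values of `lam`, `lr` and the step count are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in corpus: 200 triples (v1, v2, h) with v2 close to 2*v1.
n = 200
v1 = rng.uniform(1.5, 2.5, n)
v2 = 2.0 * v1 + rng.normal(0.0, 0.05, n)
h = rng.uniform(0.5, 1.5, n)
corpus = np.column_stack([v1, v2, h])

# Standardise so the three parameters share a common scale.
mean, std = corpus.mean(axis=0), corpus.std(axis=0)
X = (corpus - mean) / std

# Linear encoder (3 -> 2) and decoder (2 -> 3) with small random init.
W_enc = 0.1 * rng.normal(size=(3, 2))
W_dec = 0.1 * rng.normal(size=(2, 3))
lam, lr = 0.01, 0.02  # penalty weight and step size (assumed values)


def loss_and_grads(X, W_enc, W_dec, lam):
    """Reconstruction loss plus lam * ||z||^2, averaged over the corpus."""
    m = X.shape[0]
    Z = X @ W_enc            # latent codes
    R = Z @ W_dec - X        # reconstruction residual
    loss = (np.sum(R**2) + lam * np.sum(Z**2)) / m
    dZ = 2.0 * (R @ W_dec.T + lam * Z) / m
    g_dec = 2.0 * Z.T @ R / m
    g_enc = X.T @ dZ
    return loss, g_enc, g_dec


losses = []
for _ in range(300):
    loss, g_enc, g_dec = loss_and_grads(X, W_enc, W_dec, lam)
    losses.append(loss)
    W_enc -= lr * g_enc
    W_dec -= lr * g_dec

# Sample new models: z ~ N(0, sigma_z^2 I), then decode and un-standardise.
Z_corpus = X @ W_enc
sigma_z = Z_corpus.std()
z_new = rng.normal(0.0, sigma_z, size=(100, 2))
generated = (z_new @ W_dec) * std + mean
```

With a linear decoder the penalty mostly rescales the latent codes. A nonlinear AE trained by automatic differentiation would follow the same pattern with the same loss.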
This is the simplest non-trivial generative model. Two important refinements lead to production architectures:
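- Variational AE (VAE). The encoder outputs a mean and standard deviation $(\mu(x), \sigma(x))$ instead of a single point $z$. Latent samples are drawn as $z = \mu + \sigma \odot \epsilon$ with $\epsilon \sim N(0, I)$ (the reparameterisation trick). The KL-divergence $\mathrm{KL}\big(q(z \mid x) \,\|\, N(0, I)\big)$ replaces the latent penalty. Result: a proper probabilistic generative model with a likelihood usable in Bayesian inference. Kingma-Welling 2013.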
- Diffusion models. Train a network to PREDICT the noise added at each level of a forward Gaussian-noise process. The score function is recovered as a side effect and can be used directly as a regulariser gradient. Far more expressive than VAE for high-dimensional images / 2-D velocity models. Ho-Jain-Abbeel 2020 DDPM, Song-Ermon 2019 score matching.
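The two refinements rest on a few standard formulas, sketched here in NumPy. The function names are illustrative and not taken from any library. The KL term is the closed form for a diagonal Gaussian against N(0, I), and `alpha_bar` is the cumulative noise-schedule product in DDPM notation. In a real diffusion model a trained network would supply the noise prediction; here the true noise is passed in to check the identity.

```python
import numpy as np


def reparameterise(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * np.sum(mu**2 + np.exp(log_var) - 1.0 - log_var, axis=-1)


def forward_noise(x0, alpha_bar, rng):
    """DDPM forward process: x_t = sqrt(alpha_bar) x0 + sqrt(1 - alpha_bar) eps."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps
    return x_t, eps


def score_from_noise(eps_pred, alpha_bar):
    """Score estimate implied by a noise prediction: -eps / sqrt(1 - alpha_bar)."""
    return -eps_pred / np.sqrt(1.0 - alpha_bar)


rng = np.random.default_rng(0)

# VAE pieces for one encoded model with a 2-D latent space.
mu = np.array([[0.3, -0.2]])
log_var = np.array([[-1.0, -0.5]])
z = reparameterise(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)

# Diffusion pieces for one (hypothetical) parameter triple.
x0 = np.array([2.0, 4.0, 1.0])
alpha_bar = 0.7
x_t, eps = forward_noise(x0, alpha_bar, rng)
score = score_from_noise(eps, alpha_bar)
```

Because the score comes out of the noise prediction by a simple rescaling, a trained denoiser can serve directly as a regulariser gradient.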
Try it: generative-AE playground
The widget pretrains a regularised AE (with the latent penalty described above) on the same 200-model corpus from §9.3 for 700 epochs. After pretraining:
- Sampling: 100 random latent codes are drawn from $N(0, \sigma_z^{2} I)$, where $\sigma_z$ is the empirical std of corpus latents. Each is decoded to a parameter triple. The samples are plotted alongside the corpus to demonstrate that the GENERATED samples follow the same v₂ ≈ 2v₁ trend.
- Latent-space exploration: two sliders (z₁, z₂) let you traverse the latent manifold. Every position decodes INSTANTLY to a velocity profile c(z). Move the sliders — the velocity profile updates continuously. This is the central capability of generative priors: bounded, on-manifold parameter exploration.
Three panels: (v_1, v_2) parameter space with corpus + generated samples + slider position; the latent space with corpus encoded + slider position; the decoded velocity profile c(z) for the current sliders.
Expected behaviour: the generated samples (orange dots) overlap the corpus (cyan) in the $(v_1, v_2)$ plane, demonstrating that the AE-as-sampler reproduces the training distribution. Sliding through latent space traces out smooth, plausible velocity profiles: the prior manifold made interactive.
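The slider-to-profile path can be sketched as follows. Everything here is a hypothetical stand-in for the widget's internals: `toy_decoder` replaces the trained decoder, the third decoded parameter is assumed to be an interface depth `h`, and the sliders are assumed to be clamped to ±3σ so that exploration stays bounded.

```python
import numpy as np


def toy_decoder(z):
    """Hypothetical stand-in for the trained decoder: (z1, z2) -> (v1, v2, h)."""
    z1, z2 = z
    v1 = 2.0 * np.exp(0.15 * z1)        # upper-layer velocity, km/s
    v2 = 2.0 * v1 * np.exp(0.05 * z2)   # stays close to 2*v1
    h = 1.0 + 0.2 * np.tanh(z2)         # interface depth, km (assumed)
    return v1, v2, h


def velocity_profile(v1, v2, h, depths):
    """Two-layer profile: v1 above the interface at depth h, v2 below it."""
    return np.where(depths < h, v1, v2)


def decode_sliders(z1, z2, sigma_z=1.0, n_sigma=3.0, depths=None):
    """Clamp the slider values to +/- n_sigma * sigma_z, then decode a profile."""
    if depths is None:
        depths = np.linspace(0.0, 2.0, 101)
    bound = n_sigma * sigma_z
    z = np.clip([z1, z2], -bound, bound)
    v1, v2, h = toy_decoder(z)
    return depths, velocity_profile(v1, v2, h, depths)


depths, c = decode_sliders(0.5, -0.5)
```

Each slider move costs one decoder evaluation, which is why the profile can update instantly.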
Why this matters for Bayesian FWI
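Bayesian FWI computes a POSTERIOR over velocity models given seismic data:
$$p(c \mid d) \propto p(d \mid c)\, p(c)$$
where $p(d \mid c)$ is the data likelihood and $p(c)$ is the prior over velocity models. Hand-crafted priors (Tikhonov, total variation) provide a closed-form $p(c)$ but limited expressivity. Generative priors provide a learned, expressive $p(c)$ via the decoder + latent prior:
$$c = D(z), \qquad z \sim N(0, \sigma_z^{2} I)$$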
Sampling from the posterior via MCMC then becomes computationally tractable: each Markov-chain step proposes a new velocity model by perturbing the latent code $z$ and decoding. The proposal automatically lies on the prior manifold, dramatically improving acceptance rates compared to proposing in the high-dimensional ambient space.
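A minimal sketch of latent-space MCMC follows, under strong simplifying assumptions: the decoder is a hypothetical closed-form map rather than a trained network, the forward model is a two-layer travel-time calculation rather than a wave-equation solve, the latent prior is N(0, I), and the sampler is plain random-walk Metropolis. The structure is the point: every proposal is a perturbation of z, and the velocity model is always obtained by decoding.

```python
import numpy as np


def decoder(z):
    """Hypothetical decoder: latent (z1, z2) -> layer velocities (v1, v2), km/s."""
    v1 = 2.0 * np.exp(0.15 * z[0])
    v2 = 2.0 * v1 * np.exp(0.10 * z[1])
    return np.array([v1, v2])


def forward(v, thickness=1.0):
    """Toy forward model: two-way travel times to the base of each layer (s)."""
    t1 = 2.0 * thickness / v[0]
    t2 = t1 + 2.0 * thickness / v[1]
    return np.array([t1, t2])


def log_posterior(z, d_obs, noise_std):
    """log p(z | d) up to a constant: Gaussian likelihood plus N(0, I) prior."""
    resid = (forward(decoder(z)) - d_obs) / noise_std
    return -0.5 * np.sum(resid**2) - 0.5 * np.sum(z**2)


def latent_metropolis(d_obs, noise_std, n_steps=2000, step=0.1, seed=0):
    """Random-walk Metropolis in latent space; returns chain and acceptance rate."""
    rng = np.random.default_rng(seed)
    z = np.zeros(2)
    logp = log_posterior(z, d_obs, noise_std)
    chain = np.empty((n_steps, 2))
    n_accept = 0
    for i in range(n_steps):
        z_prop = z + step * rng.standard_normal(2)   # perturb the latent code
        logp_prop = log_posterior(z_prop, d_obs, noise_std)
        if np.log(rng.uniform()) < logp_prop - logp:  # Metropolis accept test
            z, logp = z_prop, logp_prop
            n_accept += 1
        chain[i] = z
    return chain, n_accept / n_steps


z_true = np.array([0.5, -0.5])
d_obs = forward(decoder(z_true))   # noise-free synthetic data, for simplicity
chain, acc_rate = latent_metropolis(d_obs, noise_std=0.02)
post_mean = chain[500:].mean(axis=0)                # discard burn-in
post_models = np.array([decoder(z) for z in chain[500:]])
```

The rows of `post_models` are posterior velocity samples. The chain itself never leaves the 2-D latent space, however large the decoded model is.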
Production examples
- Mosser et al 2020 trained a GAN on Marmousi-style velocity models, then used the generator as the prior for stochastic FWI. Posterior samples were drawn by perturbing the GAN latent code while satisfying the data likelihood.
- Liu et al 2023 use diffusion priors for posterior FWI, leveraging the score function for gradient-based posterior sampling. They report state-of-the-art results on synthetic and real-data tests.
- Asgharzadeh et al 2023 use deep image priors (UNet trained on the data itself, no external corpus) — a degenerate case where the "generative prior" is implicit in the network architecture.
Limitations
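- Out-of-distribution velocity models: if the true subsurface contains structures (e.g., salt domes) absent from the training corpus, the generative prior will effectively REJECT them: the decoder is unlikely to produce such models, so the inversion is pulled toward corpus-like structure instead. Production codes use diverse, geographically broad corpora to mitigate this.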
- Mode collapse: GANs in particular often miss corpus modes. VAEs tend to capture all modes but may be biased toward the centroid. Diffusion models cover modes well but cost more compute.
- Latent-space topology: the latent space is not Euclidean in any meaningful sense. Linear interpolation in z does NOT correspond to linear interpolation in c(x). Geodesics on the latent manifold require Riemannian-metric machinery for proper Bayesian inference (Arvanitidis et al 2018).
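The latent-topology point can be checked numerically with any nonlinear decoder. The sketch below uses a hypothetical closed-form decoder (not a trained network) and compares decoding the latent midpoint against averaging the two decoded models.

```python
import numpy as np


def decoder(z):
    """Hypothetical nonlinear decoder: latent (z1, z2) -> (v1, v2), km/s."""
    v1 = 2.0 * np.exp(0.15 * z[0])
    v2 = 2.0 * v1 * np.exp(0.10 * z[1])
    return np.array([v1, v2])


z_a = np.array([-2.0, -1.0])
z_b = np.array([2.0, 1.0])

c_latent_mid = decoder(0.5 * (z_a + z_b))           # decode the latent midpoint
c_model_mid = 0.5 * (decoder(z_a) + decoder(z_b))   # average the decoded models
gap = c_model_mid - c_latent_mid                    # nonzero for a nonlinear decoder
```

For this decoder the gap is positive in both components, because the exponential is convex. A trained decoder would show a gap of less predictable sign and size.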
What §9.5 will do
§9.5 introduces UNCERTAINTY in the PINN itself: weight-space Bayesian PINNs and ensemble-based UQ. We have priors over velocity models (§9.4); now we add priors over PINN parameters and produce posterior samples. §9.6 combines all of it: uncertainty in data, uncertainty in PINN, generative prior on velocity, posterior FWI inversion.
References
- Kingma, D.P., Welling, M. (2013). Auto-encoding variational Bayes. ICLR 2014. arXiv:1312.6114. The VAE paper.
- Ho, J., Jain, A., Abbeel, P. (2020). Denoising diffusion probabilistic models. NeurIPS 2020. The DDPM paper that brought diffusion models to image generation.
- Song, Y., Ermon, S. (2019). Generative modeling by estimating gradients of the data distribution. NeurIPS 2019. Score-matching foundations of diffusion priors.
- Mosser, L., Dubrule, O., Blunt, M.J. (2020). Stochastic seismic waveform inversion using generative adversarial networks as a geological prior. Math. Geosci. 52, 53–79.
- Liu, Z., Yang, Y., Quan, Y., Yang, Y., Wang, B. (2023). Diffusion-prior-based seismic full waveform inversion. arXiv:2306.10094. Production diffusion-prior FWI.
- Arvanitidis, G., Hansen, L.K., Hauberg, S. (2018). Latent space oddity: on the curvature of deep generative models. ICLR 2018. Riemannian geometry of VAE latent spaces.