EikoNet: from clicked source to travel-time field

Part 7 — Travel-time, surface-wave, and joint inversion

Learning objectives

  • Build EikoNet (Smith-Azizzadenesheli-Ross 2021) end-to-end as a PINN
  • Diagnose and cure the trivial-solution trap of the eikonal residual
  • Race EikoNet against the §7.1 FSM reference and quantify the agreement
  • Recognise where vanilla EikoNet plateaus (~5% relative travel-time error)
  • Set up §7.3, where the factored-eikonal trick brings this below 1%

§7.1 used the Fast Sweeping Method (Zhao 2005) as the reference eikonal solver. This section replaces FSM with a PINN. Following Smith, Azizzadenesheli, Ross (2021), parameterise the travel-time field by a small MLP $T_{\mathrm{NN}}(x, z; \theta)$ and train it to satisfy the eikonal equation directly.

The EikoNet recipe

Architecture: 2-32-32-32-1 Tanh MLP. Input: spatial coordinates $(x, z)$. Output: scalar travel time $T$. The original Smith 2021 paper used 4 hidden layers of 32 units with Sine activations and conditioned on the source location via two extra inputs $(x, z, x_{\mathrm{src}}, z_{\mathrm{src}})$. This widget uses 3 hidden Tanh layers and a fixed source per training run; the source-conditioning generalisation is described later in this section.
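
A minimal pure-Python sketch of such a 2-32-32-32-1 Tanh MLP is given here, returning $T$ together with $T_x$ and $T_z$ by carrying the input derivatives forward through each layer. The function names, the uniform initialisation scale, and the seed are assumptions for illustration, not the widget's actual code.

```python
import math
import random

def init_mlp(sizes=(2, 32, 32, 32, 1), seed=0):
    """Return a list of (W, b) pairs; W has fan_out rows of fan_in weights."""
    rng = random.Random(seed)
    layers = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        scale = math.sqrt(1.0 / fan_in)  # assumed init scale (hypothetical)
        W = [[rng.uniform(-scale, scale) for _ in range(fan_in)]
             for _ in range(fan_out)]
        b = [0.0] * fan_out
        layers.append((W, b))
    return layers

def forward_with_grad(layers, x, z):
    """Return (T, T_x, T_z) at the point (x, z)."""
    h = [x, z]
    hx = [1.0, 0.0]  # derivative of the current activations w.r.t. x
    hz = [0.0, 1.0]  # derivative of the current activations w.r.t. z
    last = len(layers) - 1
    for i, (W, b) in enumerate(layers):
        a = [sum(w * v for w, v in zip(row, h)) + bj for row, bj in zip(W, b)]
        ax = [sum(w * v for w, v in zip(row, hx)) for row in W]
        az = [sum(w * v for w, v in zip(row, hz)) for row in W]
        if i < last:
            h = [math.tanh(v) for v in a]
            d = [1.0 - t * t for t in h]  # tanh'(a) = 1 - tanh(a)^2
            hx = [di * v for di, v in zip(d, ax)]
            hz = [di * v for di, v in zip(d, az)]
        else:
            h, hx, hz = a, ax, az  # linear output layer
    return h[0], hx[0], hz[0]

layers = init_mlp()
T, Tx, Tz = forward_with_grad(layers, 0.3, -0.2)
```

The eikonal residual only needs first derivatives of $T$, so a single forward sweep of this kind is enough to evaluate it at a collocation point.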

The trivial-solution trap

Naïvely train this MLP against

$$\mathcal{L}_{\mathrm{naive}}(\theta) = \frac{1}{N_c} \sum_{k=1}^{N_c} \bigl( |\nabla T_{\mathrm{NN}}|^2 c^2 - 1 \bigr)^2 + \lambda_{\mathrm{bc}} T_{\mathrm{NN}}(x_{\mathrm{src}})^2 ,$$

and you will discover that with random small-weight initialisation $T_{\mathrm{NN}}(x; \theta_0) \approx 0$ everywhere, and the network NEVER ESCAPES this state. The reason is that $T \equiv 0$ is a stationary point of the loss surface: at $T \equiv 0$ we have $\nabla T = 0$, so

$$\frac{\partial}{\partial T_x}\bigl(|\nabla T|^2 - 1/c^2\bigr)^2 = 4 \bigl(|\nabla T|^2 - 1/c^2\bigr) T_x = 0 \quad \text{when } T_x = 0 .$$

The PDE-residual gradient with respect to network parameters vanishes at $T \equiv 0$. The optimiser sits there. The boundary-condition term $T_{\mathrm{NN}}(x_{\mathrm{src}})^2$ is also satisfied at $T = 0$. Neither term gives the optimiser a gradient at the constant-zero solution, even though the PDE residual there is far from zero and $T \equiv 0$ is not the eikonal answer.
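
The stationary point can be seen in a toy setting that drops the network entirely and runs gradient descent directly on $(T_x, T_z)$ at a single point. This is only an illustration of the formula above, with hypothetical function names and an assumed velocity $c = 2$ and step size 0.1; it is not how the widget trains.

```python
def pde_grad(Tx, Tz, c):
    """Gradient of (Tx^2 + Tz^2 - 1/c^2)^2 with respect to (Tx, Tz)."""
    r = Tx * Tx + Tz * Tz - 1.0 / (c * c)
    return 4.0 * r * Tx, 4.0 * r * Tz

def descend(Tx, Tz, c=2.0, lr=0.1, steps=500):
    """Plain gradient descent on the pointwise residual loss."""
    for _ in range(steps):
        gx, gz = pde_grad(Tx, Tz, c)
        Tx, Tz = Tx - lr * gx, Tz - lr * gz
    return Tx, Tz

stuck = descend(0.0, 0.0)    # starts exactly on the stationary point
escaped = descend(0.1, 0.0)  # starts with a small non-zero gradient
```

Under these assumptions the first run never moves from zero, while the second converges to $|\nabla T| = 1/c = 0.5$. A small-weight network starts close to the first case, which is why it needs a push.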

Cure: warmstart anchor + smooth blend

Two production-grade fixes are in the literature:

  • Smith-Azizzadenesheli-Ross 2021 use Sine activations with magnitude-aware initialisation (the SIREN trick): even at random init the network produces non-zero gradients almost everywhere, so $T \equiv 0$ is no longer a stationary point of the loss.
  • bin Waheed et al 2021 PINNeik use a warmstart anchor: regress $T_{\mathrm{NN}}$ toward an analytic guide for the first few hundred epochs to give the network a non-trivial $|\nabla T|$ to start from.

This widget uses the warmstart approach with the straight-ray slowness integral as anchor:

$$T_{\mathrm{anchor}}(x) = \|x - x_{\mathrm{src}}\| \cdot \tfrac{1}{2}\bigl(1/c(x_{\mathrm{src}}) + 1/c(x)\bigr) .$$

This is the travel time along a STRAIGHT ray from the source to $x$, with the slowness approximated by the average of its values at the two endpoints. For smooth gradient velocities it is within $\sim 5\%$ of the true bent-ray FSM solution and gives the network an excellent starting shape that sits far from the trivial-zero stationary point. The full loss becomes

$$\mathcal{L}(\theta) = \mathcal{L}_{\mathrm{pde}}(\theta) + \lambda_{\mathrm{bc}} T_{\mathrm{NN}}(x_{\mathrm{src}})^2 + w_{\mathrm{warm}}(\mathrm{ep}) \cdot \frac{1}{N_c}\sum_k \bigl(T_{\mathrm{NN}}(x_k) - T_{\mathrm{anchor}}(x_k)\bigr)^2$$

with $w_{\mathrm{warm}}$ ramping linearly down from 1 to a small floor weight $\sim 0.03$ over the first 800 epochs. The floor prevents drift away from the anchor in the late phase, where the eikonal residual on its own admits multiple local minima. Cosine LR decay over the last third of training sharpens the final fit.
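
The straight-ray anchor defined above can be sketched in a few lines. The linear-with-depth velocity model `c_linear` and its parameters are hypothetical, chosen only because a vertical ray in a vertical gradient stays straight and has a closed-form travel time to compare against.

```python
import math

def t_anchor(x, z, xs, zs, c):
    """Straight-ray travel time with endpoint-averaged slowness.

    c is a callable c(x, z) returning the velocity at a point.
    """
    dist = math.hypot(x - xs, z - zs)
    return dist * 0.5 * (1.0 / c(xs, zs) + 1.0 / c(x, z))

def c_linear(x, z, c0=2.0, g=0.5):
    """Hypothetical velocity model: c increases linearly with depth z."""
    return c0 + g * z

# Vertical ray from (0, 0) to (0, 2): the exact time is the integral of
# dz / (c0 + g z), i.e. ln(c(2) / c(0)) / g for this model.
approx = t_anchor(0.0, 2.0, 0.0, 0.0, c_linear)
exact = math.log(c_linear(0.0, 2.0) / c_linear(0.0, 0.0)) / 0.5
rel_err = abs(approx - exact) / exact
```

For this particular model and ray the anchor overestimates the exact time by a few percent, in line with the rough figure quoted above; other models and ray geometries will differ.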

Adam optimiser, learning rate $5 \times 10^{-3}$, 3000 epochs, 150 collocation points (resampled every 50 epochs). Browser wall-clock: ~25-35 s on a modern laptop.
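
The two schedules described above might look like the sketch below. The function names are hypothetical, and the cosine decay is assumed to run from the base rate down to zero; the text does not state the final learning rate.

```python
import math

def warm_weight(ep, ramp_epochs=800, floor=0.03):
    """Warmstart weight: linear ramp from 1 down to `floor`, then constant."""
    if ep >= ramp_epochs:
        return floor
    return 1.0 + (floor - 1.0) * ep / ramp_epochs

def learning_rate(ep, total=3000, base=5e-3):
    """Constant rate for the first two thirds, cosine decay over the last third.

    Assumption: the decay ends at zero.
    """
    start = 2 * total // 3
    if ep < start:
        return base
    phase = (ep - start) / (total - start)
    return base * 0.5 * (1.0 + math.cos(math.pi * phase))

def should_resample(ep, every=50):
    """True on the epochs where the collocation points are redrawn."""
    return ep % every == 0
```

Keeping the schedules as pure functions of the epoch index makes them easy to plot next to the loss trace when tuning the ramp length or the floor.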

Try it: race EikoNet vs FSM

Interactive figure (EikoNet PINN): enable JavaScript to interact.

The widget runs in two phases:

  • FSM reference (instant, ~2-5 ms on the 81 × 41 grid). Cached for visual comparison.
  • EikoNet training (3000 epochs, ~25-35 s). Live progress bar.

Four panels: T_FSM, T_PINN, |T_PINN − T_FSM| absolute error map, and the loss trace (PDE + BC + warmstart components on log-y).

Expected behaviour:

  • Mean RELATIVE travel-time error $\sim 4\text{--}6\%$ of $T_{\max}$; peak relative error $\sim 20\text{--}25\%$.
  • Peak error AT the source neighborhood (the $1/r$ singularity in the second derivatives of $T$, which a vanilla Tanh MLP cannot represent exactly).
  • Loss trace: PDE residual drops 3-4 orders of magnitude (1.0 → $10^{-4}$); BC term drops to near-zero quickly; warmstart term ramps down to its floor as the eikonal takes over.
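
One way to compute the two error figures is sketched here. It assumes both the mean and the peak are normalised by $T_{\max}$ of the FSM field, which is how the first bullet reads; the widget's exact definition may differ, and the function name is hypothetical.

```python
def relative_errors(t_pinn, t_fsm):
    """Mean and peak of |T_PINN - T_FSM| / T_max over flattened grids."""
    t_max = max(t_fsm)
    errs = [abs(p - f) / t_max for p, f in zip(t_pinn, t_fsm)]
    return sum(errs) / len(errs), max(errs)

# Tiny made-up example: three grid values, not real widget output.
mean_rel, peak_rel = relative_errors([0.1, 1.0, 1.9], [0.0, 1.0, 2.0])
```

Normalising by $T_{\max}$ rather than by the local $T$ keeps the metric finite at the source, where $T = 0$.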

This is the HONEST plateau of vanilla EikoNet on a Tanh MLP in a 30-second in-browser budget. Smith et al 2021 reach sub-percent accuracy with Sine activations + ~50,000 epochs on Caltech 4-GPU rigs. To get there in seconds rather than hours, §7.3 introduces the FACTORED EIKONAL trick.

The 1/r source singularity

At the source $(x_{\mathrm{src}}, z_{\mathrm{src}})$, $T = 0$ and the wavefronts emanate radially: $T \sim r/c_{\mathrm{src}}$ near the source where $r = |x - x_{\mathrm{src}}|$. The gradient $|\nabla T| = 1/c_{\mathrm{src}}$ is finite, but the SECOND derivatives $T_{xx}, T_{zz}$ go like $1/r$: they DIVERGE as you approach the source.
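
To make the $1/r$ behaviour explicit, take the homogeneous near-source form and write $\xi = x - x_{\mathrm{src}}$, $\zeta = z - z_{\mathrm{src}}$ (symbols introduced here only for this check), so that $r = \sqrt{\xi^2 + \zeta^2}$ and $T = r / c_{\mathrm{src}}$. Then

$$T_x = \frac{\xi}{c_{\mathrm{src}}\, r}, \qquad T_z = \frac{\zeta}{c_{\mathrm{src}}\, r}, \qquad |\nabla T|^2 = \frac{\xi^2 + \zeta^2}{c_{\mathrm{src}}^2\, r^2} = \frac{1}{c_{\mathrm{src}}^2},$$

$$T_{xx} = \frac{\zeta^2}{c_{\mathrm{src}}\, r^3}, \qquad T_{zz} = \frac{\xi^2}{c_{\mathrm{src}}\, r^3}, \qquad T_{xx} + T_{zz} = \frac{1}{c_{\mathrm{src}}\, r} .$$

The gradient magnitude is constant, but its direction flips across the source, and the second derivatives grow without bound as $r \to 0$.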

A smooth Tanh MLP cannot represent a function with diverging second derivatives exactly. The PINN learns an approximation, but residual error concentrates near the source. The §7.3 factored eikonal trick (Treister-Haber 2016, Fomel-Luo-Zhao 2009) splits $T$ into a regular part $\tau(x)$ times the analytic travel time $T_0(x)$ of a homogeneous medium with velocity $c_{\mathrm{src}}$, removing the singularity from the part the network learns.

Why bother with EikoNet when FSM is faster?

Three production-relevant reasons EikoNet wins despite being 10⁴× slower per single solve:

  • Continuous output. EikoNet produces $T(x)$ at any spatial point, with no grid interpolation. For inverse problems where receivers don't coincide with grid cells, this avoids interpolation error. For ray-tracing applications it gives smooth gradients.
  • Source-conditioned generalisation. With $(x, z, x_{\mathrm{src}}, z_{\mathrm{src}})$ inputs, EikoNet learns a single function $T(x; x_{\mathrm{src}})$ that handles ANY source position. After training (~hours on 4 GPUs in Smith 2021), querying a new source costs ZERO additional training, just a forward pass. Classical FSM has to re-solve.
  • Differentiable through the source position. For microseismic event location (§7.5), you need $\partial T / \partial x_{\mathrm{src}}$, the sensitivity of arrival times to the unknown source position. EikoNet gives this via auto-diff. Classical FSM requires finite differences across multiple full solves.
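
The contrast in the last bullet can be illustrated with a stand-in for a trained source-conditioned network: the closed-form homogeneous travel time, whose source derivative is known analytically. Everything here (function names, $c = 2$, the step `h`) is an assumption for illustration; a real EikoNet would obtain the same quantity from auto-diff instead of the hand-written formula.

```python
import math

def t_homog(x, z, xs, zs, c=2.0):
    """Stand-in for T(x; x_src): travel time in a constant-velocity medium."""
    return math.hypot(x - xs, z - zs) / c

def dt_dsource(x, z, xs, zs, c=2.0):
    """Exact sensitivity of T to the source position (undefined at r = 0)."""
    r = math.hypot(x - xs, z - zs)
    return -(x - xs) / (c * r), -(z - zs) / (c * r)

def dt_dsource_fd(T, x, z, xs, zs, h=1e-5):
    """Central differences: four extra evaluations of T per receiver.

    With a grid solver each evaluation would be a full re-solve.
    """
    gx = (T(x, z, xs + h, zs) - T(x, z, xs - h, zs)) / (2.0 * h)
    gz = (T(x, z, xs, zs + h) - T(x, z, xs, zs - h)) / (2.0 * h)
    return gx, gz

exact = dt_dsource(3.0, 4.0, 0.0, 0.0)
approx = dt_dsource_fd(t_homog, 3.0, 4.0, 0.0, 0.0)
```

Moving the source toward the receiver shortens the path, which is why both components of the sensitivity are negative for this geometry.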

Math of the eikonal-PINN gradient

The auto-diff implementation needs the gradient of the eikonal-residual loss with respect to the MLP parameters. Define

$$r_k = T_x^2 + T_z^2 - 1/c_k^2 ,$$

then $\partial r_k / \partial T_x = 2 T_x$ and $\partial r_k / \partial T_z = 2 T_z$. The chain-rule contribution to the parameter gradient is delivered through the MLP's backward pass with

$$dy = 0, \quad d(\textrm{grad}) = \bigl[ 4 r_k T_x / N_c, \; 4 r_k T_z / N_c \bigr], \quad d(\textrm{hess}) = 0 .$$

Source BC: $\partial \mathcal{L}_{\mathrm{bc}}/\partial T_{\mathrm{src}} = 2 \lambda_{\mathrm{bc}} T_{\mathrm{src}}$. Warmstart: $\partial \mathcal{L}_{\mathrm{warm}}/\partial T_{\mathrm{NN}} = 2 (T_{\mathrm{NN}} - T_{\mathrm{anchor}}) w_{\mathrm{warm}}/N_c$. These gradients are accumulated, then Adam steps are taken.
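
The upstream seeds listed above can be assembled as in the following sketch, which takes the network outputs at the collocation points as plain lists. The function name and argument layout are hypothetical; the per-point `d_y` holds the warmstart contribution, since the PDE term contributes $dy = 0$.

```python
def loss_and_seeds(T, Tx, Tz, c, T_src, T_anchor, lam_bc, w_warm):
    """Return the total loss and the seeds handed to the MLP backward pass.

    T, Tx, Tz, c, T_anchor: per-collocation-point lists of equal length.
    T_src: network output at the source point.
    """
    n = len(T)
    loss = 0.0
    d_grad = []  # dL/d(T_x, T_z) per point, from the PDE residual
    d_y = []     # dL/dT per point, from the warmstart term
    for k in range(n):
        r = Tx[k] ** 2 + Tz[k] ** 2 - 1.0 / c[k] ** 2
        diff = T[k] - T_anchor[k]
        loss += (r * r + w_warm * diff * diff) / n
        d_grad.append((4.0 * r * Tx[k] / n, 4.0 * r * Tz[k] / n))
        d_y.append(2.0 * w_warm * diff / n)
    loss += lam_bc * T_src * T_src
    d_src = 2.0 * lam_bc * T_src  # dL/dT at the source point
    return loss, d_grad, d_y, d_src
```

A finite-difference check on one entry of `Tx` or `T` is a cheap way to confirm the factors of 2, 4, and $1/N_c$ before wiring the seeds into a backward pass.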

What §7.3 will do

§7.3 introduces the FACTORED EIKONAL: write $T(x) = T_0(x) \cdot \tau(x)$ where $T_0$ is the analytic travel time from the source assuming a constant velocity $c_{\mathrm{src}}$, and $\tau(x)$ is a smooth correction that the network learns. The $1/r$ singularity is absorbed in $T_0$; $\tau$ is bounded everywhere. Crucially, this also CURES the trivial-solution trap automatically: $\nabla(T_0 \tau) = \tau \nabla T_0 + T_0 \nabla \tau$, and the first term is non-zero wherever $\tau \ne 0$, so any non-zero $\tau$ (even a constant one) already gives the network a non-trivial $|\nabla T|$. Result: the source-neighborhood error in this widget collapses by a factor of 5-10×.
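
As a preview, the product rule behind the factorisation can be sketched as below, with $\tau$ and its derivatives passed in as plain numbers in place of a network. The function name and the sample values are hypothetical.

```python
import math

def factored_T_and_grad(tau, tau_x, tau_z, x, z, xs, zs, c_src):
    """Return (T, T_x, T_z) for T = T0 * tau, with T0 = r / c_src.

    Not defined at the source itself (r = 0), where grad T0 has no direction.
    """
    r = math.hypot(x - xs, z - zs)
    T0 = r / c_src
    T0x = (x - xs) / (c_src * r)
    T0z = (z - zs) / (c_src * r)
    # Product rule: grad(T0 * tau) = tau * grad(T0) + T0 * grad(tau)
    return T0 * tau, tau * T0x + T0 * tau_x, tau * T0z + T0 * tau_z

# tau == 1 with zero gradient reproduces the homogeneous solution exactly.
T, Tx, Tz = factored_T_and_grad(1.0, 0.0, 0.0, 3.0, 4.0, 0.0, 0.0, 2.0)
residual = Tx * Tx + Tz * Tz - 1.0 / 2.0 ** 2
```

With $\tau \equiv 1$ the eikonal residual of a homogeneous medium is already zero up to rounding, so the network only has to learn the departure from that baseline.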

References

  • Smith, J.D., Azizzadenesheli, K., Ross, Z.E. (2021). EikoNet: Solving the eikonal equation with deep neural networks. IEEE Trans. Geosci. Remote Sens. 59(12), 10685–10696. The paper this section builds on; uses Sine activations + magnitude-aware init.
  • bin Waheed, U., Haghighat, E., Alkhalifah, T., Song, C., Hao, Q. (2021). PINNeik: Eikonal solution using physics-informed neural networks. Computers & Geosciences 155, 104833. Uses warmstart against analytic anchor — the approach this widget uses.
  • Sitzmann, V., Martel, J.N.P., Bergman, A.W., Lindell, D.B., Wetzstein, G. (2020). Implicit Neural Representations with Periodic Activation Functions. NeurIPS 2020. The SIREN paper Smith 2021's Sine architecture descends from.
  • Grubas, S., Duchkov, A., Loginov, G. (2023). Neural eikonal solver: Improving accuracy of physics-informed neural networks for solving eikonal equation in case of caustics. J. Comput. Phys. 474, 111789. Multi-arrival eikonal extensions.
