What is a neuron?
Learning objectives
- Read the equation y = σ(w · x + b) and identify each part
- Predict how changing the weight, bias, or activation changes the output
- Distinguish trainable parameters (w, b) from architecture choices (σ)
- Use the widget to verify those predictions hands-on, in both 1-input and 2-input modes
A neural network is, despite its mythology, an extraordinarily simple object. It is built out of one repeated piece — a neuron — in the way that a brick wall is built out of bricks. Every physics-informed neural network we will build in the rest of this textbook is a stack of these. So before we go anywhere else, we strip a single neuron down to its parts and make sure you can read its arithmetic at a glance.
The mathematical model
A neuron takes some numbers in, multiplies each by a weight, adds them up, adds a bias, and runs the result through a nonlinear function. In one line:
y = σ(w · x + b)
Each piece does one job:
- x are the inputs. For now think of them as plain numbers. Later, in a PINN, they will be the spatial and temporal coordinates at which the neuron is asked to predict something: the wavefield value, the velocity, the travel-time.
- w are the weights. One per input. These are trainable: their values are tuned by the training procedure we will meet in §0.5.
- b is the bias. One scalar offset. Also trainable.
- σ is the activation function. A fixed nonlinear function (Tanh, ReLU, Swish, sin, …). You pick it before training; it does not get tuned.
It helps to think of the weighted sum w · x + b as the neuron's score (a linear combination of evidence) and of σ as the verdict that maps the score onto a useful output range.
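The arithmetic of one neuron fits in a few lines of plain Python. The sketch below is illustrative only: the function name neuron, its signature, and the sample numbers are assumptions made for this example, not part of the widget or of any library.

```python
import math

def neuron(x, w, b, activation=math.tanh):
    """Compute y = activation(w . x + b) for a single neuron.

    x and w are equal-length sequences of numbers; b is a scalar.
    The name and signature are illustrative assumptions.
    """
    if len(x) != len(w):
        raise ValueError("need exactly one weight per input")
    # The score (pre-activation): weighted sum of the inputs plus the bias.
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    # The verdict: the fixed nonlinearity applied to the score.
    return activation(z)

# Two inputs, two weights, one bias (arbitrary sample values).
y = neuron(x=[0.5, -1.0], w=[2.0, 1.0], b=0.5)
print(y)  # equals tanh(2.0*0.5 + 1.0*(-1.0) + 0.5) = tanh(0.5)
```

Only w and b would be adjusted during training; the activation is passed in as a fixed choice, which mirrors the split between trainable parameters and architecture.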
Three knobs
A single-input neuron has only three knobs: one weight and one bias, which are trainable, and an activation choice, which is fixed before training. Despite that, those three knobs already do something interesting.
- The weight controls how steeply the neuron responds to its input. Big |w| means a small change in x produces a big change in y: the activation curve becomes steeper. Small |w| flattens it. Negative w flips the response.
- The bias controls where the neuron switches on. It shifts the activation curve left or right along the x-axis. For positive w, a positive bias means the neuron starts firing at lower x; a negative bias delays it. The point x = -b/w is exactly where the neuron's score crosses zero.
- The activation controls the shape of the response. Tanh saturates smoothly at ±1. ReLU is identity for positive scores and zero for negative ones. Swish is a smooth approximation of ReLU. A pure sine wave oscillates forever, and you will see in §0.9 why that turns out to matter for physics-informed networks.
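The effect of each knob can be checked numerically before touching the widget. The following is a minimal sketch in plain Python; the helper names (relu, swish, neuron_1d, switching_point) and all numeric values are assumptions picked for illustration.

```python
import math

# Activation choices: fixed before training, never tuned.
def relu(z):
    return max(0.0, z)

def swish(z):
    # Swish is z * sigmoid(z); this simple form is fine for moderate z.
    return z / (1.0 + math.exp(-z))

ACTIVATIONS = {"tanh": math.tanh, "relu": relu, "swish": swish, "sin": math.sin}

def neuron_1d(x, w, b, activation=math.tanh):
    """Single-input neuron: y = activation(w * x + b)."""
    return activation(w * x + b)

def switching_point(w, b):
    """Input at which the score w * x + b crosses zero (requires w != 0)."""
    return -b / w

# Weight knob: a larger |w| gives a bigger output change for the same input change.
dx = 0.1
gentle = neuron_1d(dx, w=0.5, b=0.0) - neuron_1d(-dx, w=0.5, b=0.0)
steep = neuron_1d(dx, w=5.0, b=0.0) - neuron_1d(-dx, w=5.0, b=0.0)

# Bias knob: with w > 0, a positive bias moves the switching point to lower x.
x0_pos_bias = switching_point(w=2.0, b=1.0)   # -0.5
x0_neg_bias = switching_point(w=2.0, b=-1.0)  # 0.5

# Activation knob: the same scores, mapped through different response shapes.
for name, fn in ACTIVATIONS.items():
    print(name, [round(fn(z), 3) for z in (-2.0, 0.0, 2.0)])
```

Comparing gentle with steep shows the weight acting as a steepness control, while the two switching points show the bias sliding the curve along the x-axis.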
Try it
Drag the w slider and watch the activation curve rotate around the bias point. Drag b and watch the entire curve slide left and right. Switch the activation and watch the shape change while the operating point stays at the same x. Notice the dashed orange line: that is the location x = -b/w where the neuron's pre-activation score is exactly zero. For positive w, the neuron is "off" or strongly negative to the left of that line and "on" or strongly positive to the right; a negative w swaps the two sides. The bias moves that switching point.
Now switch to 2-input mode. The neuron now has two inputs (x₁, x₂) and two weights (w₁, w₂). The weighted sum w₁x₁ + w₂x₂ + b defines a tilted plane over the input space, and the activation function bends that plane into a curved surface, which the colour map renders as a heatmap. Drag w₁ and w₂ and watch the heatmap rotate; drag b and watch the bands shift. The dashed white line is the decision boundary w₁x₁ + w₂x₂ + b = 0: on one side the neuron is positive, on the other it is negative. The yellow dot is the current operating point.
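The 2-input picture can also be reproduced without the widget. The sketch below assumes a Tanh activation and arbitrary sample values for the two weights and the bias; the function names score and boundary_x2 are hypothetical helpers introduced for this example. It finds points on the line where the score is zero and prints a coarse text version of the heatmap.

```python
import math

def score(x1, x2, w1, w2, b):
    """Pre-activation score of a 2-input neuron: a tilted plane over (x1, x2)."""
    return w1 * x1 + w2 * x2 + b

def boundary_x2(x1, w1, w2, b):
    """x2 coordinate of the decision boundary at a given x1 (requires w2 != 0)."""
    return -(w1 * x1 + b) / w2

# Sample parameter values, chosen only for illustration.
w1, w2, b = 1.0, 2.0, -1.0

# Points on the boundary have a score of zero, so tanh maps them to zero.
for x1 in (-1.0, 0.0, 1.0):
    x2 = boundary_x2(x1, w1, w2, b)
    print(x1, x2, math.tanh(score(x1, x2, w1, w2, b)))

# Coarse text "heatmap": '+' where the output is positive, '-' otherwise.
grid = [-2.0, -1.0, 0.0, 1.0, 2.0]
for x2 in reversed(grid):  # print the largest x2 first so the plot is upright
    row = "".join(
        "+" if math.tanh(score(x1, x2, w1, w2, b)) > 0 else "-" for x1 in grid
    )
    print(row)
```

Changing w1 or w2 tilts the line that separates the two symbols, and changing b shifts it without changing its direction, which is the same behaviour the sliders show.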
Why this matters for what comes next
A neuron is just three things — a weight vector, a bias, and an activation function. There is nothing else inside it. The mystery of "deep learning" is not that any individual neuron is sophisticated; it is that stacking many simple neurons produces something extraordinarily expressive. That is the topic of §0.2.
For PINNs specifically: every PINN we build, no matter how fancy, is built out of these neurons. The wave-equation-aware loss function we will meet in Part 1 is what trains the weights and biases of a stack of these things to satisfy a partial differential equation. But the building block is unchanged — it is exactly what is in front of you here.
Pause-and-check. Before moving on, can you answer these three questions without looking back? (1) Which parameters of the neuron are trainable, and which are not? (2) If you double the weight w and halve the input x, what happens to the pre-activation score w · x + b? (3) For a 2-input neuron with weights (w₁, w₂) and bias b, where in the (x₁, x₂) plane is the decision boundary w₁x₁ + w₂x₂ + b = 0? If any of those is unclear, drag the sliders in the widget until it becomes obvious, then move on to §0.2.
References
- Goodfellow, I., Bengio, Y., Courville, A. (2016). Deep Learning, ch. 6 (feedforward networks). MIT Press.
- LeCun, Y., Bengio, Y., Hinton, G. (2015). Deep learning. Nature 521, 436–444.
- Raissi, M., Perdikaris, P., Karniadakis, G.E. (2019). Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear PDEs. J. Comput. Phys. 378, 686–707.
- Karniadakis, G.E., Kevrekidis, I.G., Lu, L., Perdikaris, P., Wang, S., Yang, L. (2021). Physics-informed machine learning. Nat. Rev. Phys. 3, 422–440.