What is a neuron?

Neural networks from absolute zero

Learning objectives

Read the equation y = σ(w · x + b) and identify each part
Predict how changing the weight, bias, or activation changes the output
Distinguish trainable parameters (w, b) from architecture choices (σ)
Use the widget to verify those predictions hands-on, in both 1-input and 2-input modes

A neural network is, despite its mythology, an extraordinarily simple object. It is built out of one repeated piece — a neuron — in the way that a brick wall is built out of bricks. Every physics-informed neural network we will build in the rest of this textbook is a stack of these. So before we go anywhere else, we strip a single neuron down to its parts and make sure you can read its arithmetic at a glance.

The mathematical model

A neuron takes some numbers in, multiplies each by a weight, adds them up, adds a bias, and runs the result through a nonlinear function. In one line:

$$y = \sigma\!\left(w_1 x_1 + w_2 x_2 + \cdots + w_n x_n + b\right)$$

Each piece does one job:

$x_1, x_2, \ldots, x_n$ are the inputs. For now think of them as plain numbers. Later, in a PINN, they will be the spatial and temporal coordinates $(x, z, t)$ at which the neuron is asked to predict something — the wavefield value, the velocity, the travel-time.
$w_1, w_2, \ldots, w_n$ are the weights. One per input. These are trainable: their values are tuned by the training procedure we will meet in §0.5.
$b$ is the bias. One scalar offset. Also trainable.
$\sigma$ is the activation function. A fixed nonlinear function (Tanh, ReLU, Swish, sin, …). You pick it before training; it does not get tuned.

It helps to think of the weighted sum $z = w \cdot x + b$ as the neuron's score — a linear combination of evidence — and $\sigma(z)$ as the verdict that maps the score onto a useful output range.
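The score-then-verdict reading can be sketched in a few lines of plain Python. This is a minimal illustration, not the widget's implementation: the function name `neuron`, the default Tanh activation, and the sample numbers are all assumptions made for this example.

```python
import math

def neuron(inputs, weights, bias, activation=math.tanh):
    """Return (z, y): the pre-activation score and the activated output."""
    if len(inputs) != len(weights):
        raise ValueError("need exactly one weight per input")
    # Score: weighted sum of the inputs plus the bias, z = w . x + b
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Verdict: the activation maps the score to the output, y = sigma(z)
    y = activation(z)
    return z, y

# Illustrative values only (not taken from the text):
# z = 2.0 * 0.5 + 1.0 * (-1.0) + 0.25 = 0.25
z, y = neuron(inputs=[0.5, -1.0], weights=[2.0, 1.0], bias=0.25)
print(f"z = {z:.3f}, y = {y:.3f}")
```

Passing the activation as an argument mirrors the split described above: the weights and bias are numbers that training adjusts, while the activation is a function chosen up front.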

Three knobs

A single-input neuron has only three knobs: one weight, one bias, and one activation choice. Only the weight and the bias are trainable numbers; the activation is fixed before training. Even so, those three knobs already do something interesting.

The weight $w$ controls how steeply the neuron responds to its input. Large $|w|$ means a small change in $x$ produces a large change in $z$, so the activation curve becomes steeper. Small $|w|$ flattens it. Negative $w$ flips the response left to right.
The bias $b$ controls where the neuron switches on. It shifts the activation curve left or right along the $x$-axis. For positive $w$, a positive bias means the neuron starts firing at lower $x$; a negative bias delays it. The point $x = -b/w$ (defined for $w \neq 0$) is exactly where the neuron's score crosses zero.
The activation $\sigma$ controls the shape of the response. Tanh saturates smoothly at $\pm 1$. ReLU is the identity for positive scores and zero for negative ones. Swish is a smooth approximation of ReLU. A pure sine wave oscillates forever, and you will see in §0.9 why that turns out to matter for physics-informed networks.
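To see the three knobs numerically rather than on a plot, the sketch below samples a single-input neuron at a few values of $x$. The helper names (`relu`, `swish`, `response`, `row`), the sample points, and the choice of Swish with $\beta = 1$ are assumptions made for illustration; they are not taken from the widget.

```python
import math

def relu(z):
    # Identity for positive scores, zero otherwise
    return max(0.0, z)

def swish(z):
    # Swish with beta = 1 (an assumed variant): z * sigmoid(z)
    return z / (1.0 + math.exp(-z))

def response(x, w, b, activation):
    # Single-input neuron: y = sigma(w * x + b)
    return activation(w * x + b)

def row(w, b, activation, xs=(-2.0, -1.0, 0.0, 1.0, 2.0)):
    # Outputs at a few sample inputs, rounded for readability
    return [round(response(x, w, b, activation), 3) for x in xs]

# Knob 1, the weight: larger |w| steepens the curve, negative w flips it
for w in (0.5, 2.0, -2.0):
    print(f"tanh, w = {w:+.1f}, b = +0.0:", row(w, 0.0, math.tanh))

# Knob 2, the bias: shifts the curve along the x-axis
for b in (-1.0, 0.0, 1.0):
    print(f"tanh, w = +1.0, b = {b:+.1f}:", row(1.0, b, math.tanh))

# Knob 3, the activation: same score, different response shape
for name, fn in (("tanh", math.tanh), ("relu", relu),
                 ("swish", swish), ("sin", math.sin)):
    print(f"{name:>5}, w = +1.0, b = +0.0:", row(1.0, 0.0, fn))
```

Reading down each group of rows shows one knob changing while the other two stay fixed, which is the same experiment the sliders perform.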

Try it

Drag the w slider and watch the activation curve rotate around the bias point. Drag b and watch the entire curve slide left and right. Switch the activation and watch the shape change while the operating point stays at the same $x$. Notice the dashed orange line: that is the location $x = -b/w$ where the neuron's pre-activation score is exactly zero. For positive $w$, the score is negative to the left of that line (the neuron is "off") and positive to the right (the neuron is "on"); for negative $w$ the two sides swap. The bias moves that switching point.
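The switching point can also be checked by hand. The sketch below, with hypothetical helper names `switching_point` and `score` and illustrative parameter values, computes $x = -b/w$ and evaluates the score one unit to either side of it, including a case with negative $w$.

```python
def switching_point(w, b):
    """Input x at which the score z = w * x + b equals zero."""
    if w == 0:
        raise ValueError("w = 0: the score is constant and never crosses zero")
    return -b / w

def score(x, w, b):
    # Pre-activation score of a single-input neuron
    return w * x + b

# Illustrative (w, b) pairs; the last one has a negative weight
for w, b in [(2.0, -1.0), (2.0, 1.0), (-2.0, 1.0)]:
    x0 = switching_point(w, b)
    left = score(x0 - 1.0, w, b)
    right = score(x0 + 1.0, w, b)
    print(f"w = {w:+.1f}, b = {b:+.1f}: x0 = {x0:+.2f}, "
          f"z(left) = {left:+.1f}, z(right) = {right:+.1f}")
```

With a positive weight the score is negative on the left and positive on the right; with a negative weight the signs trade places, while the crossing itself still sits at $-b/w$.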

Now switch to 2-input mode. The neuron now has two inputs and two weights. The weighted sum $z = w_1 x_1 + w_2 x_2 + b$ defines a tilted plane over the $(x_1, x_2)$ input space, and the activation function bends that plane into a curved surface, which the colour map renders as a heatmap. Drag $w_1$ and $w_2$ and watch the heatmap rotate; drag $b$ and watch the bands shift. The dashed white line is the decision boundary $z = 0$: on one side the neuron is positive, on the other it is negative. The yellow dot is the current operating point.
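A rough text version of the 2-input picture is sketched below. It assumes a Tanh activation and illustrative values $w_1 = 1$, $w_2 = 2$, $b = -1$; the helper names `score2` and `boundary_x2` are hypothetical. The grid marks where the output is positive, negative, or exactly zero, and the second loop solves $z = 0$ for $x_2$ to trace the decision boundary.

```python
import math

def score2(x1, x2, w1, w2, b):
    # Tilted plane over the input space: z = w1*x1 + w2*x2 + b
    return w1 * x1 + w2 * x2 + b

def boundary_x2(x1, w1, w2, b):
    """x2 on the decision boundary z = 0 for a given x1 (needs w2 != 0)."""
    if w2 == 0:
        raise ValueError("w2 = 0: the boundary is vertical, solve for x1 instead")
    return -(w1 * x1 + b) / w2

w1, w2, b = 1.0, 2.0, -1.0  # illustrative values, not from the text

# Coarse sign map of tanh(z); the top row is the largest x2
grid = [-2.0, -1.0, 0.0, 1.0, 2.0]
for x2 in reversed(grid):
    cells = []
    for x1 in grid:
        y = math.tanh(score2(x1, x2, w1, w2, b))
        cells.append("+" if y > 0 else "-" if y < 0 else "0")
    print(f"x2 = {x2:+.0f}  " + " ".join(cells))

# Points on the decision boundary z = 0
for x1 in (-1.0, 0.0, 1.0):
    x2 = boundary_x2(x1, w1, w2, b)
    print(f"boundary passes through ({x1:+.1f}, {x2:+.1f})")
```

Changing `w1` and `w2` tilts the line separating the `+` and `-` regions, and changing `b` slides it without changing its direction, which matches what the heatmap shows.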

Why this matters for what comes next

A neuron is just three things — a weight vector, a bias, and an activation function. There is nothing else inside it. The mystery of "deep learning" is not that any individual neuron is sophisticated; it is that stacking many simple neurons produces something extraordinarily expressive. That is the topic of §0.2.

For PINNs specifically: every PINN we build, no matter how fancy, is built out of these neurons. The wave-equation-aware loss function we will meet in Part 1 is what trains the weights and biases of a stack of these things to satisfy a partial differential equation. But the building block is unchanged — it is exactly what is in front of you here.

Pause-and-check. Before moving on, can you answer these three questions without looking back? (1) Which parameters of the neuron are trainable, and which are not? (2) If you double the weight $w$ and halve the input $x$, what happens to the pre-activation score $z$? (3) For a 2-input neuron with $w_1 = w_2 = 1$ and $b = 0$, where in the $(x_1, x_2)$ plane is the decision boundary $z = 0$? If any of those is unclear, drag the sliders in the widget until it becomes obvious, then move on to §0.2.
