Techniques

Backpropagation

The algorithm for computing gradients in neural networks by propagating errors backwards through layers.

Definition

Backpropagation (backprop) is the fundamental algorithm for training neural networks. After a forward pass produces a prediction, the error is computed against the true label via a loss function. Backprop applies the chain rule of calculus to compute the gradient of the loss with respect to every weight in the network, working backwards from output to input.
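The chain-rule bookkeeping above can be sketched by hand on a tiny two-layer scalar "network". This is a minimal illustration, not any library's implementation; all names (`w1`, `w2`, `x`, `y`) are made up for the example, and the finite-difference check at the end confirms the analytic gradient.

```python
# Backprop via the chain rule on: h = w1*x, y_hat = w2*h, loss = (y_hat - y)^2.

def forward(w1, w2, x, y):
    h = w1 * x                  # hidden activation
    y_hat = w2 * h              # network output
    loss = (y_hat - y) ** 2     # squared-error loss
    return h, y_hat, loss

def backward(w1, w2, x, y):
    h, y_hat, _ = forward(w1, w2, x, y)
    dloss_dyhat = 2 * (y_hat - y)   # derivative of the loss at the output
    dloss_dw2 = dloss_dyhat * h     # chain rule: output layer first
    dloss_dh = dloss_dyhat * w2     # error propagated back to the hidden layer
    dloss_dw1 = dloss_dh * x        # chain rule applied again
    return dloss_dw1, dloss_dw2

# Sanity check against a central finite-difference approximation for w1.
w1, w2, x, y, eps = 0.5, -1.5, 2.0, 1.0, 1e-6
g1, g2 = backward(w1, w2, x, y)
num_g1 = (forward(w1 + eps, w2, x, y)[2] - forward(w1 - eps, w2, x, y)[2]) / (2 * eps)
print(abs(g1 - num_g1) < 1e-4)
```

Note how the backward pass reuses `dloss_dh`, the error signal propagated to the hidden layer: this reuse of intermediate gradients is what makes backprop efficient across many layers.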

Each gradient measures how the loss changes in response to a small change in that weight. An optimiser (SGD, Adam, AdaGrad) then updates the weights in the direction that reduces the loss; repeated over many batches of data, the network converges to a useful set of weights.
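The update step can be sketched with plain gradient descent on a single weight. The loss, learning rate, and data point below are illustrative choices, not taken from any particular library; real optimisers like Adam add per-parameter step-size adaptation on top of this same loop.

```python
# Gradient descent on one weight for the loss L(w) = (w*x - y)^2.

def grad(w, x, y):
    return 2 * (w * x - y) * x   # dL/dw by the chain rule

def sgd(w, x, y, lr=0.1, steps=50):
    for _ in range(steps):
        w -= lr * grad(w, x, y)  # step opposite the gradient to reduce the loss
    return w

# With x = 2 and y = 6 the loss is minimised at w = 3 (since 3 * 2 = 6),
# and the iterates converge there.
w = sgd(0.0, 2.0, 6.0)
```

In real training, `grad` is supplied by backprop over a batch of data rather than a single fixed point, which is what makes the procedure *stochastic* gradient descent.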

Backpropagation was popularised by Rumelhart, Hinton, and Williams in 1986 and remains the standard training algorithm for neural networks despite decades of searching for alternatives. Modern libraries (PyTorch, JAX, TensorFlow) implement it via reverse-mode automatic differentiation, so gradients rarely need to be derived by hand.

Examples

  • PyTorch autograd
  • TensorFlow automatic differentiation
  • Training any neural network