Backpropagation
The algorithm for computing gradients in neural networks by propagating errors backwards through layers.
Definition
Backpropagation (backprop) is the fundamental algorithm for training neural networks. After a forward pass produces a prediction, the error is computed against the true label via a loss function. Backprop applies the chain rule of calculus to compute the gradient of the loss with respect to every weight in the network, working backwards from output to input.
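The chain-rule computation can be sketched on a hypothetical one-neuron network (names and values here are illustrative, not from any particular library):

```python
# Hypothetical one-neuron network: prediction y_hat = w * x + b,
# squared-error loss L = (y_hat - y)**2. Backprop applies the chain
# rule, working backwards from the loss to each weight.
def backprop_step(w, b, x, y):
    y_hat = w * x + b              # forward pass
    loss = (y_hat - y) ** 2        # error against the true label
    dL_dyhat = 2 * (y_hat - y)     # outermost derivative first
    dL_dw = dL_dyhat * x           # chain rule: d(y_hat)/dw = x
    dL_db = dL_dyhat * 1.0         # chain rule: d(y_hat)/db = 1
    return loss, dL_dw, dL_db

loss, dw, db = backprop_step(w=2.0, b=0.0, x=3.0, y=5.0)
# y_hat = 6.0, so loss = 1.0, dL/dw = 6.0, dL/db = 2.0
```

In a real multi-layer network the same pattern repeats layer by layer: each layer receives the gradient from the layer above, multiplies in its own local derivatives, and passes the result down.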
Each gradient measures how the loss changes as the corresponding weight changes. An optimiser (SGD, Adam, AdaGrad) then steps the weights in the negative-gradient direction, which locally reduces the loss. Repeated over many batches of data, the network converges to a useful set of weights.
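The optimiser's role can be shown with plain SGD on a toy loss (the loss and values are illustrative assumptions, chosen so the minimum is known):

```python
# Plain SGD: w <- w - lr * dL/dw, repeated over many steps.
def sgd_step(weight, grad, lr=0.1):
    # step opposite the gradient to reduce the loss
    return weight - lr * grad

w = 2.0
for _ in range(50):
    grad = 2 * (w - 5.0)   # gradient of the toy loss (w - 5)^2
    w = sgd_step(w, grad)
# w moves towards 5.0, the minimiser of the toy loss
```

Adam and AdaGrad follow the same loop but rescale each step using running statistics of past gradients.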
Backpropagation was popularised by Rumelhart, Hinton, and Williams in 1986 and remains the universal training algorithm for neural networks despite decades of attempts to find alternatives. Efficient automatic differentiation libraries (PyTorch, JAX, TensorFlow) implement backprop automatically.
Examples
- PyTorch autograd
- TensorFlow automatic differentiation
- Training any neural network
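Libraries such as PyTorch and JAX record the forward computation as a graph and replay it in reverse to apply the chain rule automatically. A toy sketch of that reverse-mode idea (a deliberately simplified class, not any library's actual API):

```python
class Var:
    # Minimal reverse-mode autodiff node: each Var remembers its
    # inputs and the local derivative with respect to each one.
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # pairs of (input Var, local gradient)
        self.grad = 0.0

    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def backward(self, seed=1.0):
        # accumulate the incoming gradient, then push it to inputs
        self.grad += seed
        for parent, local_grad in self.parents:
            parent.backward(seed * local_grad)

x = Var(3.0)
w = Var(2.0)
y = w * x + Var(1.0)   # forward pass builds the graph
y.backward()           # backprop: dy/dw = x = 3.0, dy/dx = w = 2.0
```

Real autograd engines add a topological ordering so that a variable reused in several places is processed once, but the gradient bookkeeping is the same.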
Related Terms
Neural Network
A computing system of interconnected nodes inspired by biological brains, trained to recognise patterns.
Deep Learning
A subset of machine learning using neural networks with many layers to learn complex hierarchical representations.
Gradient Descent
An iterative optimisation algorithm that updates model weights by stepping in the direction that most reduces loss.