Transfer Learning
Reusing knowledge from a model trained on one task to improve learning on a different but related task.
Definition
Transfer learning leverages the representations learned on a large, general dataset (e.g., ImageNet for vision, Common Crawl for language) to improve performance on a downstream task with limited data. Instead of training from scratch, you initialise with pre-trained weights and fine-tune.
This works because the early layers of deep networks learn general, transferable features (edges, textures, language patterns) while later layers become task-specific. Freezing early layers and fine-tuning later layers often works well when source and target tasks are similar.
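The freeze-early, fine-tune-late recipe can be sketched in a few lines. Below is a minimal illustration using numpy, with hypothetical shapes, synthetic data, and a made-up learning rate: `W1` stands in for a pre-trained feature extractor that stays frozen, and `W2` is the new task head, the only part that is updated.

```python
import numpy as np

# Toy sketch of transfer learning with a frozen early layer.
# All shapes, data, and hyperparameters are hypothetical.
rng = np.random.default_rng(0)

W1 = rng.normal(size=(4, 3))            # "pre-trained" weights, kept frozen
W1_frozen = W1.copy()                   # snapshot to confirm W1 never changes
W2 = rng.normal(size=(3, 1)) * 0.1      # randomly initialised task head

# Small downstream dataset (synthetic stand-in for limited labelled data).
X = rng.normal(size=(32, 4))
y = X @ rng.normal(size=(4, 1))

feats = X @ W1                          # frozen features, computed once
init_loss = float(np.mean((feats @ W2 - y) ** 2))

lr = 0.05
for _ in range(200):
    pred = feats @ W2
    grad_W2 = feats.T @ (pred - y) / len(X)  # MSE gradient w.r.t. W2 only
    W2 -= lr * grad_W2                  # fine-tune the head; W1 is untouched

final_loss = float(np.mean((feats @ W2 - y) ** 2))
```

In a real framework the same idea is expressed by marking early layers as non-trainable, e.g. setting `requires_grad = False` on their parameters in PyTorch or `layer.trainable = False` in Keras, before fine-tuning on the downstream data.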
Transfer learning is foundational to modern AI: LLMs are pre-trained on web-scale corpora and then fine-tuned, and vision models are commonly initialised from ImageNet or similar large-scale weights rather than trained from scratch. It enables high-performance models with limited labelled data and dramatically reduces compute costs.
Examples
- GPT-4 fine-tuning API
- BERT for text classification
- ResNet for medical imaging
Related Terms
Fine-tuning
Continuing training of a pre-trained model on domain-specific data to specialise it for a particular task.
Pre-training
The initial phase of training a large model on massive, general datasets before task-specific fine-tuning.
Large Language Model (LLM)
A transformer-based AI system trained on billions of tokens of text, capable of generating, reasoning about, and transforming language.
Deep Learning
A subset of machine learning using neural networks with many layers to learn complex hierarchical representations.