Fine-tuning
Continuing the training of a pre-trained model on domain-specific data to specialise it for a particular task.
Definition
Fine-tuning takes a model pre-trained on broad data and continues training it on a smaller, task-specific dataset. The pre-trained weights provide a strong initialisation; fine-tuning adapts them to the target domain with far less compute and data than training from scratch.
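The idea can be sketched in miniature with a one-parameter linear model: "pre-train" on broad data, then continue gradient descent from the learned weight on a small domain-specific dataset. All data and hyperparameters below are toy assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# "Pre-training": fit weight w on broad data where y ≈ 2x.
x_pre = rng.standard_normal(1000)
y_pre = 2.0 * x_pre
w = 0.0
for _ in range(200):
    w -= 0.1 * np.mean((w * x_pre - y_pre) * x_pre)  # gradient step on squared error
w_pre = w  # pre-trained weight, roughly 2.0

# "Fine-tuning": continue from w_pre on a small domain-specific
# dataset where the target relationship has shifted to y ≈ 2.5x.
x_ft = rng.standard_normal(20)
y_ft = 2.5 * x_ft
for _ in range(200):
    w -= 0.1 * np.mean((w * x_ft - y_ft) * x_ft)

print(round(w_pre, 2), round(w, 2))
```

The fine-tuned weight moves from the pre-trained value towards the domain-specific one in a few steps on only 20 examples, which is the essence of why a strong initialisation saves data and compute.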
For LLMs, fine-tuning approaches range from full fine-tuning (updating all weights), through parameter-efficient methods such as LoRA (Low-Rank Adaptation) and QLoRA (LoRA applied on top of a quantised base model), to soft prompting. Full fine-tuning is expensive for billion-parameter models; LoRA freezes the pre-trained weights, inserts small trainable rank-decomposition matrices alongside them, and updates only those, dramatically reducing the number of trainable parameters.
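A minimal NumPy sketch of the LoRA idea, assuming a single hypothetical 4096×4096 weight matrix and rank r = 8 (the dimensions and scaling factor are illustrative assumptions, not taken from any specific model):

```python
import numpy as np

d, r = 4096, 8   # hypothetical layer width and LoRA rank
alpha = 16       # LoRA scaling factor

rng = np.random.default_rng(0)
W = rng.standard_normal((d, d))         # frozen pre-trained weight
A = rng.standard_normal((r, d)) * 0.01  # trainable down-projection
B = np.zeros((d, r))                    # trainable up-projection, zero-initialised

# Effective weight during the forward pass: W stays frozen,
# only A and B would receive gradient updates during training.
W_eff = W + (alpha / r) * (B @ A)

full_params = W.size          # parameters full fine-tuning would update
lora_params = A.size + B.size # parameters LoRA actually trains
print(f"trainable fraction: {lora_params / full_params:.4%}")
```

With these numbers LoRA trains 65,536 parameters instead of roughly 16.8 million, under 0.4% of the original, and because B starts at zero the adapted model initially behaves exactly like the frozen one.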
Instruction fine-tuning — training on instruction-response pairs — is what converts base LLMs (trained for next-token prediction) into instruction-following assistants. Domain-specific fine-tuning adapts models for medicine, law, code, or a specific company's data and tone.
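Instruction-response pairs are typically serialised into a single training string per example. The template and data below are hypothetical; real pipelines use the target model's own chat or prompt template.

```python
# A minimal sketch of formatting instruction-response pairs for
# instruction fine-tuning. Template and examples are assumptions.
TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n{response}"

pairs = [
    {"instruction": "Summarise: The cat sat on the mat.",
     "response": "A cat sat on a mat."},
    {"instruction": "Translate to French: Good morning.",
     "response": "Bonjour."},
]

def format_example(pair: dict) -> str:
    """Serialise one instruction-response pair into a training string."""
    return TEMPLATE.format(**pair)

train_texts = [format_example(p) for p in pairs]
print(train_texts[0])
```

The base model is then trained with its usual next-token objective on these strings, which is what teaches it to continue an instruction with an appropriate response.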
Examples
- GPT-4 API fine-tuning
- Medical LLM (domain fine-tuning)
- LoRA fine-tuning with Hugging Face
Related Terms
Large Language Model (LLM)
A transformer-based AI system trained on billions of tokens of text, capable of generating, reasoning about, and transforming language.
RLHF (Reinforcement Learning from Human Feedback)
A training technique in which human preference ratings guide language model fine-tuning towards outputs that are more helpful and harmless.
Pre-training
The initial phase of training a large model on massive, general datasets before task-specific fine-tuning.
LoRA (Low-Rank Adaptation)
A parameter-efficient fine-tuning method that inserts small trainable rank-decomposition matrices into model layers.
Transfer Learning
Reusing knowledge from a model trained on one task to improve learning on a different but related task.