Techniques

Self-Supervised Learning

Learning representations from unlabelled data by creating supervisory signals from the data itself.

Definition

Self-supervised learning (SSL) creates labels automatically from the structure of the data itself, eliminating the need for expensive human annotation. The model is trained on a pretext task — such as predicting masked tokens, reconstructing masked image patches, or matching augmented views of the same input (contrastive learning) — that requires no human labels but forces the model to learn useful representations.
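A minimal sketch of how a pretext task manufactures its own labels, using masked-token prediction as the example. The function name, masking rate, and mask symbol here are illustrative choices, not taken from any particular library:

```python
import random

def make_mlm_example(tokens, mask_token="[MASK]", mask_prob=0.15, seed=0):
    """Build a masked-language-modelling training pair from raw tokens.

    The labels come from the data itself: each masked position keeps its
    original token as the prediction target, so no human annotation is
    needed. (Illustrative sketch; real implementations add refinements
    such as random-token replacement.)
    """
    rng = random.Random(seed)
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(mask_token)
            labels.append(tok)        # target = the original token
        else:
            inputs.append(tok)
            labels.append(None)       # no loss computed at this position
    return inputs, labels

inputs, labels = make_mlm_example("the cat sat on the mat".split(), seed=1)
```

The model is then trained to recover the original token wherever the input shows `[MASK]` — the supervisory signal is entirely self-generated.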

Language model pre-training (next-token prediction, masked language modelling) is a form of SSL. For vision, methods include MAE (Masked Autoencoder), DINO, and SimCLR. CLIP uses contrastive SSL on paired image-text data from the internet.
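The contrastive objective behind CLIP-style methods can be sketched as a symmetric InfoNCE loss: paired image and text embeddings in a batch attract each other, while all other pairings act as negatives. This is a simplified illustration, not CLIP's exact implementation; the function name and default temperature are assumptions:

```python
import numpy as np

def info_nce_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    img_emb, txt_emb: (N, D) arrays; row i of each is a positive pair,
    and every other row in the batch serves as a negative.
    """
    # L2-normalise so the dot product is cosine similarity
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature            # (N, N) similarity matrix

    def xent(l):
        # cross-entropy where the "label" for row i is column i (its pair)
        l = l - l.max(axis=1, keepdims=True)      # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # average the image-to-text and text-to-image directions
    return 0.5 * (xent(logits) + xent(logits.T))
```

Note that the "labels" are just the batch indices of the pairs — supervision again comes for free from the data's structure.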

SSL has shown that representations learned without labels can be remarkably general, transferring to downstream tasks with minimal fine-tuning. It underpins the foundation model paradigm: it has largely displaced supervised pre-training in language and is increasingly competitive with it in vision.

Examples

  • GPT pre-training (predict next token)
  • BERT (mask and predict)
  • CLIP (contrastive image-text)
  • DINO (self-distillation)
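The first example's pretext task is the simplest of all: the target sequence is just the input shifted by one position. A minimal sketch (the helper name is illustrative):

```python
def next_token_pairs(tokens):
    """Pretext labels for GPT-style pre-training: the target at each
    position is simply the next token in the sequence."""
    return tokens[:-1], tokens[1:]

# x[i] is the context token, y[i] the token the model must predict
x, y = next_token_pairs(["the", "cat", "sat", "on", "the", "mat"])
```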