Diffusion Model
A generative model that learns to create data by reversing a gradual noise-addition process.
Definition
Diffusion models are a class of generative models that learn to reverse a forward noising process. During training, real data is progressively corrupted with Gaussian noise over many steps until it becomes pure noise; the model learns to reverse this process — predicting and removing noise at each step.
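The forward noising process has a convenient closed form: the sample at any timestep t can be drawn directly from the clean data, without stepping through every intermediate state. A minimal sketch, assuming a standard linear beta schedule (the schedule values and toy data here are illustrative, not from any particular model):

```python
import numpy as np

def forward_noise(x0, t, alpha_bar, rng):
    """Corrupt clean data x0 to timestep t in one shot:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise.
    The model is trained to predict `noise` given x_t and t."""
    noise = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return xt, noise

# Linear beta schedule (assumed values, common in DDPM-style setups)
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)  # cumulative signal-retention factor

rng = np.random.default_rng(0)
x0 = rng.random((8, 8))              # a toy "image"
xt, eps = forward_noise(x0, T - 1, alpha_bar, rng)
# alpha_bar decays toward 0, so by the final step x_t is nearly pure noise
```

Because the model sees samples at every noise level during training, it learns to denoise from any starting point on the corruption schedule.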
At inference, the model starts with random noise and iteratively denoises it; in text-to-image systems, generation is guided by a prompt via cross-attention to a text encoder such as CLIP. This produces highly coherent images, audio, or video matching the prompt. Stable Diffusion, DALL-E 3, and Midjourney all use diffusion models.
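The iterative denoising at inference can be sketched as a DDPM-style sampling loop. This is a simplified illustration: the `predict_noise` function stands in for a trained network (and for a real text-to-image model would also receive the prompt embedding), and the schedule values are assumptions:

```python
import numpy as np

def ddpm_sample(predict_noise, shape, betas, rng):
    """Start from pure Gaussian noise and iteratively denoise,
    one timestep at a time, using the DDPM update rule."""
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)
    x = rng.standard_normal(shape)               # x_T ~ N(0, I)
    for t in reversed(range(len(betas))):
        eps = predict_noise(x, t)                # model's noise estimate
        coef = betas[t] / np.sqrt(1.0 - alpha_bar[t])
        mean = (x - coef * eps) / np.sqrt(alphas[t])
        z = rng.standard_normal(shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * z         # re-inject noise except at t=0
    return x

# Dummy stand-in for a trained noise-prediction network (illustration only)
rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 50)
sample = ddpm_sample(lambda x, t: 0.1 * x, (4, 4), betas, rng)
```

In practice, production systems use faster samplers (DDIM, ancestral solvers) that take far fewer steps than the training schedule, but the structure — noise in, progressively cleaner sample out — is the same.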
Diffusion models have largely replaced GANs for image generation due to higher quality, training stability, and better text-image alignment. They now extend to video (Sora), audio (AudioLDM), and 3D generation.
Examples
- Stable Diffusion
- DALL-E 3
- Midjourney
- Sora
- Adobe Firefly
Related Terms
Generative AI
AI systems that create new content — text, images, audio, video, code — by learning patterns from training data.
GAN (Generative Adversarial Network)
A generative model architecture where two networks — a generator and discriminator — compete to produce realistic synthetic data.
Large Language Model (LLM)
A transformer-based AI system trained on billions of tokens of text, capable of generating, reasoning about, and transforming language.