Models

GPT (Generative Pre-trained Transformer)

OpenAI's series of large decoder-only language models that established the paradigm for modern AI assistants.

Definition

GPT (Generative Pre-trained Transformer) is OpenAI's series of language models: GPT-1 (2018), GPT-2 (2019), GPT-3 (2020, 175B params), GPT-4 (2023, multimodal). They use a decoder-only Transformer architecture trained to predict the next token — the same objective as standard language modelling, but at unprecedented scale.
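The next-token objective mentioned above can be sketched as a cross-entropy loss over shifted targets. This is a minimal toy illustration, not OpenAI's implementation; the vocabulary and logits are invented for the example.

```python
# Minimal sketch of the next-token-prediction objective.
# Toy vocabulary and hand-set logits -- illustrative, not a real GPT.
import math

def next_token_loss(logits, targets):
    """Average cross-entropy of predicting each next token.

    logits: one list of per-vocabulary scores per sequence position.
    targets: the token id that actually came next at each position.
    """
    total = 0.0
    for scores, target in zip(logits, targets):
        # log-softmax over the vocabulary scores (max-subtracted for stability)
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        log_prob = (scores[target] - m) - math.log(sum(exps))
        total -= log_prob
    return total / len(targets)

# Toy sequence over vocab {0: "a", 1: "b", 2: "c"}:
logits = [[0.1, 2.0, 0.3],   # after "a", the model favours "b"
          [0.2, 0.1, 3.0]]   # after "a b", the model favours "c"
targets = [1, 2]             # true next tokens: "b", then "c"
print(next_token_loss(logits, targets))  # low loss: predictions match
```

Training a GPT amounts to minimising this loss over huge text corpora; every position in every document supplies a training signal, which is what makes the objective scale so well.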

GPT-3 was the first model to demonstrate dramatic emergent capabilities from scale: few-shot learning, code generation, reasoning, and surprisingly general knowledge without task-specific training. GPT-4 added multimodal inputs (images) and dramatically improved reasoning on professional benchmarks (bar exam, SAT).
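Few-shot learning here means conditioning on demonstrations in the prompt rather than updating weights. A sketch of the prompt format, using the English-to-French example from the GPT-3 paper (the surrounding code is illustrative):

```python
# Few-shot prompting: task demonstrations go in-context, no fine-tuning.
# The translation pairs are the canonical example from the GPT-3 paper;
# the model is expected to complete the final line by analogy.
prompt = """Translate English to French.

sea otter => loutre de mer
cheese => fromage
plush giraffe =>"""

# Each "input => output" line is one demonstration; the last line is the query.
demonstrations = [line for line in prompt.splitlines() if "=>" in line]
print(len(demonstrations))  # 3 lines contain "=>": two shots plus the query
```

The striking empirical finding was that performance on such tasks improves steadily with model scale and with the number of demonstrations, without any gradient updates.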

The GPT architecture — specifically the decoder-only, next-token-prediction paradigm — became the dominant template for subsequent frontier LLMs (LLaMA, Mistral, Claude, Gemini). ChatGPT, built on GPT-3.5 and GPT-4 with RLHF, reached 100M users in 2 months — the fastest consumer-product adoption recorded at the time.

Examples

  • ChatGPT
  • GPT-4 API
  • GitHub Copilot (GPT-based)
  • OpenAI Codex