GPT (Generative Pre-trained Transformer)
OpenAI's series of large decoder-only language models that established the paradigm for modern AI assistants.
Definition
GPT (Generative Pre-trained Transformer) is OpenAI's series of language models: GPT-1 (2018), GPT-2 (2019), GPT-3 (2020, 175B params), GPT-4 (2023, multimodal). They use a decoder-only Transformer architecture trained to predict the next token — the same objective as standard language modelling, but at unprecedented scale.
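The next-token objective can be sketched in a few lines: the model scores every vocabulary item at each position, and the loss is the cross-entropy of the *following* token. This is a minimal toy sketch (hypothetical vocabulary, random logits standing in for a real decoder's output), not OpenAI's implementation:

```python
import numpy as np

# Toy vocabulary for illustration only.
vocab = ["<bos>", "the", "cat", "sat", "."]
token_ids = {t: i for i, t in enumerate(vocab)}

def next_token_loss(logits, tokens):
    """Average cross-entropy of predicting token t+1 from position t.

    logits: (seq_len, vocab_size) scores from a decoder-only model.
    tokens: (seq_len,) integer ids of the training sequence.
    Position t is scored against the NEXT token, which is why the model's
    self-attention must be causally masked (no peeking at tokens > t).
    """
    preds, targets = logits[:-1], tokens[1:]
    # Softmax over the vocabulary, then negative log-likelihood of targets.
    exp = np.exp(preds - preds.max(axis=-1, keepdims=True))
    probs = exp / exp.sum(axis=-1, keepdims=True)
    nll = -np.log(probs[np.arange(len(targets)), targets])
    return nll.mean()

tokens = np.array([token_ids[t] for t in ["<bos>", "the", "cat", "sat", "."]])
rng = np.random.default_rng(0)
logits = rng.standard_normal((len(tokens), len(vocab)))  # stand-in for model output
loss = next_token_loss(logits, tokens)
```

With random logits the loss sits near ln(vocab_size); training drives it down by making each position's distribution concentrate on the token that actually follows.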
GPT-3 was the first model to demonstrate dramatic emergent capabilities from scale: few-shot in-context learning (solving tasks from examples supplied in the prompt), code generation, multi-step reasoning, and surprisingly broad world knowledge, all without task-specific fine-tuning. GPT-4 added multimodal input (images alongside text) and markedly improved performance on professional benchmarks such as the bar exam and the SAT.
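Few-shot learning means the task is demonstrated entirely inside the prompt; the model's weights never change. A sketch of how such a prompt is typically assembled (the sentiment task and example reviews here are made up for illustration):

```python
# Hypothetical few-shot prompt for sentiment classification.
# The labelled examples define the task in-context; the model is
# expected to continue the final, unlabelled line.
examples = [
    ("The plot was gripping from start to finish.", "positive"),
    ("I walked out halfway through.", "negative"),
]
query = "The soundtrack alone is worth the ticket."

prompt = "Classify the sentiment of each review.\n\n"
for review, label in examples:
    prompt += f"Review: {review}\nSentiment: {label}\n\n"
prompt += f"Review: {query}\nSentiment:"  # model completes from here
```

Sending this string to a GPT-style completion endpoint yields the label as the continuation; adding more examples generally sharpens the model's grasp of the task format.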
The GPT architecture, specifically the decoder-only, next-token-prediction paradigm, became the dominant template for subsequent frontier LLMs (LLaMA, Mistral, Claude, Gemini). ChatGPT, built on GPT-3.5 and GPT-4 with RLHF, reached 100 million users within two months of launch, making it the fastest-growing consumer application to that point.
Examples
- ChatGPT
- GPT-4 API
- GitHub Copilot (GPT-based)
- OpenAI Codex
Related Terms
Large Language Model (LLM)
A transformer-based AI system trained on billions of tokens of text, capable of generating, reasoning about, and transforming language.
Transformer
A neural network architecture using self-attention to process sequences in parallel — the foundation of all modern LLMs.
BERT
Google's 2018 bidirectional encoder model that transformed NLP by learning contextual word representations from unlabelled text.
RLHF (Reinforcement Learning from Human Feedback)
A training technique where human preference ratings guide language model fine-tuning to produce more helpful, harmless outputs.