Models

Large Language Model (LLM)

A transformer-based AI system trained on billions of tokens of text, capable of generating, reasoning about, and transforming language.

Definition

Large Language Models are neural networks (typically Transformer-based) with billions to trillions of parameters, trained on internet-scale text corpora. They predict the next token in a sequence, and through this simple objective on enough data, they acquire broad linguistic and world knowledge.
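The next-token objective described above can be sketched in a few lines. This is a toy illustration, not a real model: the vocabulary and logit scores are invented, and a real LLM would produce logits over tens of thousands of tokens via a Transformer forward pass. The mechanics of turning scores into a prediction are the same, though: softmax over logits, then pick (or sample) a token.

```python
import math

# Toy next-token step. Suppose the context is "The cat sat on the ..." and the
# model has scored four candidate tokens. Both the vocabulary and the logits
# below are made up for illustration.
vocab = ["mat", "dog", "moon", "sofa"]
logits = [3.2, 0.5, -1.0, 1.7]

def softmax(xs):
    # Subtract the max for numerical stability before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(logits)
# Greedy decoding: take the most probable token. Real systems often sample
# instead (temperature, top-p) to get more varied output.
next_token = vocab[probs.index(max(probs))]
print(next_token)  # → mat
```

Training simply adjusts the model so the correct next token in the corpus gets a higher score; everything else emerges from doing this at scale.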

The "large" matters: smaller models memorise patterns; large models develop emergent capabilities — few-shot learning, reasoning, code generation, translation — without being explicitly trained on those tasks. GPT-3's release in 2020, with 175 billion parameters, was a breakthrough; current frontier models are believed to exceed a trillion.
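Few-shot learning, one of the emergent capabilities mentioned above, means the task is demonstrated in the prompt itself, with no retraining. A minimal sketch of how such a prompt is assembled — the sentiment-classification task and example texts here are invented for illustration:

```python
# Few-shot prompt construction: show the model a few labelled examples,
# then leave the final label blank for it to complete. The task, examples,
# and labels below are hypothetical.
examples = [
    ("I loved this film!", "positive"),
    ("Terrible, a waste of time.", "negative"),
]
query = "The acting was superb."

prompt = "\n".join(f"Review: {text}\nSentiment: {label}" for text, label in examples)
prompt += f"\nReview: {query}\nSentiment:"
print(prompt)
```

Fed to an LLM, a prompt like this typically elicits the continuation "positive" — the model infers the task from the pattern, something small models trained the same way cannot do.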

Post-training alignment via RLHF and Constitutional AI converts a raw language model into a helpful assistant. LLMs are now the foundation for most AI applications: search, code generation, document analysis, customer service, and AI agents.

Examples

  • GPT-4 (OpenAI)
  • Claude (Anthropic)
  • Gemini (Google)
  • LLaMA (Meta)
  • Mistral
