Large Language Model (LLM)
A transformer-based AI system trained on billions of tokens of text, capable of generating, reasoning about, and transforming language.
Definition
Large Language Models are neural networks (typically Transformer-based) with billions to trillions of parameters, trained on internet-scale text corpora. They predict the next token in a sequence, and through this simple objective on enough data, they acquire broad linguistic and world knowledge.
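The next-token objective can be sketched with a toy stand-in for the learned model. Real LLMs use a Transformer over subword tokens; here, bigram counts over a tiny corpus play the role of the learned distribution, and generation is a greedy loop that repeatedly appends the most likely next token (illustrative only, not how production models are built).

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ran".split()

# "Training": count which token follows each token.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(token):
    """Return the most frequent next token after `token`, or None."""
    counts = bigrams[token]
    return counts.most_common(1)[0][0] if counts else None

def generate(start, n_tokens=4):
    """Greedy decoding: repeatedly append the most likely next token."""
    out = [start]
    for _ in range(n_tokens):
        nxt = predict_next(out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)

print(generate("the"))  # "the cat sat on the"
```

The same loop structure (predict, append, repeat) is what an LLM runs at inference time, only with a neural network producing the next-token probabilities instead of bigram counts.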
The "large" matters: smaller models mostly memorise surface patterns, while large models develop emergent capabilities such as few-shot learning, reasoning, code generation, and translation, without being explicitly trained on those tasks. GPT-3, with 175 billion parameters, was a breakthrough in 2020; current frontier models likely exceed a trillion.
Post-training alignment via RLHF and Constitutional AI converts a raw language model into a helpful assistant. LLMs are now the foundation for most AI applications: search, code generation, document analysis, customer service, and AI agents.
Examples
- GPT-4 (OpenAI)
- Claude (Anthropic)
- Gemini (Google)
- LLaMA (Meta)
- Mistral
Want a deeper dive?
Read our full explainer with use cases, a how-it-works walkthrough, and FAQs.
Related Terms
Transformer
A neural network architecture using self-attention to process sequences in parallel — the foundation of all modern LLMs.
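The core of self-attention can be shown in a few lines of pure Python: each position scores every position with a scaled dot product, softmaxes the scores into weights, and takes a weighted sum of the values. This sketch is a single head with no learned projection matrices, which real Transformers add around this computation.

```python
import math

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of plain-Python vectors."""
    d = len(keys[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        # Softmax the scores into attention weights.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Output is the weight-averaged value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

x = [[1.0, 0.0], [0.0, 1.0]]  # two 2-d token vectors
print(attention(x, x, x))
```

Because every position attends to every other position independently, the whole computation parallelises across the sequence, which is the property the definition above refers to.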
Generative AI
AI systems that create new content — text, images, audio, video, code — by learning patterns from training data.
Prompt Engineering
The practice of crafting effective text inputs to guide LLMs toward desired outputs.
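One common prompt-engineering pattern is few-shot prompting: prepending worked examples so the model infers the task and output format. A minimal sketch of assembling such a prompt as plain text (the example reviews and labels here are made up for illustration):

```python
examples = [
    ("great movie, loved it", "positive"),
    ("terrible plot, fell asleep", "negative"),
]

def build_prompt(examples, query):
    """Assemble a few-shot sentiment-classification prompt."""
    lines = ["Classify the sentiment of each review."]
    for text, label in examples:
        lines.append(f"Review: {text}\nSentiment: {label}")
    # The query ends at "Sentiment:" so the model completes the label.
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

prompt = build_prompt(examples, "an instant classic")
print(prompt)
```

Ending the prompt mid-pattern, right after "Sentiment:", is deliberate: the model's next-token prediction then naturally fills in the label.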
RLHF (Reinforcement Learning from Human Feedback)
A training technique where human preference ratings guide language model fine-tuning to produce more helpful, harmless outputs.
Context Window
The maximum number of tokens an LLM can process in a single request, determining how much text it can "see" at once.
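The practical consequence of a fixed context window is that over-long input must be truncated, often keeping the most recent text. Real models count subword tokens (e.g. via BPE); whitespace-split words are a rough stand-in in this sketch:

```python
def truncate_to_window(text, max_tokens):
    """Keep only the last max_tokens 'tokens' so the most recent text fits."""
    tokens = text.split()
    return " ".join(tokens[-max_tokens:])

history = "turn one turn two turn three turn four"
print(truncate_to_window(history, 4))  # "turn three turn four"
```

Chat applications apply this kind of windowing to conversation history, which is why very old turns eventually stop influencing the model's replies.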
Hallucination
When an AI model generates plausible-sounding but factually incorrect or fabricated information.
Fine-tuning
Continuing training of a pre-trained model on domain-specific data to specialise it for a particular task.