Infrastructure

GPU (Graphics Processing Unit)

Specialised processors with thousands of cores enabling the massive parallel computation required for AI training and inference.

Definition

GPUs were designed for rendering graphics but proved transformatively useful for AI due to their architecture: thousands of smaller cores that excel at the matrix multiplications and tensor operations at the heart of deep learning. A modern NVIDIA H100 delivers on the order of 1,000–4,000 trillion floating-point operations per second, depending on numeric precision.
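The arithmetic behind that appeal is easy to quantify: a dense multiply of an (m × k) matrix by a (k × n) matrix costs roughly 2·m·k·n floating-point operations, nearly all of which can proceed in parallel. A minimal sketch (the 1,000 TFLOPS throughput is an illustrative assumption, not any specific chip's spec):

```python
# Back-of-envelope: cost of one dense matmul and ideal runtime on an accelerator.

def matmul_flops(m: int, k: int, n: int) -> int:
    """FLOPs for C = A @ B with A (m x k) and B (k x n): one multiply
    and one add per inner-product term, i.e. 2*m*k*n."""
    return 2 * m * k * n

def seconds_at(flops: int, tflops: float) -> float:
    """Ideal runtime assuming perfect utilisation of a chip with the
    given throughput in trillions of FLOPs per second (assumed figure)."""
    return flops / (tflops * 1e12)

# A transformer-scale multiply: (8192 x 8192) @ (8192 x 8192).
flops = matmul_flops(8192, 8192, 8192)
print(f"{flops:.2e} FLOPs")                        # ~1.10e+12 FLOPs
print(f"{seconds_at(flops, 1000) * 1e3:.2f} ms")   # ~1.10 ms at 1,000 TFLOPS
```

Real kernels sustain only a fraction of peak throughput, but the scale of the parallelism is what makes the workload a natural fit for thousands of cores.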

The CUDA programming model (NVIDIA) opened GPUs to general-purpose computing. AlexNet's GPU-based training in 2012 was a watershed moment. Today, clusters of hundreds to tens of thousands of GPUs train frontier AI models; GPT-4 is estimated to have been trained on roughly 25,000 A100 GPUs over about 90 days.
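If GPT-4's training did use roughly 25,000 A100s for about 90 days (a commonly cited but unconfirmed estimate), a rough compute budget follows. The A100's ~312 TFLOPS peak BF16 tensor throughput is a published spec; the ~35% utilisation figure is an illustrative assumption:

```python
# Rough training-compute estimate from GPU count, duration and throughput.
gpus = 25_000          # estimated A100 count (unconfirmed estimate)
days = 90              # estimated training duration (unconfirmed estimate)
peak_tflops = 312      # A100 peak BF16 tensor throughput (published spec)
utilisation = 0.35     # assumed fraction of peak actually sustained

seconds = days * 24 * 3600
total_flops = gpus * seconds * peak_tflops * 1e12 * utilisation
print(f"~{total_flops:.1e} FLOPs")   # ~2.1e+25 FLOPs
```

The result, on the order of 10^25 FLOPs, illustrates why frontier training runs are feasible only on large GPU clusters.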

The AI boom has made NVIDIA the world's most valuable semiconductor company. Alternatives are emerging from Google (TPUs), AWS (Trainium/Inferentia), AMD (MI300), and startups such as Cerebras, Groq, and Tenstorrent, several of which target the inference market in particular.

Examples

  • NVIDIA H100/A100 (training)
  • NVIDIA RTX (consumer)
  • Google TPU (training and inference)
  • Apple Neural Engine (edge)