GPU (Graphics Processing Unit)
Specialised processors whose thousands of cores enable the massively parallel computation required for AI training and inference.
Definition
GPUs were designed for rendering graphics but proved transformatively useful for AI because of their architecture: thousands of small, simple cores that excel at the matrix multiplications and tensor operations at the heart of deep learning. A modern NVIDIA H100 can perform on the order of 3,000 trillion floating-point operations per second at low precision.
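To make that throughput figure concrete, here is a back-of-the-envelope sketch in plain Python. It counts the floating-point operations in a single matrix multiplication and estimates how long an H100-class device would take at an assumed peak rate of ~3,000 trillion FLOP/s (the figure above); the rate is an assumption for illustration, not a benchmark result.

```python
# Back-of-the-envelope: FLOPs in one matrix multiplication, and the time an
# H100-class GPU would need at an assumed low-precision peak of ~3e15 FLOP/s.
# The peak rate is taken from the definition above as an assumption.

H100_PEAK_FLOPS = 3e15  # assumed ~3,000 trillion FLOP/s


def matmul_flops(m: int, k: int, n: int) -> int:
    """An (m x k) @ (k x n) matmul computes m*n dot products of length k,
    each costing k multiplies and k-1 adds, so roughly 2*m*k*n FLOPs."""
    return 2 * m * k * n


def seconds_at_peak(flops: int, peak: float = H100_PEAK_FLOPS) -> float:
    """Lower-bound runtime assuming the device sustains its peak rate."""
    return flops / peak


if __name__ == "__main__":
    # One large transformer-style matmul: (8192 x 8192) @ (8192 x 8192)
    f = matmul_flops(8192, 8192, 8192)
    print(f"{f:.3e} FLOPs -> ~{seconds_at_peak(f) * 1e6:.0f} us at assumed peak")
```

Real kernels sustain only a fraction of peak throughput, so this is a lower bound; the point is that a trillion-FLOP matmul finishes in well under a millisecond, which is why GPUs dominate deep learning workloads.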
NVIDIA's CUDA programming model opened GPUs to general-purpose computing, and AlexNet's GPU-based training in 2012 was a watershed moment. Today, clusters of hundreds to tens of thousands of GPUs train frontier AI models; training GPT-4 is estimated to have required roughly 25,000 A100 GPUs running for about three months.
The AI boom has made NVIDIA the world's most valuable semiconductor company. Alternatives are emerging from Google (TPUs), AWS (Trainium/Inferentia), and AMD (MI300), along with startups such as Cerebras, Groq, and Tenstorrent, several of which target the inference market.
Examples
- NVIDIA H100/A100 (training)
- NVIDIA RTX (consumer)
- Google TPU (inference)
- Apple Neural Engine (edge)
Related Terms
Inference
Using a trained AI model to generate predictions or outputs on new input data.
MLOps
The practice of streamlining the ML lifecycle — from experimentation to production deployment and ongoing monitoring.
Edge AI
Running AI inference on local devices (phones, IoT sensors, vehicles) rather than in the cloud.