Federated Learning
Training ML models across decentralised devices without sharing raw data, preserving privacy.
Definition
Federated learning trains models on data that never leaves its source device or silo. Instead of aggregating raw data centrally, each participant trains a local model update on their data and sends only the model gradients or weights to a central aggregator, which combines updates and distributes the improved global model.
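The aggregation step described above is usually federated averaging (FedAvg): each client trains locally, and the server combines the returned weights weighted by client dataset size. A minimal sketch, using a linear model with squared loss as an illustrative stand-in for any local trainer (function names and hyperparameters are assumptions, not a reference implementation):

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One client's local training: a few gradient-descent steps
    on its private (X, y). Only the resulting weights leave the client."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def fed_avg(global_w, client_data):
    """Server-side FedAvg: average client weights, weighted by dataset size."""
    updates, sizes = [], []
    for X, y in client_data:
        updates.append(local_update(global_w, X, y))
        sizes.append(len(y))
    total = sum(sizes)
    return sum(n / total * w for w, n in zip(updates, sizes))

# Two simulated clients; their raw X, y never reach the server.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(2):
    X = rng.normal(size=(50, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=50)
    clients.append((X, y))

w = np.zeros(2)
for _ in range(20):          # 20 communication rounds
    w = fed_avg(w, clients)
```

After the rounds complete, the global weights approach the true parameters even though the server only ever saw per-client weight vectors, not data.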
This enables training on sensitive data (medical records, financial transactions, keyboard inputs) without centralising it, helping organisations meet data-protection regulations such as the GDPR and HIPAA. Google uses federated learning to improve Gboard (keyboard predictions) on Android devices without reading user keystrokes.
Challenges include non-IID data across clients (different usage patterns), communication costs (transmitting large gradients), and defence against adversarial participants (poisoning attacks). Differential privacy is often combined with federated learning for formal privacy guarantees.
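The differential-privacy combination mentioned above typically works by clipping each client's update to bound its influence, then adding calibrated Gaussian noise before transmission. A minimal sketch of that mechanism (the function name and parameters are illustrative; real deployments also track a formal (epsilon, delta) privacy budget across rounds):

```python
import numpy as np

def privatize_update(update, clip_norm=1.0, noise_mult=1.0, rng=None):
    """Clip an update's L2 norm to clip_norm, then add Gaussian noise
    scaled to that bound, so no single client's data can dominate
    or be reliably inferred from what it sends."""
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / norm)
    noise = rng.normal(scale=noise_mult * clip_norm, size=update.shape)
    return clipped + noise

rng = np.random.default_rng(1)
raw = np.array([3.0, 4.0])   # L2 norm 5.0, so it gets scaled down to norm 1.0
noisy = privatize_update(raw, clip_norm=1.0, noise_mult=0.5, rng=rng)
```

Clipping also limits the damage a single poisoned update can do, which is why the same mechanism appears in defences against adversarial participants.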
Examples
- Google Gboard next-word prediction
- Apple Siri improvements
- Healthcare consortium model training